How does the FDA use statistics to approve Covid vaccines for the world?

This is for people who like to connect the dots. Since the pandemic began, many aspiring data scientists and ML engineers have been training models to predict infection rates, building data stories and writing articles on Covid-19. Now that the models are almost built and the stories are written, what's next? Deploying them in practice, right? Have you ever wondered how the FDA chooses to approve a vaccine? Have you ever wondered how it chooses between a couple of strong competing vaccines? As a data aspirant, has this question ever bothered you? Yes, it has.

Let's take a real-world example. The Pfizer and Moderna vaccines are reported to be more than 90% effective. Their peers include Sputnik V, CanSino, Novavax and others. So, in the race to save the human race, suppose the Pfizer and Moderna vaccines have reached the final round. What does the FDA actually do? Biological science alone cannot settle the question; you can't ignore the mathematics. This blog discusses how vaccines could be approved from a statistical point of view. I am nowhere near being a pharmacist.

Coming to the point: what does the FDA do?
Assume the FDA has already built a model that predicts the effectiveness (recovery rate) of Pfizer's vaccine for different individuals based on their physical condition. Now the FDA decides to test Moderna's claim that its vaccine has a higher recovery rate than Pfizer's. To test this claim, the FDA uses the statistical technique described below.

The FDA selects a random sample from the population (say n = 1000).

Fig.1 Conventional way of splitting the data

The conventional split would be to divide the sample in half. But one point to note is that the FDA can't afford to play with people's lives on the strength of an unproven claim while Pfizer's vaccine is already doing a decent job. At the same time, the FDA can't simply reject Moderna's claim.

So the sampling is done as follows: the FDA sends a smaller chunk of the sample to test the new vaccine. With Pfizer's vaccine already doing a decent job, the effectiveness of the new vaccine is first tested in smaller trials.

Note: In all figures, Control refers to the existing model and Treatment refers to the newly tested model.

Fig.2 Testing with smaller chunk of population

Of the 1000 volunteers, 850 go to Control and 100 to Treatment. The remaining 50 are given no drug.
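The uneven split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_volunteers` and the fixed seed are my own assumptions, and the group sizes (850, 100 and 50) are the ones used in this example.

```python
import random

def split_volunteers(volunteer_ids, n_control=850, n_treatment=100, seed=42):
    """Randomly assign volunteers to Control, Treatment and No-drug groups.

    Hypothetical helper for illustration; the seed is fixed only so that
    the sketch gives the same split every time it is run.
    """
    rng = random.Random(seed)
    shuffled = list(volunteer_ids)
    rng.shuffle(shuffled)  # random order, so group membership is random

    control = shuffled[:n_control]
    treatment = shuffled[n_control:n_control + n_treatment]
    no_drug = shuffled[n_control + n_treatment:]  # whoever is left over
    return control, treatment, no_drug

control, treatment, no_drug = split_volunteers(range(1000))
print(len(control), len(treatment), len(no_drug))
```

Shuffling before slicing is what keeps the assignment random, so no group is hand-picked.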

The volunteers in "Control" are given Pfizer's vaccine and those in "Treatment" are given Moderna's. Now suppose the FDA finds that volunteers given Pfizer's vaccine took an average of 4 hrs to recover and those given Moderna's took 3.8 hrs, which is 0.2 hrs less. Those given no drug had taken more than 40 hrs and were still recovering.

With this trial, would it be fair to conclude that Moderna is more effective than Pfizer?

No? Why?

1. The first probable reason is that the sample is very small.
2. The second is that the test results depend on point estimates (average values).

Limitation of point estimates: a point estimate describes a single sample, so conclusions about the population can't be drawn from average values alone, as they vary considerably from one sample to the next.
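To see how much an average can move between samples, here is a small simulation. It is a sketch under assumptions of my own: recovery times are drawn from a normal distribution with a true mean of 3.8 hrs and a standard deviation of 0.8 hrs, and the helper name `sample_means` is hypothetical.

```python
import random
import statistics

def sample_means(true_mean, true_sd, sample_size, repeats, seed=0):
    """Draw `repeats` independent samples and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = []
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means

# 200 hypothetical trials, each with 100 volunteers
means = sample_means(3.8, 0.8, sample_size=100, repeats=200)
print(f"lowest sample mean:  {min(means):.2f} hrs")
print(f"highest sample mean: {max(means):.2f} hrs")
```

Every trial draws from the same population, yet the averages differ from run to run, which is why a single average is a shaky basis for a decision.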

The best alternative is to construct a confidence interval.
Assume the recovery times follow a normal distribution. The Pfizer volunteers have a sample mean recovery time of 4 hrs and a sample standard deviation of 0.5 hrs, and the Moderna volunteers have a sample mean recovery time of 3.8 hrs and a sample standard deviation of 0.8 hrs.

Fig 3. Formula to calculate Confidence interval

For medical tests, a general rule of thumb is to choose a 99% confidence interval. From the z-table, a two-sided 99% confidence interval corresponds to a z-score of 2.576.

With 99% confidence, the mean recovery time for Pfizer lies between 3.96 hrs and 4.04 hrs, while Moderna's lies between 3.59 hrs and 4.01 hrs. Although the averages suggest a clear win for Moderna, its confidence interval is much wider than Pfizer's, so Moderna's results are less consistent.
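The interval calculation can be reproduced with the Python standard library. This is a minimal sketch, assuming group sizes of 850 for Pfizer and 100 for Moderna as in this example. The function name `confidence_interval` is my own, and the z-score is computed from the standard normal distribution rather than read off a table.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(mean, sd, n, confidence=0.99):
    """Two-sided z-interval for a mean: mean +/- z * sd / sqrt(n)."""
    # For 99% confidence we need the 99.5th percentile of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin

pfizer_ci = confidence_interval(mean=4.0, sd=0.5, n=850)
moderna_ci = confidence_interval(mean=3.8, sd=0.8, n=100)

print(f"Pfizer:  {pfizer_ci[0]:.2f} to {pfizer_ci[1]:.2f} hrs")
print(f"Moderna: {moderna_ci[0]:.2f} to {moderna_ci[1]:.2f} hrs")
```

The smaller group ends up with the wider interval because the margin is divided by the square root of the sample size.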

Fig 4 Confidence Interval comparison

Here we can see that the two confidence intervals clearly overlap for sample sizes of 850 and 100, which means the new model does not show a difference significant enough to replace the existing one. So the experiment is repeated with different sample sizes, say:

Fig 5. Changing the sampling size and running the model

If the FDA finds that the interval becomes more consistent as the sample grows (say, at a 99% C.I. Moderna's recovery time lies between 3.70 hrs and 3.90 hrs, narrower than the interval from the previous sample), it keeps repeating the experiment until the split reaches (100, 850).
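Why the interval tightens as the sample grows can be sketched with the margin-of-error term alone. The intermediate sample sizes (250 and 500) are hypothetical values of my own, and the standard deviation of 0.8 hrs is the one assumed for Moderna earlier.

```python
from math import sqrt
from statistics import NormalDist

# z-score for a two-sided 99% interval (99.5th percentile of the standard normal)
Z_99 = NormalDist().inv_cdf(0.995)

def margin_of_error(sd, n, z=Z_99):
    """Half-width of the z-interval: z * sd / sqrt(n)."""
    return z * sd / sqrt(n)

# Hypothetical growing Moderna group, standard deviation held at 0.8 hrs
for n in (100, 250, 500, 850):
    print(f"n = {n:4d}  margin = +/- {margin_of_error(0.8, n):.3f} hrs")
```

The margin shrinks with the square root of the sample size, so four times as many volunteers are needed to halve the interval.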

As we increase the sample size, suppose the new vaccine continues to show a shorter average recovery time, for example with sample sizes of 100 for Pfizer and 850 for Moderna.

Fig 6: Moderna’s model having high sample weightage (850)

At a 99% confidence interval, assume Moderna's recovery time lies between 3.78 hrs and 3.82 hrs.

Fig. 6a Final results of model testing

There is no overlap between the confidence intervals of the two vaccines, which implies that Moderna's vaccine shows a faster recovery than Pfizer's at a 99% confidence level for the larger sample size. The interval for Pfizer is 4 hrs (+/- 0.05 hrs), whereas the interval for Moderna is 3.8 hrs (+/- 0.02 hrs).
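The overlap check itself is a one-line comparison. The sketch below uses the two final intervals quoted above; the function name `intervals_overlap` is my own.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a = (low, high) and b = (low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

pfizer_ci = (3.95, 4.05)   # 4 hrs +/- 0.05 hrs
moderna_ci = (3.78, 3.82)  # 3.8 hrs +/- 0.02 hrs

print(intervals_overlap(pfizer_ci, moderna_ci))  # False
```

Comparing intervals this way is a conservative rule of thumb. Intervals that do not overlap point to a real difference, but intervals that overlap slightly do not by themselves prove there is none.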

Yes, you are right: Moderna is the clear winner.

This is one of the many methods used by the FDA in selecting the best drugs, and it is called "A/B testing", "split-run testing" or a "controlled experiment".

Some readers may still not be convinced: what if, even after repeating the experiment say 10 times, Moderna happens to get better-responding individuals than Pfizer every time we sample from the population? That is a valid question, and in such cases we can make a slight variation to the steps above.

Consider the same population of, say, n = 1000.

Fig 7. Splitting the population into three different groups

Here we have split the population into 3 groups. The first two groups differ slightly in their data, but both are given Pfizer's vaccine and the same model is run on both. The third group uses the new model, which tests the effectiveness of Moderna's vaccine.

Now,

Fig 8. Recovery rate of different models.

Here, if we compare the 1st group with the 3rd, we would readily conclude that Pfizer works better. But the conclusion changes when we compare the 3rd group with the 2nd. Although the model is the same, the data differ slightly, which accounts for the change in recovery rate. The difference between model 1 and model 2 is 0.1 hrs. We then repeat the experiment, changing the sample sizes across the 3 models as in Fig 9.

Fig 9. Models with different sample values

As we gradually increase the sample size, we can clearly see that the difference between the first two models shrinks, and at the same time the average recovery time turns out to be better for Moderna's vaccine. It is now evident that Moderna's vaccine shows faster recovery than Pfizer's. This method of testing is called A/A/B testing.
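The idea behind A/A/B testing can be sketched with a small simulation. Everything here is an assumption made for illustration: the true means (4.0 hrs and 3.8 hrs) and standard deviations (0.5 hrs and 0.8 hrs) are borrowed from the earlier example, the three groups are given equal sizes, and the names `simulate_group` and `aab_test` are hypothetical.

```python
import random
import statistics

def simulate_group(rng, mean, sd, n):
    """Simulated recovery times (hrs) for one group of n volunteers."""
    return [rng.gauss(mean, sd) for _ in range(n)]

def aab_test(n_per_group, seed=1):
    """Return (A/A gap, A/B gap) in mean recovery time for one simulated trial."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    control_a1 = simulate_group(rng, 4.0, 0.5, n_per_group)   # existing vaccine
    control_a2 = simulate_group(rng, 4.0, 0.5, n_per_group)   # existing vaccine again
    treatment_b = simulate_group(rng, 3.8, 0.8, n_per_group)  # new vaccine

    mean_a1 = statistics.mean(control_a1)
    mean_a2 = statistics.mean(control_a2)
    mean_b = statistics.mean(treatment_b)

    aa_gap = abs(mean_a1 - mean_a2)          # pure sampling noise
    ab_gap = (mean_a1 + mean_a2) / 2 - mean_b  # noise plus any real effect
    return aa_gap, ab_gap

for n in (50, 500, 5000):
    aa_gap, ab_gap = aab_test(n)
    print(f"n = {n:4d}  A/A gap = {aa_gap:.3f} hrs  A/B gap = {ab_gap:.3f} hrs")
```

The A/A gap measures how much two groups on the same vaccine differ by chance alone, so an A/B gap is only convincing when it is clearly larger than that.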

Disclaimer: The vaccine names used here are mere examples, used solely to illustrate how statistical concepts are applied in the real world.

ML Data Associate | Amazon