Estimation and interpretation of vaccine efficacy in COVID-19 randomized clinical trials

Hege Michiels; An Vandebosch; Stijn Vansteelandt

doi:10.1515/scid-2022-0003

Published/Copyright: September 7, 2022

From the journal Statistical Communications in Infectious Diseases Volume 14 Issue 1

Abstract

Objectives

An exceptional effort by the scientific community has led to the development of multiple vaccines against COVID-19. Efficacy estimates for these vaccines have been widely communicated to the general public, but are nonetheless challenging to compare because they are based on phase 3 trials that differ in study design, definition of vaccine efficacy and the handling of cases arising shortly after vaccination. We investigate the impact of these choices on vaccine efficacy estimates, both theoretically and by re-analyzing the Janssen and Pfizer COVID-19 trial data under a uniform protocol. We moreover study the causal interpretation that can be assigned to per-protocol analyses typically performed in vaccine trials. Finally, we propose alternative estimands to measure the intrinsic vaccine efficacy in settings with delayed immune response.

Methods

The data of the Janssen and Pfizer COVID-19 trials were recreated from the published Kaplan–Meier curves. An estimator for the alternative causal estimand was developed using a Structural Distribution Model.

Results

In the data analyses, we observed rather large differences between intention-to-treat and per-protocol effect estimates. In contrast, the causal estimand and the different estimators used for per-protocol effects lead approximately to the same estimates.

Conclusions

In these COVID-19 vaccine trials, per-protocol effects can be interpreted as the number of cases that can be avoided by vaccination if the vaccine were to induce an immune response immediately. However, it is unclear whether this interpretation also holds in other settings.

Keywords: causal inference; COVID-19; estimand; intention-to-treat analysis; per-protocol analysis; vaccine efficacy trial

Introduction

The SARS-CoV-2 pandemic presents an extraordinary challenge to global health. In an attempt to curb the spread of disease and to control the pandemic, multiple COVID-19 vaccine candidates have been developed (World Health Organization 2021). For example, FDA issued emergency use authorizations for two mRNA vaccines, developed by Pfizer and Moderna, in December 2020 (Food and Drug Administration 2020b; Food and Drug Administration 2020c) and an adenovector vaccine, developed by Janssen, in February 2021 (Food and Drug Administration 2021), based on safety and efficacy data from phase 3 randomized double-blind, placebo-controlled trials. The demonstration of vaccine efficacy and safety in phase 3 trials is essential for authorization and to inform policy makers about potential uses of vaccines (World Health Organization 2020a). Vaccine efficacy is defined as the relative reduction of the risk of disease for vaccinated participants compared to unvaccinated controls (Halloran et al. 1991).

Although the intention-to-treat principle is central to clinical trial research, vaccine efficacy trials usually adopt a per-protocol approach in which the population is restricted to eligible, fully compliant participants receiving all doses (Hudgens, Gilbert, and Self 2004; World Health Organization 2020a). These analyses moreover include a delay, typically starting after the last dose of the vaccine plus the maximum incubation period, to allow the immune response to develop and to account for the time between infection and symptom onset (Dean et al. 2019). The delay between vaccination and the development of a robust immune response is referred to as the vaccine ramp-up time (Dean, Halloran, and Longini 2018). Per-protocol analyses aim to estimate the intrinsic efficacy of the vaccine, but are vulnerable to post-randomization biases (Dean et al. 2019).

The efficacy estimates from per-protocol analyses of the aforementioned trials far exceeded the FDA and WHO thresholds of an observed 50% reduction of symptomatic disease with a lower limit above 30% for the confidence interval (Food and Drug Administration 2020a; World Health Organization 2020b). However, as seen from Table 1, results are difficult to compare between trials because they have variable dosing regimens, endpoints, risk measures, ramp-up periods and follow-up times. The differences in study length, calendar time and locations of the study centers make it particularly difficult to compare vaccine efficacy estimates as the virus mutates (Patterson et al. 2021; Senn 2022). In addition, these trials consider different populations, vaccine efficacy estimators and estimands and disagree in the way intercurrent events are handled (Baden et al. 2021; Polack et al. 2020; Sadoff et al. 2021; Voysey et al. 2021). For example, the primary analysis of the Janssen trial investigated the efficacy of the experimental vaccine against confirmed moderate to severe/critical COVID-19 (Sadoff et al. 2021), while the other companies included all confirmed cases (Baden et al. 2021; Polack et al. 2020; Voysey et al. 2021). Moreover, the Moderna study (Baden et al. 2021) was conducted at different centers in the United States, while the other studies took place on sites across different continents. Consequently, the distribution of the ethnic groups of trial participants is also different (Figure 1 in Appendix A). All trials had a case-driven study design, requiring a particular number of cases to trigger the primary analysis (European Medicines Agency 2021; Food and Drug Administration 2020b; Food and Drug Administration 2020c; Food and Drug Administration 2021).

Table 1:

Design and reported results of COVID-19 vaccine trials. Information is based on the following publications: AstraZeneca: pooled analysis of ISRCTN89951424 and ClinicalTrials.gov NCT04400838 (Voysey et al. 2021), Janssen: ClinicalTrials.gov NCT04505722 (Sadoff et al. 2021), Moderna: ClinicalTrials.gov NCT04470427 (Baden et al. 2021), Pfizer: ClinicalTrials.gov NCT04368728 (Polack et al. 2020).

	AstraZeneca	Janssen	Moderna	Pfizer
Dosing regimen	Two doses of ChAdOx1 nCoV-19 or placebo 28 days apart	Single dose of Ad26.COV2.S or placebo	Two intramuscular injections of mRNA-1273 or placebo 28 days apart	Two doses of BNT162b2 or placebo 21 days apart
Endpoint	Virologically confirmed, symptomatic COVID-19	Confirmed moderate to severe/critical COVID-19	Symptomatic COVID-19	Confirmed COVID-19
Risk measure	Incidence rate (Poisson model)	Incidence rate (Poisson model)	Hazard rate (Cox model)	Incidence rate (Bayesian beta-binomial model)
Ramp-up period	Infections before day 15 after second dose were removed	Infections before day 14 after vaccination were removed	Infections before day 14 after second dose were censored	Infections before day 7 after second dose were removed
Follow-up	Average: 43 days	Median: 58 days Range: [1; 124]	Median: 63 days Range: [0; 97] (after second dose)	Average: 46 days (starting from 7 days after second dose)
Vaccine efficacy (95% CI)	70.4% ([54.8; 80.6])	66.9% ([59.0; 73.4])	94.1% ([89.3; 96.8])	95.0% ([90.3; 97.6])

Rapaka, Hammershaimb, and Neuzil (2021) compared different phase 3 COVID-19 vaccine trials and concluded that comparisons of vaccine efficacy estimates must be made with careful consideration because of the differences in study design, study population and characteristics of circulating virus variants. Senn (2022) makes additional considerations on the differences in endpoints, power calculations, stopping boundaries and approaches to blinding in 5 large COVID-19 trials. In this paper, we investigate the implications of different choices made in the statistical analyses (rather than the study design or study population). In particular, we review the impact of the choice of the risk measure, investigate the change in vaccine efficacy over time and examine how cases arising shortly after vaccination are handled, both theoretically and by re-analyzing the Janssen and Pfizer COVID-19 data. In addition, we develop insight into the causal interpretation of different definitions of vaccine efficacy and propose alternative estimands to measure vaccine efficacy in settings with delayed immune response.

Vaccine efficacy

The primary efficacy endpoint in phase 3 vaccine trials is often defined with respect to clinical disease with laboratory confirmation, since the goal of vaccination is typically to prevent disease and not necessarily to prevent infection (Hudgens, Gilbert, and Self 2004; World Health Organization 2020a). Vaccine efficacy (VE) typically has the form VE = 1 − RR, with RR a measure of relative risk of disease in vaccinated subjects compared to placebo subjects. This measure of vaccine efficacy takes values in the interval (−∞, 1], with 1 indicating complete protection by the vaccine, 0 expressing no effect, and a negative value representing an increase in risk of disease due to vaccination (Hudgens, Gilbert, and Self 2004). The risk measure is typically the cumulative incidence, the incidence rate or the hazard rate (Halloran et al. 2010).
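As a minimal illustration of the definition VE = 1 − RR, the sketch below computes VE from a pair of risks. The function name and the example risks are hypothetical and are not taken from any of the trials discussed here.

```python
def vaccine_efficacy(risk_vaccine: float, risk_placebo: float) -> float:
    """Return VE = 1 - RR, with RR = risk_vaccine / risk_placebo."""
    if risk_placebo <= 0:
        raise ValueError("risk_placebo must be positive")
    return 1.0 - risk_vaccine / risk_placebo


# Hypothetical risks, chosen only to show the range of possible values.
print(vaccine_efficacy(0.00, 0.02))  # complete protection: VE equals 1
print(vaccine_efficacy(0.02, 0.02))  # no effect: VE equals 0
print(vaccine_efficacy(0.03, 0.02))  # increased risk: VE is negative
```

The same function applies whichever risk measure (cumulative incidence, incidence rate or hazard rate) is plugged in; only the interpretation of the result changes.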

When using the cumulative incidence, vaccine efficacy represents the relative reduction in risk of developing disease during the duration of the trial attributable to vaccination (Hudgens, Gilbert, and Self 2004). It has been argued that this definition is especially appropriate if the vaccine has an “all-or-nothing” mode of action, meaning that vaccination renders a part of the population completely immune while offering no protection for the remainder (Hudgens, Gilbert, and Self 2004; Smith, Rodrigues, and Fine 1984). This is because it can be interpreted as a number of cases that can be avoided by vaccination, or the probability that vaccination prevents infection before the considered time, for individuals who would be infected before that time if not vaccinated (Appendix B.1). This interpretation is justified under the assumption that vaccination never shortens the infection times. Without this assumption, the vaccine efficacy definition using cumulative incidence can still be interpreted as a lower bound for this probability (Appendix B.1).

The interpretation of vaccine efficacy estimates using the incidence or hazard rate as risk measure is not straightforward. In particular, these effects cannot easily be transferred to a proportion of cases that can be avoided by vaccination. However, these measures of vaccine efficacy are arguably useful if the vaccine is “leaky”, meaning that vaccination reduces the hazard of the disease by a multiplicative factor for all vaccinated subjects (Hudgens, Gilbert, and Self 2004; Smith, Rodrigues, and Fine 1984). The mode of action of a vaccine is usually unknown, but for rare diseases with constant incidence rate, all these risk measures lead to approximately the same vaccine efficacy estimands (Hudgens, Gilbert, and Self 2004) (see Appendix B.2). The variance of the estimated vaccine efficacy defined using cumulative incidence, usually based on the Kaplan–Meier estimator, is expected to be larger than when using the hazard or incidence rate, because of the (semi-)parametric nature of the Cox and Poisson model (Tsiatis 2007).
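The approximate agreement of the risk measures for rare diseases can be checked numerically. The sketch below assumes constant hazards in both arms (so that the incidence rate equals the hazard rate) and uses hypothetical rates; it is an illustration under these assumptions, not the derivation of Appendix B.2.

```python
import math


def ve_hazard(rate_vaccine: float, rate_placebo: float) -> float:
    """VE based on the hazard (or incidence) rate ratio."""
    return 1.0 - rate_vaccine / rate_placebo


def ve_cumulative_incidence(rate_vaccine: float, rate_placebo: float, t: float) -> float:
    """VE based on cumulative incidence at time t under constant hazards."""
    ci_vaccine = -math.expm1(-rate_vaccine * t)  # 1 - exp(-rate * t)
    ci_placebo = -math.expm1(-rate_placebo * t)
    return 1.0 - ci_vaccine / ci_placebo


# Hypothetical daily rates with a true hazard ratio of 0.3 (VE = 0.7).
t = 100.0
ve_rare = ve_cumulative_incidence(3e-5, 1e-4, t)     # about 1% of placebo subjects infected
ve_common = ve_cumulative_incidence(6e-3, 2e-2, t)   # most placebo subjects infected

print(f"hazard-based VE:          {ve_hazard(3e-5, 1e-4):.3f}")
print(f"cumulative incidence VE (rare disease):   {ve_rare:.3f}")
print(f"cumulative incidence VE (common disease): {ve_common:.3f}")
```

Under these assumptions the two definitions nearly coincide when the disease is rare and diverge markedly when a large share of the placebo arm becomes infected.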

Delayed immunization

Preventive vaccine trials typically employ a per-protocol (PP) approach wherein only fully compliant patients are included (Hudgens, Gilbert, and Self 2004; World Health Organization 2020a). In addition, cases are often defined as patients who became ill after a fixed time lag beyond randomization in order to take into account the incubation period and to allow the vaccinee to develop a protective immune response (Hudgens, Gilbert, and Self 2004; World Health Organization 2020a). For example, the two per-protocol analyses in the Janssen COVID-19 trial were restricted to cases occurring at least 14 days and at least 28 days after vaccination, respectively (Food and Drug Administration 2021). Infections observed before these days were removed from the analysis set. In this section, we investigate the effect of taking into account this additional vaccine ramp-up time for immunity to develop.

Per-protocol effects in COVID-19 trials

The aim of per-protocol analyses is to obtain insight into the intrinsic efficacy of the vaccine, i.e. the relative reduction in infections due to vaccination in fully compliant subjects, after completion of the prescribed vaccination regimen and achieving adequate immune response (Horne, Lachenbruch, and Goldenthal 2000; Hudgens, Gilbert, and Self 2004). However, this approach can reduce power relative to including all cases observed since baseline, especially in settings with low disease incidence (Dean, Halloran, and Longini 2018). It is moreover vulnerable to selection bias, as per-protocol analyses entail a comparison of subgroups selected post randomization (Hudgens, Gilbert, and Self 2004).

In the Janssen trial, the cumulative incidence in the full analysis set was similar in both the vaccine and placebo arm until around day 14, after which the curves diverged, with more cases accumulating in the placebo group than the vaccine group (Figure 1). Therefore, we expect the possible selection bias in the per-protocol analysis to be more severe when cases observed before day 28 are removed compared to when cases before day 14 are removed. Selection bias may occur if cases develop during the ramp-up period and vaccinated and placebo groups are no longer at comparable risk of infection after this period (Horne et al. 2000). In the Janssen trial, approximately 77 (18%) of the placebo cases and 75 (39%) of the vaccine cases were observed during the first 14 days, even though it is expected that only a small proportion of the total cases occurs shortly after randomization (Horne et al. 2000). If the vaccine has no effect on infections and there are no side effects during the ramp-up period, it is expected that removing cases observed during this period does not introduce selection bias, as these cases are then comparable across arms. One then obtains the vaccine effect for a subgroup of the study population so that the per-protocol effect may nevertheless differ from the intention-to-treat (ITT) effect that takes into account all cases after randomization (Appendix C.1).

Figure 1:

Recreated data similar to the COVID-19 vaccine trials conducted by Pfizer (Food and Drug Administration 2020c) and Janssen (Food and Drug Administration 2021). The grey vertical lines indicate the visits at which a dose of the vaccine is given and the pink line indicates the end of the ramp-up period, as specified in the study protocol.

Other pharmaceutical companies also included some time lag beyond completion of the last vaccination dose to allow for optimal immunity (Table 1). In the Pfizer trial, the cumulative incidence in the full analysis set was again similar in both arms until approximately 14 days after randomization, at which time point the survival curves diverged (Figure 1). However, a ramp-up period of 28 days (after randomization) was specified, which resulted in the removal of approximately 94 (34%) of the placebo and 41 (82%) of the vaccine cases. For the AstraZeneca and Moderna trials, it also appears that the vaccine already had a clear effect on infection before the end of the specified ramp-up period (Figure 7 in European Medicines Agency (2021) and Figure 2 in Food and Drug Administration (2020b)).

In the Moderna trial, vaccine efficacy was estimated using a hazard ratio that was obtained by fitting a stratified Cox proportional hazards model (Cox 1972), where patients who got infected before day 14 after the second dose were censored (Food and Drug Administration 2020b). This approach is also subject to a possible selection bias, since the implicit assumption of non-informative censoring is violated as early cases are censored based on their infection time. None of the aforementioned pharmaceutical companies reported a rationale for the choice of the length of the ramp-up period in the protocol (European Medicines Agency 2021; Food and Drug Administration 2020b, 2020c, 2021), even though it is recommended by the WHO (World Health Organization 2020a).

The WHO (World Health Organization 2020a) argued that, in general, the ITT estimate will tend to be diluted compared to the PP vaccine efficacy estimates, since individuals typically fail to comply with the protocol for reasons related to the vaccine itself (Dean et al. 2019). In addition, including cases that arise during the ramp-up time will typically lead to smaller vaccine efficacy estimates since the vaccine is not yet fully effective during this period. Since ITT vaccine efficacy estimates provide information about the effectiveness of a public health strategy using the vaccine, and because they reflect the speed at which the vaccine becomes protective, they may be more meaningful than per-protocol effects to compare vaccines with different dosing regimens or ramp-up periods (World Health Organization 2020a). Therefore, it has been recommended to report both vaccine efficacy estimates (Horne et al. 2000).
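The dilution of ITT estimates relative to PP estimates can be illustrated with a crude calculation based on the approximate early-case counts quoted above for the Janssen trial (77 placebo cases, about 18% of all placebo cases, and 75 vaccine cases, about 39% of all vaccine cases, during the first 14 days). The sketch assumes equal arm sizes and equal follow-up, so that the ratio of case counts approximates the relative risk, and it back-calculates total case counts from rounded percentages. It is therefore a rough check, not a re-analysis.

```python
# Approximate counts of cases during the first 14 days, as quoted in the text.
placebo_early, placebo_early_share = 77, 0.18
vaccine_early, vaccine_early_share = 75, 0.39

# Back-calculated total case counts per arm (approximate, from rounded shares).
placebo_total = placebo_early / placebo_early_share
vaccine_total = vaccine_early / vaccine_early_share

# Crude VE = 1 - ratio of case counts, assuming 1:1 arms with equal follow-up.
ve_itt = 1.0 - vaccine_total / placebo_total
ve_pp = 1.0 - (vaccine_total - vaccine_early) / (placebo_total - placebo_early)

print(f"crude ITT VE: {ve_itt:.2f}")
print(f"crude PP VE (cases before day 14 removed): {ve_pp:.2f}")
```

Because the early cases are split almost evenly between the arms, removing them lowers the case ratio and raises the crude estimate, in line with the direction of the ITT and PP estimates reported in Table 2.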

Hypothetical vaccine efficacy estimands

In this section, we propose two new estimands that can be used to measure vaccine efficacy in settings with delayed immune response and give insight into the intrinsic effect of the vaccine after achieving adequate immune response. In addition, we discuss new estimators for these estimands and clarify under what assumptions they can be approximated by standard per-protocol estimators.

Vaccine efficacy if infections during ramp-up can be prevented

First, we consider the vaccine efficacy that would have been observed if cases during the ramp-up period could have been avoided. This is an example of a hypothetical estimand (International Council for Harmonisation 2019) and might be of particular interest since it is an effect that can be realized in practice for several viruses. For example, influenza vaccines cause antibodies to develop in the body about two weeks after vaccination (Centers for Disease Control and Prevention 2021); influenza infections during this period can be avoided by vaccinating people early enough, i.e. at least two weeks before flu season begins. Further, vaccines against diseases endemic to certain countries, e.g. Malaria in Africa (Centers for Disease Control and Prevention 2020), are given early enough before traveling, allowing the immune response to be developed before arriving in these countries. In the COVID-19 setting, cases occurring shortly after vaccination can be avoided by quarantine. In general, we consider a setting where cases during the ramp-up period can be avoided.

It is not immediately clear how this effect can be identified without relying on strong assumptions. In particular, the observed infection hazard ratios after the ramp-up period cannot simply be substituted for the infection hazard ratio in the hypothetical setting, because the populations at risk of infection at a given time are likely not exchangeable between the observed and hypothetical setting. In Appendix C.2, we show that the per-protocol estimator, which removes cases observed during the ramp-up time, is unbiased for this effect only under strong assumptions. In particular, one must assume that patients who are infected during the ramp-up time would have infection times comparable to those of patients who were not infected during this period, if infections during this period could have been avoided. This is likely implausible as early cases may well be selective.

Vaccine efficacy if ramp-up period can be eliminated

Next, we consider the vaccine efficacy, if the vaccine would immediately induce an immune response; i.e. if there was no ramp-up period. This estimand provides insight into the intrinsic effect of the vaccine because it represents the effect once subjects are fully immunized.

In Appendix C.3, we propose an estimator for this hypothetical VE which relies on a Structural Distribution Model (SDM) for identification (Robins 1994; Vansteelandt and Joffe 2014). This model maps percentiles of the distribution of infection times under placebo into percentiles of the distribution of infection times under vaccine. It makes assumptions about the effect of the vaccine on the infection times at the population level, but we refrain from making assumptions on individual infection times. In particular, if the vaccine worked immediately, we would impose that vaccination multiplies the quantiles of the infection time distribution by a factor exp(ψ), for a scalar ψ. However, we assume that the vaccine effect is limited during the ramp-up time, and therefore the quantiles are multiplied by a (possibly) smaller factor exp(ρψ) with ρ ∈ [0, 1]. Formally, let α denote the length of the ramp-up time, S(t|X=0) the survival function at time t in the placebo arm and S_SDM(t|X=1; α, ρ, ψ) the modeled survival function at time t in the vaccine arm. If we assume that the vaccine is fully effective after the ramp-up period, this leads to the model

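$$
S_{\mathrm{SDM}}(t \mid X=1; \alpha, \rho, \psi) =
\begin{cases}
S\left(\dfrac{t}{\exp(\rho\psi)} \,\Big|\, X=0\right) & \text{if } t < \alpha, \\[2ex]
S\left(\dfrac{t - \alpha\,\bigl(1 - \exp(\psi(1-\rho))\bigr)}{\exp(\psi)} \,\Big|\, X=0\right) & \text{if } t \geq \alpha.
\end{cases}
\tag{1}
$$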

In this model, ψ represents the vaccine effect, with higher values indicating higher efficacy. Parameter ρ indicates how much weaker the vaccine effect is during the ramp-up time than after. The choice ρ=0 expresses no vaccine effect during the ramp-up period, while ρ=1 indicates full vaccine effect from baseline (Figure 2). Model (1) thus allows the vaccine to have a different effect during and after the ramp-up time. The parameters can be estimated by comparing the mapped survival function (1) to the observed survival function in the vaccine arm. The hypothetical vaccine efficacy can then be estimated by setting the ramp-up period to 0 days in model (1) (or ρ to 1). Details about this model and estimation can be found in Appendix C.3. Models other than model (1) could be used, e.g. one could impose that the effect of the vaccine increases during the ramp-up time by assuming that the quantiles of infection during the ramp-up time are multiplied by exp(ψ(1 + (t − α)/α)). However, in the remainder of the paper we will focus on model (1).
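To make the time mapping in model (1) concrete, the sketch below evaluates the modeled vaccine-arm survival function for a given placebo survival function. It reads model (1) as mapping time t in the vaccine arm to t/exp(ρψ) before α and to (t − α(1 − exp(ψ(1 − ρ))))/exp(ψ) from α onwards, which makes the two branches agree at t = α. The function names and the exponential placebo survival curve are hypothetical; the values ψ=0.8 and α=20 are those of Figure 2. Estimation of ψ and ρ (Appendix C.3) is not covered.

```python
import math
from typing import Callable


def sdm_survival(t: float, alpha: float, rho: float, psi: float,
                 placebo_survival: Callable[[float], float]) -> float:
    """Modeled vaccine-arm survival at time t under model (1)."""
    if t < alpha:
        mapped = t / math.exp(rho * psi)
    else:
        mapped = (t - alpha * (1.0 - math.exp(psi * (1.0 - rho)))) / math.exp(psi)
    return placebo_survival(mapped)


def ve_without_ramp_up(t: float, psi: float,
                       placebo_survival: Callable[[float], float]) -> float:
    """Cumulative-incidence VE at time t when the ramp-up period is set to 0 days."""
    ci_vaccine = 1.0 - sdm_survival(t, 0.0, 1.0, psi, placebo_survival)
    ci_placebo = 1.0 - placebo_survival(t)
    return 1.0 - ci_vaccine / ci_placebo


def placebo_survival(t: float) -> float:
    """Hypothetical placebo survival curve: constant hazard of 0.001 per day."""
    return math.exp(-0.001 * t)


for rho in (0.0, 0.5, 1.0):
    s = sdm_survival(60.0, 20.0, rho, 0.8, placebo_survival)
    print(f"rho={rho}: modeled vaccine-arm survival at day 60 = {s:.4f}")
print(f"VE without ramp-up at day 60: {ve_without_ramp_up(60.0, 0.8, placebo_survival):.3f}")
```

With ρ=0 the vaccine curve follows the placebo curve until day α, and with ρ=1 it equals the placebo curve with time rescaled by exp(ψ) from baseline.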

Figure 2:

Illustration of the impact of parameter ρ on the cumulative incidence in model (1) for fixed ψ=0.8 and α=20. Plots are based on simulated data.

Data analysis

In this section, we compare the discussed vaccine efficacy estimands and risk measures by performing data analyses on data similar to the Janssen and Pfizer COVID-19 trials.

Data

The data of the Janssen (ClinicalTrials.gov NCT04505722) and Pfizer (ClinicalTrials.gov NCT04368728) COVID-19 trials were recreated, based on the published Kaplan–Meier curves (Figure 1 in Food and Drug Administration (2021) and Figure 2 in Food and Drug Administration (2020c)), as described in Appendix C.4.1. Figure 1 visualizes the obtained Kaplan–Meier curves, which agree very well with the published curves. The R-code used to create these datasets is provided in Appendix D.1.

Methods

For both trials, the ITT and PP vaccine efficacy effects were estimated every week for the entire study duration, using the cumulative incidence, the hazard and incidence rate as risk measures. ITT effects included all cases observed since randomization, while for the PP effects cases observed during the ramp-up time were removed or censored. Different lengths of ramp-up times (α=7, 14, 28 and 35 days) were investigated for both trials. Hazard ratios were estimated by fitting a Cox proportional hazards model (Cox 1972) and cumulative incidences were obtained by estimating Kaplan–Meier curves (Kaplan and Meier 1958). Incidence rates were acquired by fitting a Poisson model (Nauta 2010) with the logarithm of the observation time as offset to account for follow-up time. All estimators assume censoring to be non-informative within each treatment arm. The hypothetical estimand “if the ramp-up period could be eliminated” was also estimated every week, using the cumulative incidence as risk measure, as described in Appendix C.4.3. R-code for these estimators is provided in Appendix D.2. Standard errors (SE) were obtained using 1,000 non-parametric bootstrap replications (Efron 1979).
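The cumulative incidence estimator and its bootstrap standard error can be sketched as follows. This is not the R-code of Appendix D.2: it is a self-contained Python illustration on simulated data, with hypothetical function names, hypothetical constant infection rates, administrative censoring at day 100 and resampling of participants within each arm (an assumption about how the bootstrap is stratified). Only 200 replications are used to keep the example fast.

```python
import random
import statistics


def km_cumulative_incidence(arm, horizon):
    """Kaplan-Meier estimate of 1 - S(horizon).

    `arm` is a list of (time, event) pairs; event is 1 for a case, 0 if censored.
    """
    data = sorted(arm)
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        cases = leaving = 0
        while i < len(data) and data[i][0] == t:  # group tied times
            cases += data[i][1]
            leaving += 1
            i += 1
        if cases:
            surv *= 1.0 - cases / at_risk
        at_risk -= leaving
    return 1.0 - surv


def ve_km(vaccine, placebo, horizon):
    """VE = 1 - ratio of Kaplan-Meier cumulative incidences at the horizon."""
    return 1.0 - (km_cumulative_incidence(vaccine, horizon)
                  / km_cumulative_incidence(placebo, horizon))


def bootstrap_se(vaccine, placebo, horizon, replications, rng):
    """Non-parametric bootstrap SE, resampling participants within each arm."""
    estimates = []
    for _ in range(replications):
        v = rng.choices(vaccine, k=len(vaccine))
        p = rng.choices(placebo, k=len(placebo))
        estimates.append(ve_km(v, p, horizon))
    return statistics.stdev(estimates)


def simulate_arm(n, rate, end_of_study, rng):
    """Hypothetical arm: exponential infection times, censored at end of study."""
    arm = []
    for _ in range(n):
        t = rng.expovariate(rate)
        arm.append((min(t, end_of_study), int(t <= end_of_study)))
    return arm


rng = random.Random(2022)
placebo = simulate_arm(2000, 0.001, 100.0, rng)
vaccine = simulate_arm(2000, 0.0003, 100.0, rng)
ve = ve_km(vaccine, placebo, 100.0)
se = bootstrap_se(vaccine, placebo, 100.0, 200, rng)
print(f"VE = {ve:.2f}, bootstrap SE = {se:.3f}")
```

A per-protocol variant would remove or censor participants infected before day α before calling `ve_km`; the hazard-based and incidence-rate-based estimators would replace the Kaplan–Meier step by a Cox or Poisson fit.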

Results

Table 2 shows the obtained vaccine efficacy estimates and SEs for the Pfizer and Janssen COVID-19 vaccine trials, using the ramp-up period as specified in the study protocols. For both trials, the ITT effect estimates are approximately 10 percentage points smaller than the PP effects. This is in contrast with the results of Horne et al. (2000), who generally observed little difference between ITT and PP effects. However, these authors also found a few trials reported in the last 20 years where efficacy estimates under the two approaches gave discordant results.

Table 2:

Results of the data analysis performed on the Pfizer and Janssen dataset. Pfizer: vaccine efficacy is measured at day 112 and the length of the ramp-up period is α=28 days. Janssen: vaccine efficacy is measured at day 125 and the length of the ramp-up period is α=14 days.

Effect	Risk measure	VE estimate (SE), Pfizer	VE estimate (SE), Janssen
Intention-to-treat	Hazard rate	0.82 (0.03)	0.55 (0.04)
Intention-to-treat	Cumulative incidence	0.86 (0.03)	0.54 (0.10)
Intention-to-treat	Incidence rate	0.82 (0.03)	0.55 (0.04)
Per-protocol (removing cases before α)	Hazard rate	0.95 (0.02)	0.67 (0.04)
Per-protocol (removing cases before α)	Cumulative incidence	0.93 (0.04)	0.61 (0.11)
Per-protocol (removing cases before α)	Incidence rate	0.95 (0.02)	0.67 (0.04)
Per-protocol (censoring cases before α)	Hazard rate	0.95 (0.02)	0.67 (0.03)
Per-protocol (censoring cases before α)	Cumulative incidence	0.93 (0.04)	0.61 (0.10)
Per-protocol (censoring cases before α)	Incidence rate	0.95 (0.02)	0.67 (0.04)
VE if ramp-up can be eliminated	Cumulative incidence	0.93 (0.04)	0.66 (0.13)

In addition, the PP effects obtained by removing cases observed during the ramp-up period coincide with those obtained by censoring them. However, the vaccine efficacy estimates differ according to the risk measure used, even though this was not expected given the small incidence rates (Hudgens, Gilbert, and Self 2004). Moreover, for most effects, the obtained standard errors were two to three times larger when using the cumulative incidence compared to the hazard or incidence rate. This can be attributed to the (semi-)parametric nature of the Cox and Poisson models, whose gain in precision comes at the risk of bias when the model is not correct. Figure 3 shows that the effect estimates converge over time. In the Janssen trial, an increase in VE was noticed 84–91 days after vaccination when using the cumulative incidence as risk measure. This is probably because many patients are censored during that period (Figure 1) and is not observed when using the other risk measures.

Figure 3:

Vaccine efficacy estimates ±2SE for the Pfizer and Janssen dataset are shown over time, both for the cumulative incidence, the hazard and incidence rate as risk measure. Every point represents the VE estimate that would be obtained if the trial were stopped at the corresponding visit and all information up till that visit was used.

Tables 1 and 2 and Figures 10 and 11 in Appendix C.4.2 show the results when using other lengths of the ramp-up period. From these results, it turns out that choosing too short a period is more problematic than too long a period. In particular, when α is set to 7 days in the Janssen trial and the Pfizer trial, diluted vaccine efficacy estimates are obtained compared to the original PP effects. In both trials, the PP effects converge to approximately the same limit when specifying a ramp-up period of 14, 28 or 35 days. However, in the Janssen trial, the obtained SEs are somewhat larger when more cases are removed/censored.

Table 2 also shows estimates for the hypothetical estimand “if there was no ramp-up time”. Although different estimators were used (Appendix C.4.3), we only show results for the one-parameter model method with the ramp-up period specified as in the study protocol. The obtained effect estimates are very close to the PP estimates with only a slightly larger SE (Table 2). For the Pfizer trial, the Structural Distribution Model fitted the observed data very well (Figures 13–15 in Appendix C.4.3), but less so for the Janssen study (Figures 17 and 18 in Appendix C.4.3).

Discussion

In this paper, we compared the primary estimands and estimators, based on the per-protocol principle, of phase 3 trials of four investigational vaccines that shared the objective of evaluating efficacy and safety in preventing COVID-19. In particular, the different vaccine efficacy estimands have been investigated and discussed. Using cumulative incidence as risk measure is the most interesting in terms of interpretation, since the corresponding vaccine efficacy represents a number of cases that can be avoided by vaccination. The hazard and incidence rate do not lead to a straightforward interpretation of vaccine efficacy, but show more stability when estimated (semi-)parametrically and result in approximately the same vaccine efficacy estimates in most settings with rare diseases. However, in the data analyses performed on the recreated Janssen and Pfizer COVID-19 trials, we obtained differences of up to 6 percentage points, depending on the chosen risk measure.

We also observed differences of approximately 10 percentage points between intention-to-treat analyses, taking into account all cases since baseline, and per-protocol analyses, removing or censoring cases occurring shortly after vaccination, while other authors generally found little difference (Horne et al. 2000). Since per-protocol analyses are subject to possible selection bias (Dean et al. 2019) and are not aligned with a relevant estimand (International Council for Harmonisation 2019), we have proposed two hypothetical estimands that give insight into the intrinsic effect of the vaccine in settings with delayed immune response. The first estimand considers the vaccine efficacy that would have been observed if cases during the ramp-up period could be prevented. We argued that strong assumptions are needed for the per-protocol analysis to estimate this effect without bias. The second estimand considers the vaccine efficacy if the ramp-up period can be eliminated. We developed a novel estimator for this estimand using a Structural Distribution Model (Robins 1994; Vansteelandt and Joffe 2014). This proposal relies on modeling assumptions, which can partially be checked by comparing the modeled vaccine survival curve to the observed curve. However, because the model is partially untestable, caution is warranted when interpreting the obtained results for the Janssen and Pfizer trials, and results are intended for illustrative purposes only. In contrast to the principal stratification estimand, the proposed estimand targets an effect for the entire trial population. In addition, our proposal does not rely on monotonicity or principal ignorability assumptions that are typically needed to estimate principal stratification effects (Ding and Lu 2017).

To conclude, the per-protocol vaccine efficacy estimates can, in this setting, be interpreted as the number of cases that can be avoided by vaccination if the vaccine were to induce an immune response immediately. The use of naive per-protocol effects may be problematic in other settings, in which case we recommend estimation of the proposed estimand. In addition, we recommend always reporting intention-to-treat effects along with per-protocol effects, as the former do not suffer from potential selection bias. Moreover, intention-to-treat effects can be helpful to evaluate the effectiveness of a public health strategy as the ramp-up period is taken into account. The handling of intercurrent events other than “COVID-19 infection during the ramp-up period” during vaccine trials, e.g. death or non-compliance with the prescribed vaccine doses, is beyond the scope of this paper.

The conclusions drawn in this article are not only useful in the COVID-19 trial setting, but can also be applied to other vaccine efficacy trials.

Corresponding author: Hege Michiels, Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium, E-mail: hege.michiels@ugent.be

Acknowledgments

The authors are grateful to the editors and reviewers for very detailed feedback which substantially improved an earlier version of this paper.

Research funding: This work was supported by VLAIO (Flemish Innovation and Entrepreneurship) [Baekeland grant agreement HBC.2019.2155].
Author contribution: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Competing interests: The authors declare no potential conflict of interests.
Informed consent: None.
Ethical approval: None.

References

Baden, L. R., H. M. El Sahly, B. Essink, K. Kotloff, S. Frey, R. Novak, D. Diemert, S. A. Spector, N. Rouphael, C. B. Creech, J. McGettigan, S. Khetan, N. Segall, J. Solis, A. Brosz, C. Fierro, H. Schwartz, K. Neuzil, L. Corey, P. Gilbert, H. Janes, D. Follmann, M. Marovich, J. Mascola, L. Polakowski, J. Ledgerwood, B. S. Graham, H. Bennett, R. Pajon, C. Knightly, B. Leav, W. Deng, H. Zhou, S. Han, M. Ivarsson, J. Miller, and T. Zaks. 2021. “Efficacy and Safety of the mRNA-1273 SARS-CoV-2 Vaccine.” New England Journal of Medicine 384 (5): 403–16. https://doi.org/10.1056/nejmoa2035389.

Centers for Disease Control and Prevention. 2020. Malaria. Also available at https://wwwnc.cdc.gov/travel/diseases/malaria.

Centers for Disease Control and Prevention. 2021. Key Facts About Seasonal Flu Vaccine. Also available at https://www.cdc.gov/flu/prevent/keyfacts.htm.

Cox, D. R. 1972. “Regression Models and Life-Tables.” Journal of the Royal Statistical Society: Series B (Methodological) 34 (2): 187–202. https://doi.org/10.1111/j.2517-6161.1972.tb00899.x.

Dean, N. E., P. S. Gsell, R. Brookmeyer, V. De Gruttola, C. A. Donnelly, M. E. Halloran, M. Jasseh, M. Nason, X. Riveros, C. Watson, A. M. Henao-Restrepo, and I. M. Longini. 2019. “Design of Vaccine Efficacy Trials During Public Health Emergencies.” Science Translational Medicine 11 (499): eaat0360. https://doi.org/10.1126/scitranslmed.aat0360.

Dean, N. E., M. E. Halloran, and I. M. Longini. 2018. “Design of Vaccine Trials During Outbreaks with and without a Delayed Vaccination Comparator.” The Annals of Applied Statistics 12 (1): 330. https://doi.org/10.1214/17-aoas1095.

Ding, P., and J. Lu. 2017. “Principal Stratification Analysis Using Principal Scores.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79 (3): 757–77. https://doi.org/10.1111/rssb.12191.

Efron, B. 1979. “Bootstrap Methods: Another Look at the Jackknife.” The Annals of Statistics 7: 1–26. https://doi.org/10.1214/aos/1176344552.

European Medicines Agency. 2021. Assessment Report: Covid-19 Vaccine Astrazeneca. European Medicines Agency.

Food and Drug Administration. 2020a. Development and Licensure of Vaccines to Prevent Covid-19: Guidance for Industry.

Food and Drug Administration. 2020b. “FDA Briefing Document: Moderna Covid-19 Vaccine.” In Vaccines and Related Biological Products Advisory Committee Meeting. December 17, 2020.

Food and Drug Administration. 2020c. “FDA Briefing Document: Pfizer-Biontech Covid-19 Vaccine.” In Vaccines and Related Biological Products Advisory Committee Meeting. December 10, 2020.

Food and Drug Administration. 2021. “FDA Briefing Document: Janssen Ad26.COV2.S Vaccine for the Prevention of Covid-19.” In Vaccines and Related Biological Products Advisory Committee Meeting. February 26, 2021.

Halloran, M. E., M. Haber, I. M. Longini Jr., and C. J. Struchiner. 1991. “Direct and Indirect Effects in Vaccine Efficacy and Effectiveness.” American Journal of Epidemiology 133 (4): 323–31. https://doi.org/10.1093/oxfordjournals.aje.a115884.

Halloran, M. E., I. M. Longini, and C. J. Struchiner. 2010. Design and Analysis of Vaccine Studies, Vol. 18. New York: Springer. https://doi.org/10.1007/978-0-387-68636-3.

Horne, A. D., P. A. Lachenbruch, and K. L. Goldenthal. 2000. “Intent-to-Treat Analysis and Preventive Vaccine Efficacy.” Vaccine 19 (2–3): 319–26. https://doi.org/10.1016/s0264-410x(00)00152-3.

Hudgens, M. G., P. B. Gilbert, and S. G. Self. 2004. “Endpoints in Vaccine Trials.” Statistical Methods in Medical Research 13 (2): 89–114. https://doi.org/10.1191/0962280204sm356ra.

International Council for Harmonisation. 2019. Addendum on Estimands and Sensitivity Analysis in Clinical Trials. Also available at https://database.ich.org/sites/default/files/E9-R1_Step4_Guideline_2019_1203.pdf.

Kaplan, E. L., and P. Meier. 1958. “Nonparametric Estimation from Incomplete Observations.” Journal of the American Statistical Association 53 (282): 457–81. https://doi.org/10.1080/01621459.1958.10501452.

Nauta, J. 2010. Statistics in Clinical Vaccine Trials. Berlin, Heidelberg: Springer Science & Business Media. https://doi.org/10.1007/978-3-642-14691-6.

Patterson, S., B. Fu, Y. Meng, F. Bailleux, and J. Chen. 2021. “Statistical Observations on Vaccine Clinical Development for Pandemic Diseases.” Statistics in Biopharmaceutical Research 14: 1–5. https://doi.org/10.1080/19466315.2021.1919197.

Polack, F. P., S. J. Thomas, N. Kitchin, J. Absalon, A. Gurtman, S. Lockhart, J. L. Perez, G. P. Marc, E. D. Moreira, C. Zerbini, R. Bailey, K. A. Swanson, S. Roychoudhury, K. Koury, P. Li, W. V. Kalina, D. Cooper, R. W. Frenck, L. L. Hammitt, Ö Türeci, H. Nell, A. Schaefer, S. Ünal, D. B. Tresnan, S. Mather, P. R. Dormitzer, U. Sahin, K. U. Jansen, and W. C. Gruber. 2020. “Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine.” New England Journal of Medicine 383: 2603–15. https://doi.org/10.1056/nejmoa2034577.

Rapaka, R. R., E. A. Hammershaimb, and K. M. Neuzil. 2021. “Are Some Covid Vaccines Better than Others? Interpreting and Comparing Estimates of Efficacy in Trials of Covid-19 Vaccines.” Clinical Infectious Diseases 74: 352–8. https://doi.org/10.1093/cid/ciab213.

Robins, J. M. 1994. “Correcting for Non-Compliance in Randomized Trials Using Structural Nested Mean Models.” Communications in Statistics-Theory and Methods 23 (8): 2379–412. https://doi.org/10.1080/03610929408831393.

Sadoff, J., G. Gray, A. Vandebosch, V. Cárdenas, G. Shukarev, B. Grinsztejn, P. A. Goepfert, C. Truyers, H. Fennema, B. Spiessens, K. Offergeld, G. Scheper, K. L. Taylor, M. L. Robb, J. Treanor, D. H. Barouch, J. Stoddard, M. F. Ryser, M. A. Marovich, K. M. Neuzil, L. Corey, N. Cauwenberghs, T. Tanner, K. Hardt, J. Ruiz-Guiñazú, M. Le Gars, H. Schuitemaker, J. Van Hoof, F. Struyf, and M. Douoguih. 2021. “Safety and Efficacy of Single-Dose Ad26.COV2.S Vaccine Against Covid-19.” New England Journal of Medicine 384 (23): 2187–201. https://doi.org/10.1056/nejmoa2101544.

Senn, S. 2022. “The Design and Analysis of Vaccine Trials for COVID-19 for the Purpose of Estimating Efficacy.” Pharmaceutical Statistics 21 (4): 790–804. https://doi.org/10.1002/pst.2226.

Smith, P., L. Rodrigues, and P. Fine. 1984. “Assessment of the Protective Efficacy of Vaccines Against Common Diseases Using Case-Control and Cohort Studies.” International Journal of Epidemiology 13 (1): 87–93. https://doi.org/10.1093/ije/13.1.87.

Tsiatis, A. 2007. Semiparametric Theory and Missing Data. New York, NY: Springer Science & Business Media.

Vansteelandt, S., and M. Joffe. 2014. “Structural Nested Models and G-Estimation: The Partially Realized Promise.” Statistical Science 29 (4): 707–31. https://doi.org/10.1214/14-sts493.

Voysey, M., S. A. C. Clemens, S. A. Madhi, L. Y. Weckx, P. M. Folegatti, P. K. Aley, B. Angus, V. L. Baillie, S. L. Barnabas, Q. E. Bhorat, S. Bibi, C. Briner, P. Cicconi, A. M. Collins, R. Colin-Jones, C. L. Cutland, T. C. Darton, K. Dheda, C. J. A. Duncan, K. R. W. Emary, K. J. Ewer, L. Fairlie, S. N. Faust, S. Feng, D. M. Ferreira, A. Finn, A. L. Goodman, C. M. Green, C. A. Green, P. T. Heath, C. Hill, H. Hill, I. Hirsch, S. H. C. Hodgson, A. Izu, S. Jackson, D. Jenkin, C. C. D. Joe, S. Kerridge, A. Koen, G. Kwatra, R. Lazarus, A. M. Lawrie, A. Lelliott, V. Libri, P. J. Lillie, R. Mallory, A. V. A. Mendes, E. P. Milan, A. M. Minassian, A. McGregor, H. Morrison, Y. F. Mujadidi, A. Nana, P. J. O’Reilly, S. D. Padayachee, A. Pittella, E. Plested, K. M. Pollock, M. N. Ramasamy, S. Rhead, A. V. Schwarzbold, N. Singh, A. Smith, R. Song, M. D. Snape, E. Sprinz, R. K. Sutherland, R. Tarrant, E. C. Thomson, M. E. Török, M. Toshner, D. P. J. Turner, J. Vekemans, T. L. Villafana, M. E. E. Watson, C. J. Williams, A. D. Douglas, A. V. S. Hill, T. Lambe, S. C. Gilbert, and A. J. Pollard. 2021. “Safety and Efficacy of the ChAdOx1 nCoV-19 Vaccine (AZD1222) against SARS-CoV-2: An Interim Analysis of Four Randomised Controlled Trials in Brazil, South Africa, and the UK.” The Lancet 397 (10269): 99–111. https://doi.org/10.1016/S0140-6736(20)32661-1.

World Health Organization. 2020a. Design of Vaccine Efficacy Trials to be Used during Public Health Emergencies: Points of Considerations and Key Principles.

World Health Organization. 2020b. WHO Target Product Profiles for Covid-19 Vaccines. Also available at https://www.who.int/publications/m/item/who-target-product-profiles-for-covid-19-vaccines.

World Health Organization. 2021. Covid-19 Vaccine Tracker and Landscape. Also available at https://www.who.int/publications/m/item/draft-landscape-of-covid-19-candidate-vaccines (accessed August 27, 2021).

Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/scid-2022-0003).

Received: 2022-02-08

Revised: 2022-08-05

Accepted: 2022-08-12

Published Online: 2022-09-07

https://doi.org/10.1515/scid-2022-0003

Keywords for this article

causal inference; COVID-19; estimand; intention-to-treat analysis; per-protocol analysis; vaccine efficacy trial