Abstract
A fundamental probabilistic equivalence between randomized experiments and observational studies is presented. Given a detailed scenario, the reader is asked to consider which of two possible study designs provides more information regarding the expected difference in an outcome due to a time-fixed treatment. A general solution is described, and a particular worked example is also provided. A mathematical proof is given in the appendix. The demonstrated equivalence helps to clarify common ground between randomized experiments and observational studies, and to provide a foundation for considering both the design and interpretation of studies.
Below we describe a fundamental probabilistic equivalence between randomized experiments and observational studies, which clarifies the logical common ground between the two designs. The equivalence is one of equal bounds, where bounds are the minimum and maximum values of a parameter (e.g., the risk difference) that are consistent with the observed data distribution. Such bounds were described by Robins (1989), Manski (1990), and Balke and Pearl (1994). Moreover, this equivalence illustrates a fundamental limitation of observational studies as compared to randomized experiments.
1 Two study designs
Say you are charged with the task of learning if and to what extent a binary, time-fixed treatment, denoted $X$ (1 for treated, 0 for untreated), alters the risk of a binary outcome, denoted $Y$ (1 for an event, 0 for a nonevent), in a well-defined study population. Let $Y^{x}$ denote the potential outcome that would be observed if, possibly contrary to fact, a participant received treatment level $x$ (Neyman 1990; Rubin 1974; Robins 1986; Holland 1986). The parameter of interest is the expected difference $E(Y^{1}) - E(Y^{0})$, here a causal risk difference. You may choose between two study designs.
The first study design is a randomized experiment (Fisher 1926). Specifically, you randomly assign half of the study participants to treatment ($X=1$) and half to no treatment ($X=0$), and then attempt to ascertain the outcome $Y$ for each participant. However, by some unknown mechanism that operates equally in the two treatment arms, the outcome is missing for half of the participants in each arm.
The second study design is an observational (cohort) study of the same population. Here treatment is not assigned by you: by some unknown mechanism, half of the participants select treatment and half do not. In this design, the outcome $Y$ is ascertained for every participant.
Assume throughout that participants do not interfere with one another, that there is only one version of each treatment level, and that treatment and outcome are measured without error, so that a participant's observed outcome equals the potential outcome corresponding to the treatment actually received, an assumption known as counterfactual consistency (Cole and Frangakis 2009; VanderWeele 2009; Pearl 2010).
2 Fundamental equivalence
Without additional context, both study designs presented provide the exact same amount of information (i.e., width of bounds) about the average difference in the outcome $Y$ due to the treatment $X$, namely $E(Y^{1}) - E(Y^{0})$.
For both studies, when we observe the outcome $Y$ for a participant who received treatment level $X = x$, counterfactual consistency implies that we have observed that participant's potential outcome $Y^{x}$. In the first (experimental) study design, this allows us to observe 1/2 of the potential outcomes needed: within each randomized arm, the relevant potential outcome is observed only for the half of participants whose outcome was ascertained.
In the second (observational) study design, counterfactual consistency again allows us to observe 1/2 of the potential outcomes, namely $Y^{1}$ among the treated half of the population and $Y^{0}$ among the untreated half; the counterfactual outcomes $Y^{0}$ among the treated and $Y^{1}$ among the untreated remain unobserved.
Because potential outcomes are missing due to an unknown mechanism under either design, we cannot with certainty obtain an unbiased estimator of the expected difference in the outcome due to treatment, $E(Y^{1}) - E(Y^{0})$. Without further assumptions, the best we can do is bound this difference: under either design, each treatment-specific mean is a 50–50 mixture of a component identified by the observed data and a component about which the data carry no information, so $E(Y^{1})$ and $E(Y^{0})$ are each bounded within an interval of width 1/2, and their difference within an interval of width 1 (see the Appendix).
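As a minimal illustration of this bounding argument, the sketch below (with illustrative function names) computes the bounds on $E(Y^{1}) - E(Y^{0})$ implied by each design from the identified quantities alone, setting the unidentified mean to its extreme values of 0 and 1.

```python
# Minimal illustrative sketch: bounds on E(Y^1) - E(Y^0) under each design.

def experiment_bounds(risk_treated_obs, risk_untreated_obs, p_missing):
    """Randomized experiment with outcome missingness that is equal in the two
    arms: risks among those with observed outcomes, plus the missing fraction."""
    lo1 = risk_treated_obs * (1 - p_missing)      # missing treated outcomes all nonevents
    hi1 = lo1 + p_missing                         # missing treated outcomes all events
    lo0 = risk_untreated_obs * (1 - p_missing)
    hi0 = lo0 + p_missing
    return (lo1 - hi0, hi1 - lo0)

def observational_bounds(risk_treated, risk_untreated, p_treated):
    """Observational study with complete outcome ascertainment and treatment
    prevalence p_treated; the counterfactual means are the unidentified terms."""
    lo1 = risk_treated * p_treated                # untreated all nonevents under treatment
    hi1 = lo1 + (1 - p_treated)                   # untreated all events under treatment
    lo0 = risk_untreated * (1 - p_treated)
    hi0 = lo0 + p_treated
    return (lo1 - hi0, hi1 - lo0)
```

For the worked example in the next section, `experiment_bounds(0.1, 0.2, 0.5)` and `observational_bounds(0.1, 0.2, 0.5)` both return (−0.55, 0.45): bounds of identical width 1.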
3 An example
Table 1 provides a simple worked numerical example. In the upper panel of Table 1 the experimental study is represented. There are equal proportions of treated and untreated participants (0.5 of the study population each), and within each arm the outcome is missing for half of the participants (0.25 of the study population). Among treated participants with an observed outcome, the risk of the event is 0.025/0.25 = 0.1; among untreated participants with an observed outcome, the risk is 0.050/0.25 = 0.2. The observed risk difference is therefore −0.1, and, as derived in the Appendix, the bounds on the causal risk difference are −0.55 and 0.45.
Table 1: Illustrative example of a randomized experiment and an observational study.

Randomized experiment.

| Treatment | Missing outcomes | Observed nonevents | Observed events | Total |
|---|---|---|---|---|
| $X=1$ | 0.25 | 0.225 | 0.025 | 0.5 |
| $X=0$ | 0.25 | 0.200 | 0.050 | 0.5 |
| Total | 0.50 | 0.425 | 0.075 | 1.0 |

Observed risk difference: 0.025/0.25 − 0.050/0.25 = 0.1 − 0.2 = −0.1.
Bounds: −0.55, 0.45.

Observational study.

| Treatment | Nonevents if untreated ($Y^{0}=0$) | Nonevents if treated ($Y^{1}=0$) | Events if untreated ($Y^{0}=1$) | Events if treated ($Y^{1}=1$) | Total |
|---|---|---|---|---|---|
| $X=1$ | ? | 0.45 | ? | 0.05 | 0.5 |
| $X=0$ | 0.40 | ? | 0.10 | ? | 0.5 |
| Total | ? | ? | ? | ? | 1.0 |

Observed risk difference: 0.05/0.5 − 0.10/0.5 = 0.1 − 0.2 = −0.1.
Bounds: −0.55, 0.45.

Cells marked "?" are counterfactual outcomes that are not observed.
In the lower panel of Table 1 the observational (cohort) study is represented. Again, there are equal proportions of treated and untreated participants, but now the outcome is observed for everyone: the risk is 0.05/0.5 = 0.1 among the treated and 0.10/0.5 = 0.2 among the untreated, so the observed risk difference is again −0.1. What is missing in this design is not outcome data but the counterfactual outcomes, marked "?" in Table 1: the outcomes the treated would have had without treatment, and the outcomes the untreated would have had with treatment.
The upper bound occurs if we assume all untreated participants would have experienced an event if, contrary to fact, they had been treated, and all treated participants would have not experienced the event if, contrary to fact, they had been untreated. Under this scenario, we assume that all 0.5 of the untreated participants are events under treatment and none of the 0.5 treated participants are events under no treatment, so that $E(Y^{1}) = 0.05 + 0.5 = 0.55$, $E(Y^{0}) = 0.10 + 0 = 0.10$, and the risk difference attains its maximum of 0.45. The lower bound of −0.55 follows from the opposite extreme assumptions. These are exactly the bounds obtained from the experiment in the upper panel.
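The arithmetic of Table 1 can be checked directly; the sketch below (illustrative, using the cell proportions from each panel) recovers the observed risk differences and the bounds by filling the unknown cells with their two extreme values.

```python
# Illustrative sketch: reproduce the Table 1 arithmetic (up to floating-point rounding).

# Upper panel: randomized experiment (cells are proportions of the population).
exp = {"treated":   {"missing": 0.25, "nonevents": 0.225, "events": 0.025},
       "untreated": {"missing": 0.25, "nonevents": 0.200, "events": 0.050}}

risk = {g: c["events"] / (c["events"] + c["nonevents"]) for g, c in exp.items()}
rd_exp = risk["treated"] - risk["untreated"]                        # 0.1 - 0.2 = -0.1

# Fill the missing outcomes with all nonevents (lower) or all events (upper).
lo1, hi1 = exp["treated"]["events"] / 0.5, (exp["treated"]["events"] + 0.25) / 0.5
lo0, hi0 = exp["untreated"]["events"] / 0.5, (exp["untreated"]["events"] + 0.25) / 0.5
print(rd_exp, lo1 - hi0, hi1 - lo0)                                 # -0.1, -0.55, 0.45

# Lower panel: observational study; the "?" cells are counterfactual outcomes.
obs = {"treated":   {"events": 0.05, "nonevents": 0.45},            # Y^1 observed
       "untreated": {"events": 0.10, "nonevents": 0.40}}            # Y^0 observed

rd_obs = obs["treated"]["events"] / 0.5 - obs["untreated"]["events"] / 0.5    # -0.1
# Fill the "?" event cells with 0 (lower) or the full 0.5 of the other group (upper).
lo1, hi1 = obs["treated"]["events"], obs["treated"]["events"] + 0.5
lo0, hi0 = obs["untreated"]["events"], obs["untreated"]["events"] + 0.5
print(rd_obs, lo1 - hi0, hi1 - lo0)                                 # -0.1, -0.55, 0.45
```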
4 Discussion
With a pair of treatments and assuming no additional context beyond what is provided above, the bounds for the risk difference from an observational study are equivalent to the analogous bounds for a randomized experiment with 50% missing outcomes. Of course, the described experiment may be labeled as "broken" because of the missing outcomes (Frangakis and Rubin 1999; Little et al. 2012). However, experience suggests that all real-world experiments are broken to varying degrees. In real-world settings, each successive additional piece of context will unbalance the equivalence given in this example, giving preference to one design over the other. A central point of this paper is that we cannot conclude one design is better than the other without additional context. Such additional context could be used to "unbalance" the equivalence and allow for an informed design choice. Yet identifying this balancing point, or equivalence, between a randomized experiment with missing outcome assessments and an observational study is useful. When a randomized experiment is feasible, this equivalence indicates that an experiment will be preferable to an observational study provided less than 50% of outcome assessments in the experiment are missing. However, experiments are sometimes unethical and are often prohibitively expensive. When a randomized experiment is infeasible, nonexperimental observational studies can help to refine our knowledge, that is, sharpen our (probabilistic) bounds on the effect of a treatment. Moreover, balancing points such as the one described here provide a natural foundation when considering both the design and interpretation of experimental and nonexperimental studies.
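To make the 50% threshold concrete, here is a minimal sketch, assuming missingness is equal in the two arms and that no further assumptions are imposed: the experiment's risk-difference bounds have width twice the missing fraction, while the observational bounds always have width $P(X=1) + P(X=0) = 1$.

```python
# Illustrative sketch: bound widths as a function of the missing-outcome
# fraction m (equal in both arms). Each arm-specific mean in the experiment is
# bounded within an interval of width m, so its risk-difference bounds have
# width 2m; the observational bounds have width 1 regardless of prevalence.

def experiment_width(m):
    return 2 * m

OBSERVATIONAL_WIDTH = 1.0

for m in (0.0, 0.25, 0.5, 0.75):
    w = experiment_width(m)
    verdict = ("experiment narrower" if w < OBSERVATIONAL_WIDTH
               else "tie" if w == OBSERVATIONAL_WIDTH
               else "observational narrower")
    print(f"missing fraction {m:.2f}: widths {w:.2f} vs {OBSERVATIONAL_WIDTH:.2f} -> {verdict}")
```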
Funding statement: Dr. Cole was supported in part by grants R01AI100654, R24AI067039, U01AI103390, and P30AI50410 from the National Institutes of Health. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Appendix: Proof
For each design we will calculate the two bounds for the expected difference in potential outcomes, $E(Y^{1}) - E(Y^{0})$, and show that the bounds coincide.
First consider the randomized experiment, where $R$ indicates whether the outcome was ascertained ($R = 1$ if observed, $R = 0$ if missing). For $x = 0, 1$,
$$E(Y^{x}) = E(Y^{x} \mid X = x) = E(Y \mid X = x) = E(Y \mid X = x, R = 1)\,P(R = 1 \mid X = x) + E(Y \mid X = x, R = 0)\,P(R = 0 \mid X = x).$$
In the above, the first equality holds by randomization of treatment $X$, the second by counterfactual consistency, and the third by the law of total expectation.
For the experiment described in the main text, missingness does not depend on treatment group, implying $P(R = 1 \mid X = x) = P(R = 1) = 1/2$.
The first, second and fourth terms are identified in the observed data, but the third term, $E(Y \mid X = x, R = 0)$, is not. Therefore, setting the unidentified third term to its minimum possible value of 0 and its maximum possible value of 1 yields
$$\tfrac{1}{2}E(Y \mid X = x, R = 1) \;\le\; E(Y^{x}) \;\le\; \tfrac{1}{2}E(Y \mid X = x, R = 1) + \tfrac{1}{2},$$
and hence bounds on the risk difference of
$$\tfrac{1}{2}\{E(Y \mid X = 1, R = 1) - E(Y \mid X = 0, R = 1)\} - \tfrac{1}{2} \;\le\; E(Y^{1}) - E(Y^{0}) \;\le\; \tfrac{1}{2}\{E(Y \mid X = 1, R = 1) - E(Y \mid X = 0, R = 1)\} + \tfrac{1}{2}.$$
Likewise, in the observational study we can expand $E(Y^{x})$ by the law of total expectation as
$$E(Y^{x}) = E(Y^{x} \mid X = x)\,P(X = x) + E(Y^{x} \mid X = 1 - x)\,P(X = 1 - x).$$
By counterfactual consistency we have $E(Y^{x} \mid X = x) = E(Y \mid X = x)$, and by design $P(X = x) = P(X = 1 - x) = 1/2$.
Again, the first, second and fourth terms are identified in the observed data, but the third term, $E(Y^{x} \mid X = 1 - x)$, is not. Therefore, setting the unidentified third term to its minimum of 0 and maximum of 1 yields
$$\tfrac{1}{2}E(Y \mid X = x) \;\le\; E(Y^{x}) \;\le\; \tfrac{1}{2}E(Y \mid X = x) + \tfrac{1}{2},$$
and hence bounds on the risk difference of
$$\tfrac{1}{2}\{E(Y \mid X = 1) - E(Y \mid X = 0)\} - \tfrac{1}{2} \;\le\; E(Y^{1}) - E(Y^{0}) \;\le\; \tfrac{1}{2}\{E(Y \mid X = 1) - E(Y \mid X = 0)\} + \tfrac{1}{2}.$$
Because the observed risks $E(Y \mid X = x, R = 1)$ in the experiment equal the observed risks $E(Y \mid X = x)$ in the observational study (0.1 among the treated and 0.2 among the untreated in Table 1), the two designs yield identical bounds of −0.55 and 0.45.
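As a numerical check of the closed-form bounds above, an illustrative sketch using the Table 1 risks of 0.1 among the treated and 0.2 among the untreated:

```python
# Illustrative check: evaluate the appendix bounds at the Table 1 values, where
# E(Y | X=1, R=1) = E(Y | X=1) = 0.1 and E(Y | X=0, R=1) = E(Y | X=0) = 0.2.
r1, r0 = 0.1, 0.2

experiment = (0.5 * (r1 - r0) - 0.5, 0.5 * (r1 - r0) + 0.5)      # 50% missing per arm
observational = (0.5 * (r1 - r0) - 0.5, 0.5 * (r1 - r0) + 0.5)   # 1:1 treated/untreated

print(experiment, observational)   # (-0.55, 0.45) in both designs
```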
References
Robins, J. M. (1989). The analysis of randomized and nonrandomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. In: Health Service Research Methodology: A Focus on AIDS, L. Sechrest, H. Freeman, and A. Mulley (Eds.), 113–159. Washington, DC: US Public Health Service.
Manski, C. F. (1990). Nonparametric bounds on treatment effects. The American Economic Review, 80:319–323.
Balke, A., and Pearl, J. (1994). Counterfactual probabilities: Computational methods, bounds, and applications. In: Uncertainty in Artificial Intelligence, R. Lopez de Mantaras and D. Poole (Eds.), 46–54. San Mateo, CA: Morgan Kaufmann. doi:10.1016/B978-1-55860-332-5.50011-0
Cole, S. R., Hudgens, M. G., Brookhart, M. A., and Westreich, D. (2015). Risk. American Journal of Epidemiology, 181(4):246–250. doi:10.1093/aje/kwv001
Fisher, R. A. (1926). The arrangement of field experiments. Journal of the Ministry of Agriculture of Great Britain, 33:503–513.
Cole, S. R., and Frangakis, C. E. (2009). The consistency statement in causal inference: A definition or an assumption? Epidemiology, 20(1):3–5. doi:10.1097/EDE.0b013e31818ef366
VanderWeele, T. J. (2009). Concerning the consistency assumption in causal inference. Epidemiology, 20(6):880–883. doi:10.1097/EDE.0b013e3181bd5638
Neyman, J., Dabrowska, D. M., and Speed, T. P. (1990). On the application of probability theory to agricultural experiments: Essay on principles, Section 9 (1923). Statistical Science, 5:465–480. doi:10.1214/ss/1177012032
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66:688–701. doi:10.1037/h0037350
Robins, J. M. (1986). A new approach to causal inference in mortality studies with a sustained exposure period: Application to control of the healthy worker survivor effect. Mathematical Modelling, 7:1393–1512. doi:10.1016/0270-0255(86)90088-6
Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81:945–970. doi:10.1080/01621459.1986.10478354
Pearl, J. (2010). On the consistency rule in causal inference: Axiom, definition, assumption, or theorem? Epidemiology, 21(6):872–875. doi:10.1097/EDE.0b013e3181f5d3fd
Frangakis, C. E., and Rubin, D. B. (1999). Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes. Biometrika, 86:365–379. doi:10.1093/biomet/86.2.365
Little, R. J., D'Agostino, R., Cohen, M. L., Dickersin, K., Emerson, S. S., Farrar, J. T., Frangakis, C., Hogan, J. W., Molenberghs, G., Murphy, S. A., Neaton, J. D., Rotnitzky, A., Scharfstein, D., Shih, W. J., Siegel, J. P., and Stern, H. (2012). The prevention and treatment of missing data in clinical trials. New England Journal of Medicine, 367(14):1355–1360. doi:10.1056/NEJMsr1203730