Abstract
Many causal models of interest in epidemiology involve longitudinal exposures, confounders and mediators. However, repeated measurements are not always available or used in practice, leading analysts to overlook the time-varying nature of exposures and work under over-simplified causal models. Our objective is to assess whether, and how, causal effects identified under such misspecified causal models relate to true causal effects of interest. We derive sufficient conditions ensuring that the quantities estimated in practice under over-simplified causal models can be expressed as weighted averages of longitudinal causal effects of interest. Unsurprisingly, these sufficient conditions are very restrictive, and our results indicate that the quantities estimated in practice should be interpreted with caution in general, as they usually do not relate to any longitudinal causal effect of interest. Our simulations further illustrate that the bias between the quantities estimated in practice and the weighted averages of longitudinal causal effects of interest can be substantial. Overall, our results confirm the need for repeated measurements to conduct proper analyses and/or the development of sensitivity analyses when they are not available.
Acknowledgments
The authors are grateful to Stijn Vansteelandt for insightful comments on preliminary versions of this article, and to the reviewers of the International Journal of Biostatistics for valuable comments and suggestions.
- Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
- Research funding: None declared.
- Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
- Disclaimers: Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization.
Appendix A. Proof of Theorem 1
Consider a longitudinal model (L) and assume that the only available data regarding the exposure of interest consists in
Now assume that there exists some
Because
with
where the second equality comes from the fact that
where the sums are over all
The proof of the result under condition (T.Uncond) follows from similar, but simpler, arguments and is therefore omitted.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2020-0081).
© 2021 Walter de Gruyter GmbH, Berlin/Boston