Abstract
In this paper, we review some important early developments on causal inference in medical statistics and epidemiology that were inspired by questions in oncology. We examine two classical examples from the literature and point to a current area of ongoing methodological development, namely the estimation of optimal adaptive treatment strategies. While causal approaches to analysis have become more routine in oncology research, many exciting challenges and open problems remain, particularly in the context of censored outcomes.
1 Introduction
Philosophers, theologians, and mathematicians have long sought to understand and define causality [1]. One of the earliest known writings on the topic comes to us from the fourth century BCE, during which Plato [2] wrote:
Everything that becomes or changes must do so owing to some cause; for nothing can come to be without a cause.
While notions of causality continued to be debated, refined, and rejected or reaffirmed [1], the seventeenth century saw a marked shift toward a view of causality as determinism, in which uncertainty arises from deficits in knowledge. This view is quite well-aligned with modern causal inference and, in particular, the potential outcomes framework that we describe below.
Statistical inference – distinct from prediction or classification – aims to quantify the effect of an exposure on some outcome. When data are ‘fairly’ allocated between the competing levels of (or options for) an exposure or treatment [3, 4], such quantification can be done with little elaboration and no need for models, whether parametric or otherwise. Unfortunately, it is often the case that such fair comparisons cannot be made. For example, when the treatment allocated to or taken by patients is associated with factors that might also influence their outcome, a situation known as confounding, one must question whether any differences in outcomes between treatment groups are solely attributable to the difference in treatments. It may be the case that the differences are a consequence – in whole or part – of differences in other factors such as pre-treatment health status. This scenario is so often encountered that design and analytic techniques to account for confounding have emerged as a central theme in the causal inference literature.
In the next section, we provide a short introduction to the potential outcomes framework, or the so-called Rubin causal model, and the principles that have grown from this modern view of causal inference. In Section 3, we highlight some historical questions about the causal origins of cancers that have driven methodological developments. We then turn to an example of a current area of research at the frontiers of causal inference that focuses on therapeutic treatment in Section 4. We end our discussion with some concluding remarks.
2 The potential outcomes framework and a template for causal analyses
2.1 Basic concepts
We begin by defining the notation used throughout this manuscript. Let Z denote treatment, X denote pre-treatment covariates, and Y denote an outcome; each may be indexed by i to indicate a specific individual in a sample of size n or in the population from which the sample is drawn. Define the potential outcome Y(z) as the random variable that records the outcome under the condition where treatment is set to z. In a binary treatment setting, where z takes values 0 and 1, each individual has two potential outcomes, Y(0) and Y(1). Under the assumptions discussed in the following subsection, the observed outcome is Y = (1 − Z)Y(0) + ZY(1). That is, the observed outcome equals the potential outcome under the treatment actually received; the potential outcome under the treatment not received is often termed a ‘counterfactual’ outcome. This idea of potential outcomes can be traced back to [5] but is more commonly attributed to Rubin [6], who used the notion of counterfactuals to propose the average treatment effect (ATE), E[Y_i(1) − Y_i(0)], as an estimand of interest. He used potential outcomes to highlight the fairness of the comparison: individuals, and thus their characteristics, are held fixed, so that any differences in expected outcomes must be attributed to the different treatment conditions. Though Rubin’s early work in causal inference was grounded in the educational sciences rather than oncology, the underlying difficulty is the same: many exposures of interest cannot be randomized. The potential outcomes framework formalizes the analysis of non-experimental data and, with care, allows effects to be attributed to causes.
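To make this notation concrete, the following is a minimal simulation sketch in Python (using numpy; all data-generating values are hypothetical choices for illustration) showing how randomization allows the ATE to be recovered by a simple difference in means, and how confounded treatment assignment distorts that same comparison:

```python
import numpy as np

rng = np.random.default_rng(2022)
n = 100_000

# Pre-treatment covariate (e.g., health status) and both potential outcomes.
x = rng.normal(size=n)
y0 = x + rng.normal(size=n)            # outcome if untreated
y1 = x + 1.0 + rng.normal(size=n)      # outcome if treated: true ATE = 1

# Fair (randomized) allocation: difference in means recovers the ATE.
z_rct = rng.binomial(1, 0.5, size=n)
y_rct = (1 - z_rct) * y0 + z_rct * y1  # consistency: observed = potential outcome
print(y_rct[z_rct == 1].mean() - y_rct[z_rct == 0].mean())  # close to 1.0

# Confounded allocation: sicker patients (low x) are more often treated.
z_obs = rng.binomial(1, 1 / (1 + np.exp(x)))
y_obs = (1 - z_obs) * y0 + z_obs * y1
print(y_obs[z_obs == 1].mean() - y_obs[z_obs == 0].mean())  # biased below 1.0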
2.2 Causal principles
In addition to focusing on estimands of interest that avoid confounding through conditional or ‘associational’ (but not necessarily causal) contrasts, the potential outcomes framework provides a foundation upon which each step in an analysis can be explicitly defined, from the formulation of the research question to the design of the analytic data set, and finally to the execution of the analysis. In a tutorial focused primarily on point exposures (rather than treatment sequences) and continuous outcomes, Goetghebeur and colleagues [7] outline key steps required to perform an analysis from which causal inferences may be drawn. The primary steps are: (1) define the causal question, which requires formalizing all operational variable definitions (treatment levels, outcome, and population of interest); (2) specify the estimand of interest, a statistical parameter that is the target effect of interest and that ‘answers’ the question defined in the first step; and (3) execute the estimation. The second step, specifying an estimand, typically relies on contrasts (i.e., differences and ratios) of expected potential outcomes. The final step can be done in a variety of ways – the specific methods that can be applied depend on the causal question, and it is unlikely that only a single approach would yield consistent estimators. However, to execute the estimation via any approach, the analyst should explicitly state all assumptions required for identifiability and estimation (including model specifications) and evaluate those assumptions through sensitivity analyses when direct testing of assumptions is not possible.
In the point exposure, continuous outcome setting, there are numerous valid and consistent methods for estimation, most of which rely on the assumption that a sufficient set of confounders, X, has been measured (often referred to as the no unmeasured confounders assumption). For example, traditional regression analyses and direct comparisons in matched or stratified samples can be performed. Methods based on a propensity score [8] can be used when there are concerns about correct specification or incomplete understanding of the outcome model. The propensity score, e(X) = P(Z = 1 | X), is the coarsest scalar-valued function of the pre-treatment covariates X that yields conditional independence of treatment and covariates. Rosenbaum and Rubin [8] demonstrated that both Z ⊥ X | e(X) and, provided treatment assignment is strongly ignorable given X, (Y(0), Y(1)) ⊥ Z | e(X), so that adjustment for the scalar e(X) alone suffices to remove confounding due to X. The propensity score may be used via matching, stratification, regression adjustment, or inverse probability weighting [9]; when the no unmeasured confounders assumption itself is in doubt, instrumental variable methods [10] offer an alternative route to identification.
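As an illustrative sketch only (assuming scikit-learn is available; the simulated data and coefficients below are hypothetical and mirror the confounded scenario above), the propensity score can be estimated by logistic regression and used to form an inverse probability weighted estimate of the ATE:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2022)
n = 100_000
x = rng.normal(size=n)
y0 = x + rng.normal(size=n)
y1 = x + 1.0 + rng.normal(size=n)                 # true ATE = 1
z = rng.binomial(1, 1 / (1 + np.exp(x)))          # confounded assignment
y = (1 - z) * y0 + z * y1

# Estimate the propensity score e(X) = P(Z = 1 | X) by logistic regression.
e_hat = (LogisticRegression()
         .fit(x.reshape(-1, 1), z)
         .predict_proba(x.reshape(-1, 1))[:, 1])

# Inverse probability weighting creates a pseudo-population in which X no
# longer predicts Z, restoring a 'fair' comparison between treatment groups.
ate_ipw = np.mean(z * y / e_hat) - np.mean((1 - z) * y / (1 - e_hat))
print(ate_ipw)  # close to 1.0 despite the confounding
```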
Whatever the approach adopted – and whether reliant on the propensity score or not – identification assumptions are needed. In particular, the Stable Unit Treatment Value Assumption is required, which ensures consistency of the potential and observed outcomes and further requires that an individual’s outcome is affected only by the treatment they receive and not by the treatment received by others. Positivity is also required: within every stratum of X, each treatment level must have a positive probability of being received, so that treatment assignment is nowhere deterministic. Finally, each method typically makes some assumptions about model specification, specifically that either the outcome model or the propensity score model is correctly specified (at least with respect to the confounding variables).
Estimation via parametric regression, matching, stratification, and inverse weighting extends naturally to discrete outcomes; however, when using parametric regression to estimate the ATE, one must also marginalize over the distribution of X, even when treatment-covariate interactions are not present. In contrast, the extension of propensity score regression to binary and censored outcomes is not straightforward [11, 12] – an important obstacle for its application in oncology research, where uncensored, continuous outcomes are less common. However, the core principles outlined in steps (1) and (2) of the workflow above, which call for the thoughtful specification of the research question and its subsequent linkage to a causal estimand with carefully stated definitions, remain unchanged.
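The marginalization point above can be sketched for a binary outcome as follows (a hypothetical illustration assuming scikit-learn): fit an outcome regression conditional on treatment and covariates, then standardize by averaging predictions over the empirical distribution of X with treatment set to each level. Because the odds ratio is not collapsible, the conditional coefficient on Z alone would not deliver the marginal contrast.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 200_000
x = rng.normal(size=n)
z = rng.binomial(1, 1 / (1 + np.exp(x)))          # confounded treatment

# Binary outcome whose probability depends on both treatment and covariate.
p = 1 / (1 + np.exp(-(-1.0 + 1.0 * z + 0.8 * x)))
y = rng.binomial(1, p)

# Outcome regression on (Z, X) ...
fit = LogisticRegression().fit(np.column_stack([z, x]), y)

# ... then standardization: predict for everyone with Z set to 1 and to 0,
# and average over the empirical distribution of X (marginalization).
p1 = fit.predict_proba(np.column_stack([np.ones(n), x]))[:, 1].mean()
p0 = fit.predict_proba(np.column_stack([np.zeros(n), x]))[:, 1].mean()
print(p1 - p0)   # marginal ATE on the risk-difference scale
```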
2.3 Special considerations for censored outcomes
As we will see in the examples of the following sections, there are a variety of outcomes of interest in oncology, e.g., cancer rates, mortality, and recurrence. These outcomes are often censored, which poses additional challenges for causal inference. Consider as an example the outcome ‘death due to throat cancer’ and the potential exposure ‘daily use of mouthwash.’ Suppose an individual, i, used mouthwash daily from the age of 20 and died of throat cancer at age 82. What is the counterfactual outcome for this individual? Perhaps, had this individual not used mouthwash daily, they would never have died of throat cancer (Y_i(0) = ∞) but would instead have died at age 70 of heart disease (known to be associated with poor oral hygiene and gum disease).
As argued by Greenland et al. [13], one cannot have a well-defined counterfactual that conditions on the absence of a competing risk. That is, there is no well-defined counterfactual that consists of “no daily use of mouthwash and no death by heart disease.” Alternative approaches have been considered. For instance, Robins [14] allowed the outcome to be undefined under some treatments; alternatively, the potential outcome may be regarded as well-defined even when a competing risk would occur and prevent its observation under that treatment [15]. Adapting the example in Appendix II of [13] to individual i above, under this latter understanding of a potential outcome, the individual would have developed throat cancer at age 90 had they not used mouthwash daily, even though death from heart disease at age 70 would have prevented that cancer from ever being observed. Thus, even the starting point of a causal survival analysis requires considerable care. These considerations relate to the estimand of interest, which may focus on the total effect of the treatment on the outcome of interest or on a ‘direct effect’ on the outcome that is not mediated by competing events [16].
The semi-parametric Cox proportional hazards model [17] is widely used in survival analysis; at the time of writing, the original article has been cited nearly 57,000 times, and the approach has become so ubiquitous that it is often used without citation of the original proposal. However, hazard ratios have come under considerable criticism for their lack of interpretability [13, 18–20]. For example, small (and possibly clinically unimportant) differences in survival probabilities may translate into large hazard ratios, and hazard ratios, like odds ratios, are not collapsible. Alternative estimands include the restricted mean survival time [20, 21] and the (marginal) cumulative incidence [16, 22].
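To illustrate the restricted mean survival time as an interpretable alternative, the following self-contained sketch (numpy only; the exponential event and censoring distributions and the truncation time tau are hypothetical choices) computes the Kaplan–Meier estimate in each arm and the between-arm difference in RMST, i.e., the difference in areas under the survival curves up to tau:

```python
import numpy as np

def kaplan_meier(time, event):
    """Kaplan–Meier estimate; returns unique event times and S(t)."""
    t_uniq = np.unique(time[event == 1])
    surv, s = [], 1.0
    for t in t_uniq:
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & (event == 1))
        s *= 1 - deaths / at_risk
        surv.append(s)
    return t_uniq, np.array(surv)

def rmst(time, event, tau):
    """Restricted mean survival time: area under S(t) from 0 to tau."""
    t, s = kaplan_meier(time, event)
    # Exact step-function integration of S(t) up to the truncation time tau.
    grid = np.concatenate([[0.0], t[t < tau], [tau]])
    vals = np.concatenate([[1.0], s[t < tau]])
    return np.sum(vals * np.diff(grid))

rng = np.random.default_rng(1)
n = 5_000
z = rng.binomial(1, 0.5, size=n)                  # randomized treatment
t_true = rng.exponential(scale=np.where(z == 1, 12.0, 10.0))
c = rng.exponential(scale=15.0, size=n)           # independent censoring
time, event = np.minimum(t_true, c), (t_true <= c).astype(int)

tau = 10.0  # mean survival over the first 10 time units
print(rmst(time[z == 1], event[z == 1], tau)
      - rmst(time[z == 0], event[z == 0], tau))
```

The resulting contrast is directly interpretable as extra time alive (here, over the first ten time units), in a way that a hazard ratio is not.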
We now turn to a particular group of etiological questions that has motivated considerable methodological development: occupational exposures and their effects on cancer deaths.
3 Occupational exposures
3.1 Early thoughts on causality in studies of workplace exposures
Causal inference was a subject of interest for many of the leading thinkers in our field long before it was formalized through potential outcomes. A survey of early studies of potentially hazardous occupational exposures and their influence on cancer risk highlights this history and sheds light on important early developments in causal inference. Such occupational exposure studies are fraught with challenges, due in part to the fact that randomized trials are both prohibitively expensive, given the long latency period, and ethically dubious, given the suspicion of harm. Thus, non-experimental cohort studies are the primary source of evidence on the effects of possibly carcinogenic workplace exposures in humans.
Concerning steps (1) and (2) of the causal analysis workflow given above, great care must be taken when defining the exposure. Is exposure defined as employment for a particular class of employer (mining industry), employment by a specific employer, or maybe employment in a particular role (e.g., the administrative assistant to the owner or the accountant of a mine will not face the same exposures as the miners)? Is exposure continuously measured (years of exposure) or somehow classified by time or intensity?
In a relatively early example of such a study (though by no means the first – see for example [23]), Doll and colleagues [24] based the classification of occupational exposure of gas workers on multiple factors. Specifically, participants were classified as ‘exposed’ if they were employed in the gas industry for more than five years between the ages of 40 and 65 at the point when observation began, and exposed participants were further divided into risk categories according to whether their specific role involved entry into the carbonizing plant regularly (“heavy exposure”), intermittently, or not at all.
In this study, as well as an earlier investigation of cancer risk of nickel workers [25], Doll was careful to consider alternative explanations for the observed relationship between exposure and cancer risk. In the study of gas workers [24], a small substudy was conducted in 10% of the workers to collect information on smoking habits; this allowed the authors to conclude that it was unlikely that confounding was grossly distorting the increased risk associated with heavy exposure. In terms of more general contextual confounding, in the study of nickel workers, Doll chose as ‘controls’ men in other occupations (grouping these into select occupations that included miners of materials other than nickel, steelworkers, coal miners, or all other occupations) from the same geographical areas as the nickel miners “to eliminate differences in the incidence of the two diseases due to non-industrial local factors” [25]. Although the term confounding was never used in these early studies, its presence was acknowledged and accounted for to the greatest extent possible.
The authors were careful to compare their results with those of other similar studies, and other forms of bias were also considered, including ascertainment bias (“The excess did not appear to be due to a bias in favor of diagnosing lung cancer among nickel workers” [25]) and what is now known as the healthy worker effect, which we discuss in the next section. Thus, even in the absence of causal inference as a formal specialization or without a language of its own within statistics, inferences of a causal nature were being drawn about occupational exposures.
It is perhaps not surprising, then, that one of the best-known texts on causality in the medical statistics literature is the 1965 lecture given by Sir Austin Bradford Hill to the recently formed Section of Occupational Medicine of the Royal Society of Medicine [26]. Reprinted in 2020 [27] and accompanied by a lengthy and varied discussion by luminaries in our field, it is in this pivotal lecture that Bradford Hill gave a series of conditions that may be used to assess whether a relationship is causal. These conditions are often (erroneously) referred to as the Bradford Hill criteria and comprise strength, consistency, specificity, temporality, biological gradient, plausibility, coherence, experiment, and analogy. It is interesting to note that even though randomization was well understood at the time, experiment appears rather low on the list; perhaps this is because the audience Bradford Hill was addressing was more likely to work with cohort studies than with experiments.
While informal, the conditions listed by Bradford Hill foreshadow many of the critical developments that followed in the field of causal inference. Specificity is more likely to hold when the exposure is well-defined. Experiment, strength of relationship, and perhaps also biological gradient speak to the need to avoid confounding and other sources of bias. Consistency links to the ideas of “stability” [28] and invariant prediction [29] found in more recent literature. The remaining conditions require a significant understanding of the substantive area in which the research question arises – a need to be, as Bradford Hill had previously suggested [30], bilingual: “[the statistician] must learn a great deal of medicine and […] not only have facility in speaking two languages, he must be able to think in two.” Many of the modern tools that are now quite standard, such as directed acyclic graphs [31], are of no use without this fluency and a solid understanding of the context of the research question.
While workplace safety has, thankfully, improved since the earliest studies cited here, there continue to be questions of occupational exposure that have driven more recent methodological developments and entirely new classes of estimators.
3.2 The healthy worker effect and G-methods
The healthy worker effect is an interesting paradox that has been recognized and discussed in occupational epidemiology for more than a century, and formal statistical solutions through counterfactual formulations were proposed as early as the 1980s. Previous authors [32, 33] have cited an 1885 letter from William Ogle to the Registrar-General of England and Wales as the first reference to this issue. In this letter, Ogle noted that mortality rates are occupation dependent, with those employed in more physically demanding work exhibiting lower mortality rates than those who were unemployed or employed in less physically taxing positions. In the study of workers in the gas industry introduced in the previous section, Doll [24] noted: “An alternative explanation is that some selective bias resulted in the inclusion of a relatively healthy group of employees.” That is, the demands of a physical job could act as a selective force in joining or remaining in the workforce, and this selection could distort apparent risk factors of a particular occupation. Later, in 1976, Fox and Collier [32] pointed to several origins of the selection of healthy workers, two of which are “the selection of a healthy population for employment” and “the survival in the industry of the healthier [workers].”
Early attempts to address this source of bias were themselves subject to bias. For instance, in the study of nickel workers, comparisons were made with men in “all other occupations” – a reference group deemed “the most suitable” – as well as with groups of men working in other industries where the work was similar in nature to that of nickel workers (e.g., workers in steel, aluminum, copper, spelter, and so on) that “might also carry a specific risk of lung cancer” [25]. It might be argued, however, that the physical demands faced by the latter groups are more similar to, and thus subject to the same selective pressures as, those in the nickel industry. How, then, to resolve this, if not through the selection of the comparator occupations?
In a series of papers, Robins [15, 34, 35] described the source of bias arising from the healthy worker effect graphically and proposed G-computation to address these biases and consistently estimate meaningful causal parameters. One of the insights that Robins raised is that occupational exposure must be viewed as time-varying and not as a point exposure. In framing the problem in this way, it becomes evident that exposure at one time point may determine subsequent exposure. For example, if exposure makes an individual sick such that they can no longer work, that intermediate variable of “became sick” then determines subsequent workplace exposure. Those individuals left in employment – and thus with the longest period of exposure – may be those least susceptible to the harmful effects, leading to the incorrect conclusion that the longer exposure is protective.
It is now widely understood and accepted that traditional regression models cannot account for variables that are simultaneously mediators (caused by exposure at a given point in time and leading to changes in the outcome) and confounders (affecting both exposure at subsequent time periods and the outcome) [36]. Indeed, since first introducing G-computation, Robins and colleagues have developed a series of methods for estimation of both conditional and marginal effects of longitudinal sequences of exposures [37–43]; see [44, 45] for an excellent survey of the methods and their relative strengths and frailties.
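To fix ideas, the following is a minimal sketch of G-computation for a two-interval exposure (simulated data; all structural coefficients are hypothetical, and scikit-learn is assumed available). The intermediate health variable is simulated under the exposure regime of interest and then integrated out, rather than conditioned on as a traditional regression adjustment would do:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 200_000

# Two-interval occupational exposure (a0, a1) with an intermediate health
# variable l1 that is both a mediator of a0 and a confounder of a1.
a0 = rng.binomial(1, 0.5, size=n)
l1 = -0.8 * a0 + rng.normal(size=n)               # exposure harms health
a1 = rng.binomial(1, 1 / (1 + np.exp(-l1)))       # healthier workers stay exposed
y = l1 - 0.5 * a1 + rng.normal(size=n)            # true E[Y(a0, a1)] = -0.8*a0 - 0.5*a1

# G-computation: model L1 given A0, and Y given (A0, L1, A1), then integrate
# L1 out under the fixed exposure regime instead of conditioning on it.
m_l = LinearRegression().fit(a0.reshape(-1, 1), l1)
m_y = LinearRegression().fit(np.column_stack([a0, l1, a1]), y)

def g_formula(a0_set, a1_set, draws=500_000):
    # Monte Carlo draws of L1 under exposure a0 (unit-variance residual
    # assumed known here, purely for illustration).
    l_sim = m_l.predict([[a0_set]])[0] + rng.normal(size=draws)
    return m_y.predict(np.column_stack([np.full(draws, a0_set), l_sim,
                                        np.full(draws, a1_set)])).mean()

# Effect of 'always exposed' vs 'never exposed': recovers -0.8 - 0.5 = -1.3,
# which a single regression adjusting for the mediator l1 would not.
print(g_formula(1, 1) - g_formula(0, 0))
```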
In the next section, we turn our attention away from the past and historical considerations of the causes (etiology) of cancer, and look instead to current and future research questions on treating cancer.
4 Precision medicine in oncology
There are many different clinical interpretations of the term ‘precision medicine.’ In the statistical literature, it is used synonymously with adaptive treatment strategies (ATS), individualized treatment rules, and dynamic treatment regimes [46–48].
It is well recognized that to account for potential delayed effects of treatment or for synergies between treatments given at different stages, the components of ATS must be studied together as a whole treatment package rather than as individual components of treatment. Sequential multiple assignment randomized trials [49–51], or SMARTs, have been developed as the gold standard, randomized means of assessing ATS. Kidwell [52, 53] considered the challenges and opportunities for SMART designs to investigate cancer treatments in particular and improve patient outcomes. Thall [54] discussed lessons learned from two specific SMARTs in advanced prostate cancer and metastatic kidney cancer. Indeed, the approach has now grown to be well accepted and is discussed in standard textbooks [55]; however, the number of full-scale SMARTs that have been conducted in oncology remains fairly limited.
As ATS seek to tailor treatment to individual patient features such as measures of health, age, cancer stage, or tumor characteristics, the goal of ATS analyses is to uncover heterogeneous treatment effects, i.e., treatment-covariate interactions. Heterogeneous treatment effects are often small in magnitude and thus difficult to detect from a statistical significance perspective. As such, analyses of non-experimental data remain important sources of information to discover and explore ATS. Of course, such data sources bring numerous challenges that require careful causal consideration, including the potential for biases due to confounding, selection, irregular follow-up, and so on. Further, analytic methods for ATS are most developed in the continuous outcome setting; however, binary outcomes (e.g., remission) or censored outcomes are common in oncology.
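A common estimation approach in this setting is Q-learning [46], which models stage-specific outcomes recursively and tailors each decision through treatment-covariate interactions. The following two-stage sketch on simulated data (scikit-learn; linear Q-functions; all coefficients are hypothetical, and censoring and confounding are deliberately absent, unlike in the application that follows) illustrates the recursion:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
n = 100_000

# Stage 1: baseline covariate and randomized treatment (as in a SMART).
x1 = rng.normal(size=n)
a1 = rng.binomial(1, 0.5, size=n)
# Stage 2: evolving covariate and a second randomized treatment.
x2 = 0.5 * x1 + rng.normal(size=n)
a2 = rng.binomial(1, 0.5, size=n)
# Final outcome (larger is better) with treatment-covariate interactions.
y = x1 + a1 * (1.0 - 1.5 * x1) + a2 * (0.5 - 1.0 * x2) + rng.normal(size=n)

def design(x, a):
    # Q-function regressors: covariate, treatment, and their interaction.
    return np.column_stack([x, a, a * x])

# Stage 2: regress Y on (x2, a2, a2*x2); the fitted 'blip' identifies who
# benefits from treatment at the second stage.
q2 = LinearRegression().fit(design(x2, a2), y)
blip2 = q2.predict(design(x2, np.ones(n))) - q2.predict(design(x2, np.zeros(n)))
a2_opt = (blip2 > 0).astype(int)

# Pseudo-outcome: the observed outcome, adjusted to reflect the optimal
# (rather than the received) stage-2 decision.
y_tilde = y + (a2_opt - a2) * blip2

# Stage 1: the same regression on the pseudo-outcome yields the stage-1 rule.
q1 = LinearRegression().fit(design(x1, a1), y_tilde)
print("estimated stage-1 blip: %.2f + %.2f * x1 (truth: 1.00 - 1.50 * x1)"
      % (q1.coef_[1], q1.coef_[2]))
```

Each stage's rule here is a simple linear threshold in one covariate; in practice the Q-functions would involve many tailoring variables and, in oncology applications, would need to accommodate censored outcomes.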
As an example, we will revisit the work of Krakow and colleagues [56, 57]: in this pair of papers, the authors considered how to tailor the sequence of immunosuppressant drugs given to blood cancer patients treated by allogeneic hematopoietic cell transplantation (AHCT). Such immunosuppressive drugs are often administered sequentially as part of the AHCT to prevent and (if needed) treat graft-versus-host disease (GVHD), a condition in which donor cells attack the recipient patient’s normal tissues. The authors considered the choice between non-specific highly T-lymphodepleting therapies (NHTL) versus alternatives, non-NHTL immunosuppressants. This general categorization of treatment as NHTL versus non-NHTL therapy was necessary because the delayed effects, potential interactions, and appropriateness of different immunosuppressant regimens were not well characterized at the time, particularly in light of the complex and evolving nature of individual patient characteristics.
To study the optimal sequence of immunosuppressants, the authors [56, 57] used data from the Center for International Blood and Marrow Transplant Research (CIBMTR), a consortium of 450 health care institutions worldwide that report longitudinal data on hematopoietic stem cell transplants, with records on over 425,000 transplant recipients. The CIBMTR collects data at specified time points (prior to transplantation, then at 100 days, 6 months, and annually thereafter until death). The information gathered is rich and includes the age and sex of the transplant recipient, disease type, date of diagnosis, pre-transplantation disease stage, graft source, treatment regimen, blood lab work, development of GVHD, disease progression and survival, secondary malignancies, and cause of death. Because there are more than 14 classes of agents used for GVHD prophylaxis and/or systemic treatment of acute GVHD, the number of patients with any specific immunosuppressant treatment trajectory was small. Thus, the categorization of immunosuppressants into NHTL or non-NHTL therapy is a significant coarsening of the treatment variable.
Another important challenge observed in the CIBMTR data was confounding: patients who were given NHTL, either prophylactically or to treat GVHD, typically had more advanced cancer at the time of transplant and were more likely to have a donor with mismatched human leukocyte antigens, which is known to increase the risk of GVHD. Other issues included the complexity of the outcome: the survival outcome was not only censored but also represented a mixture of patients who were and were not cured. Finally, unmeasured confounding – an issue that has only recently begun to receive attention in the ATS methodology literature – remains an important source of uncertainty that can undermine confidence in results; sensitivity analysis approaches for unmeasured confounding have, to date, been limited to the continuous outcome ATS setting.
5 Closing remarks
As we have discussed, the study of cancer outcomes – and in particular, of censored outcomes such as time to death and of potentially carcinogenic occupational exposures with long latency periods – has occupied the thoughts of some of the most respected thinkers in statistics, including Sir Richard Doll in the 1950s and 60s, Sir Austin Bradford Hill in the 1960s and 70s, and Jamie Robins from the late 1980s to today. Confounding and notions of balance or fairness have motivated many advances in causal inference and form the basis of both the potential outcomes framework and the development of the propensity score. Occupational exposures focused a lens on other, equally important issues, including selection bias and time-dependent confounding.
The methods developed to address these issues have made possible whole new areas of research, including the study of optimal ATS. An interesting point to consider, however, is the degree to which the causal literature, or indeed any statistical literature, can answer questions pertaining to clinical decision-making for individual patients. ATS seek to identify treatment strategies based on patient covariates, but many methods rely on outcome modelling (alone or in combination with treatment modelling in a doubly robust fashion), which ultimately involves some smoothing and averaging rather than the identification of truly personalized care. The conditional effects, or treatment effect heterogeneity, captured in ATS analyses are certainly many steps closer to truly individualized care than population-level estimands such as the average treatment effect, which are better suited to broader-scale public health policy. Nonetheless, the level of individualization supported by ATS analyses is often quite coarse, and further methodological innovations may be needed to better learn from sparse data in settings where the human toll caused by error is potentially very high.
The methods mentioned briefly here – causal diagrams, g-methods, and more – have mathematical formality and are supported by theoretical rigor, and yet could not have arisen without researchers willing to develop the bilingualism that Bradford Hill extolled, the symbiosis between statistician and physician to learn from one another and produce better science as a result.
Funding source: Fonds de Recherche du Québec - Santé
Funding source: Canadian Institutes of Health Research
Author contributions: The author has accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: EEMM is a Canada Research Chair (Tier 1) in Statistical Methods for Precision Medicine and acknowledges the support of a chercheur de mérite career award from the Fonds de Recherche du Québec, Santé and a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC).
Conflict of interest statement: The author declares no conflicts of interest regarding this article.
References
1. Hulswit M. From cause to causation: a Peircean perspective. Dordrecht: Kluwer Publishers; 2002. https://doi.org/10.1007/978-94-010-0297-4.
2. Plato. Timaeus and Critias (original c. 360 BCE, translation published 1971). Harmondsworth, Middlesex: Penguin Books Ltd; 1971.
3. Hill AB. Principles of medical statistics. London, UK: Lancet; 1937.
4. Petrarca F. Rerum senilium libri. Liber XIV, epistola 1. Letter to Boccaccio, V.3. Harmondsworth, UK: Penguin Books; 1364.
5. Neyman J. On the application of probability theory to agricultural experiments. Essay on principles. Section 9 (translation published in 1990). Stat Sci 1923;5:465–72. https://doi.org/10.1214/ss/1177012031.
6. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 1974;66:688–701. https://doi.org/10.1037/h0037350.
7. Goetghebeur E, le Cessie S, De Stavola B, Moodie EEM, Waernbaum I. Formulating causal questions and principled statistical answers. Stat Med 2020;39:4922–48. https://doi.org/10.1002/sim.8741.
8. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika 1983;70:41–55. https://doi.org/10.1093/biomet/70.1.41.
9. Robins JM, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J Am Stat Assoc 1995;90:106–21. https://doi.org/10.1080/01621459.1995.10476493.
10. Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. J Am Stat Assoc 1996;91:444–55. https://doi.org/10.1080/01621459.1996.10476902.
11. Li J, Handorf E, Bekelman J, Mitra N. Propensity score and doubly robust methods for estimating the effect of treatment on censored cost. Stat Med 2016;35:1985–99. https://doi.org/10.1002/sim.6842.
12. Zetterqvist J, Sjölander A. Doubly robust estimation with the R package drgee. Epidemiol Methods 2015;4. https://doi.org/10.1515/em-2014-0021.
13. Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Stat Sci 1999;14:29–46. https://doi.org/10.1214/ss/1009211805.
14. Robins JM. An analytic method for randomized trials with informative censoring. Lifetime Data Anal 1995;1:241–54. https://doi.org/10.1007/bf00985759.
15. Robins JM. A new approach to causal inference in mortality studies with sustained exposure periods – application to control of the healthy worker survivor effect. Math Model 1986;7:1393–512. https://doi.org/10.1016/0270-0255(86)90088-6.
16. Young J, Stensrud M, Tchetgen Tchetgen EJ, Hernán M. A causal framework for classical statistical estimands in failure-time settings with competing events. Stat Med 2020;39:1199–236. https://doi.org/10.1002/sim.8471.
17. Cox DR. Regression models and life-tables. J Roy Stat Soc B 1972;34:187–220. https://doi.org/10.1111/j.2517-6161.1972.tb00899.x.
18. Hernán MA. The hazards of hazard ratios. Epidemiology 2010;21:13–5. https://doi.org/10.1097/ede.0b013e3181c1ea43.
19. Martinussen T, Vansteelandt S. On collapsibility and confounding bias in Cox and Aalen regression models. Lifetime Data Anal 2013;19:279–96. https://doi.org/10.1007/s10985-013-9242-z.
20. Martinussen T, Vansteelandt S, Andersen P. Subtleties in the interpretation of hazard contrasts. Lifetime Data Anal 2020;26:833–55. https://doi.org/10.1007/s10985-020-09501-5.
21. Royston P, Parmar M. Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Med Res Methodol 2013;13:152. https://doi.org/10.1186/1471-2288-13-152.
22. Andersen P, Syriopoulou E, Parner E. Causal inference in survival analysis using pseudo-observations. Stat Med 2017;36:2669–81. https://doi.org/10.1002/sim.7297.
23. Henry SA. Cancer of the scrotum in relation to occupation. London: Oxford University Press; 1946.
24. Doll R, Fisher R, Gammon E, Gunn W, Hughes G, Tyrer F, et al. Mortality of gasworkers with special reference to cancers of the lung and bladder, chronic bronchitis, and pneumoconiosis. Br J Ind Med 1965;22:1–12. https://doi.org/10.1136/oem.22.1.1.
25. Doll R. Cancer of the lung and nose in nickel workers. Br J Ind Med 1958;15:217–23. https://doi.org/10.1136/oem.15.4.217.
26. Hill AB. The environment and disease: association or causation? Proc Roy Soc Med 1965;58:295–300. https://doi.org/10.1177/003591576505800503.
27. Hill AB. The environment and disease: association or causation? (Reprinted, with commentaries). Obs Studies 2020;6:1–65. https://doi.org/10.1353/obs.2020.0000.
28. Pearl J. Causality, 2nd ed. Cambridge, UK: Cambridge University Press; 2009.
29. Peters J, Bühlmann P, Meinshausen N. Causal inference by using invariant prediction: identification and confidence intervals. J Roy Stat Soc B 2016;78:947–1012. https://doi.org/10.1111/rssb.12167.
30. Hill AB. The statistician in medicine (Alfred Watson memorial lecture). J Inst Actuar 1962;88:178–91. https://doi.org/10.1017/s0020268100014980.
31. Pearl J. Causal diagrams for empirical research. Biometrika 1995;82:669–710. https://doi.org/10.1093/biomet/82.4.702.
32. Fox AJ, Collier PF. Low mortality rates in industrial cohort studies due to selection for work and survival in the industry. Br J Prev Soc Med 1976;30:225–30. https://doi.org/10.1136/jech.30.4.225.
33. Shah D. Healthy worker effect phenomenon. Indian J Occup Environ Med 2009;13:77–9. https://doi.org/10.4103/0019-5278.55123.
34. Robins JM. Addendum to “A new approach to causal inference in mortality studies with sustained exposure periods – application to control of the healthy worker survivor effect”. Comput Math Appl 1987;14:923–45. https://doi.org/10.1016/0898-1221(87)90238-0.
35. Robins JM. A graphical approach to the identification and estimation of causal parameters in mortality studies with sustained exposure periods. J Chron Dis 1987;40:139S–61S. https://doi.org/10.1016/s0021-9681(87)80018-8.
36. Moodie EEM, Stephens DA. Using directed acyclic graphs to detect limitations of traditional regression in longitudinal studies. Int J Publ Health 2010;55:701–3. https://doi.org/10.1007/s00038-010-0184-x.
37. Hernán MA, Cole SJ, Margolick J, Cohen M, Robins JM. Structural accelerated failure time models for survival analysis in studies with time-varying treatments. Pharmacoepidemiol Drug Saf 2005;14:477–91. https://doi.org/10.1002/pds.1064.
38. Murphy SA, van der Laan MJ, Robins JM, CPPRG. Marginal mean models for dynamic regimes. J Am Stat Assoc 2001;96:1410–23. https://doi.org/10.1198/016214501753382327.
39. Robins JM, Tsiatis AA. Semiparametric estimation of an accelerated failure time model with time-dependent covariates. Biometrika 1992;79:311–9. https://doi.org/10.2307/2336842.
40. Robins JM. The analysis of randomized and nonrandomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. New York: NCHSR, U.S. Public Health Service; 1989:113–59.
41. Robins JM. The control of confounding by intermediate variables. Stat Med 1989;8:679–701. https://doi.org/10.1002/sim.4780080608.
42. Robins JM. Marginal structural models versus structural nested models as tools for causal inference. In: The IMA volumes in mathematics and its applications, vol 116. New York, NY: Springer; 1999:95–134. https://doi.org/10.1007/978-1-4612-1284-3_2.
43. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology 2000;11:550–60. https://doi.org/10.1097/00001648-200009000-00011.
44. Daniel R, Cousens S, De Stavola B, Kenward M, Sterne J. Methods for dealing with time-dependent confounding. Stat Med 2013;32:1584–618. https://doi.org/10.1002/sim.5686.
45. Daniel R, De Stavola B, Cousens S. Time-varying confounding: some practical considerations in a likelihood framework. In: Berzuini C, Dawid A, Bernardinelli L, editors. Causality: statistical perspectives and applications. Chichester, UK: Wiley; 2012. https://doi.org/10.1002/9781119945710.ch17.
46. Chakraborty B, Moodie EEM. Statistical methods for dynamic treatment regimes: reinforcement learning, causal inference, and personalized medicine. New York, NY: Springer; 2013. https://doi.org/10.1007/978-1-4614-7428-9.
47. Murphy SA. Optimal dynamic treatment regimes (with discussion). J Roy Stat Soc B 2003;65:331–66. https://doi.org/10.1111/1467-9868.00389.
48. Robins JM. Optimal structural nested models for optimal sequential decisions. In: Lin DY, Heagerty P, editors. Proceedings of the second Seattle symposium on biostatistics. New York: Springer; 2004:189–326. https://doi.org/10.1007/978-1-4419-9076-1_11.
49. Collins L, Murphy SA, Strecher V. The multiphase optimization strategy (MOST) and the sequential multiple assignment randomized trial (SMART): new methods for more potent e-health interventions. Am J Prev Med 2007;32:S112–8. https://doi.org/10.1016/j.amepre.2007.01.022.
50. Lei H, Nahum-Shani I, Lynch K, Oslin D, Murphy S. A SMART design for building individualized treatment sequences. Annu Rev Clin Psychol 2012;8:21–48. https://doi.org/10.1146/annurev-clinpsy-032511-143152.
51. Oetting A, Levy J, Weiss R, Murphy SA. Statistical methodology for a SMART design in the development of adaptive treatment strategies. Arlington, VA: American Psychiatric Publishing; 2011:179–205. https://doi.org/10.1093/oso/9780199754649.003.0013.
52. Kidwell KM. SMART designs in cancer research: past, present, and future. Clin Trials 2014;11:445–56. https://doi.org/10.1177/1740774514525691.
53. Kidwell KM, Postow MA, Panageas KS. Sequential, multiple assignment, randomized trial designs in immuno-oncology research. Clin Cancer Res 2018;24:730–6. https://doi.org/10.1158/1078-0432.ccr-17-1355.
54. Thall PF. SMART design, conduct, and analysis in oncology. New York: SIAM; 2015. https://doi.org/10.1137/1.9781611974188.ch4.
55. Halabi S, Michiels S, editors. Textbook of clinical trials in oncology: a statistical perspective. Boca Raton, USA: CRC Press; 2019. https://doi.org/10.1201/9781315112084.
56. Krakow EF, Hemmer M, Wang T, Logan B, Arora M, Spellman S, et al. Tools for the precision medicine era: how to develop highly personalized treatment recommendations from cohort and registry data using Q-learning. Am J Epidemiol 2017;186:160–72. https://doi.org/10.1093/aje/kwx027.
57. Moodie EEM, Stephens DA, Alam S, Zhang MJ, Logan B, Arora M, et al. A cure-rate model for Q-learning: estimating an adaptive immunosuppressant treatment strategy for allogeneic hematopoietic cell transplant patients. Biom J 2019;61:442–53. https://doi.org/10.1002/bimj.201700181.