Abstract
Clinic-based cohort studies enroll patients when they are first admitted to the clinic and follow them as part of usual care; the quantity of interest is the marginal mean of the outcome process. Because the required frequency of follow-up varies among patients, these studies often feature irregular visit times, with no two patients sharing a visit time. Inverse-intensity weighting has been developed to handle this; however, it requires that the visit process be conditionally independent of the outcome given the observed history. When patients schedule visits in response to changes in their health (for example, a disease flare), the conditional independence assumption is no longer plausible, and the results are biased. We suggest additional information that can be collected to ensure that conditional independence holds, and examine how this information might be used in the analysis. This allows clinic-based cohort studies to be used to estimate longitudinal outcomes without incurring bias due to irregular follow-up.
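As a rough illustration of the weighting idea, the sketch below simulates a single time point at which the chance of a visit depends only on an observed covariate, so conditional independence holds by construction. It is a toy under stated assumptions, not the analysis used in the paper: the function name `simulate`, the covariate `z`, the visit probabilities, and the outcome model are all hypothetical, and the visit probability is treated as known, whereas in practice the visit intensity would be estimated from the data.

```python
import random


def simulate(n_patients=20000, seed=1):
    """Return (naive mean, inverse-intensity-weighted mean) of the observed outcomes.

    Toy setting (all values are illustrative assumptions):
      z ~ Bernoulli(0.5)           observed covariate, e.g. baseline severity
      y = 1 + 2*z + N(0, 1)        outcome, so the true marginal mean is 2.0
      P(visit | z) = 0.2 or 0.8    visits depend on z only, not on y given z
    """
    rng = random.Random(seed)
    visit_prob = {0: 0.2, 1: 0.8}  # hypothetical, assumed known here

    naive_sum = 0.0
    naive_n = 0
    weighted_sum = 0.0
    weight_total = 0.0

    for _ in range(n_patients):
        z = 1 if rng.random() < 0.5 else 0
        y = 1.0 + 2.0 * z + rng.gauss(0.0, 1.0)
        p = visit_prob[z]
        if rng.random() < p:  # the outcome is recorded only if a visit occurs
            naive_sum += y
            naive_n += 1
            w = 1.0 / p  # inverse of the visit probability
            weighted_sum += w * y
            weight_total += w

    return naive_sum / naive_n, weighted_sum / weight_total


if __name__ == "__main__":
    naive, weighted = simulate()
    print(f"naive mean of observed outcomes: {naive:.2f}")
    print(f"inverse-intensity-weighted mean: {weighted:.2f}")
```

In this setup the unweighted mean over-represents patients with `z = 1` and should land near 2.6 rather than the true 2.0, while the weighted mean should be close to 2.0. If the visit probability instead depended on `y` itself after conditioning on `z` (for example, patients attending because of a flare that is not otherwise recorded), weights based on `z` alone would no longer remove the bias, which is the situation the abstract describes.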
Funding statement: EMP received operating funds from the Natural Sciences and Engineering Research Council, and is supported by a New Investigator Award from the Canadian Institutes of Health Research. The funding agreements ensured the author’s independence in designing the study, interpreting the data, writing, and publishing the report.
Declarations of interest: none.
© 2020 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Editorial
- The mean prevalence
- Research Articles
- Heterogeneous indirect effects for multiple mediators using interventional effect models
- Sleep habits and their association with daytime sleepiness among medical students of Tanta University, Egypt
- Population attributable fractions for continuously distributed exposures
- A real-time search strategy for finding urban disease vector infestations
- Disease mapping models for data with weak spatial dependence or spatial discontinuities
- A comparison of cause-specific and competing risk models to assess risk factors for dementia
- A simple index of prediction accuracy in multiple regression analysis
- A comparison of approaches for estimating combined population attributable risks (PARs) for multiple risk factors
- Posterior predictive treatment assignment methods for causal inference in the context of time-varying treatments
- Random effects tumour growth models for identifying image markers of mammography screening sensitivity
- Extrapolating sparse gold standard cause of death designations to characterize broader catchment areas
- Extending balance assessment for the generalized propensity score under multiple imputation
- Regression analysis of unmeasured confounding
- The Use of Logic Regression in Epidemiologic Studies to Investigate Multiple Binary Exposures: An Example of Occupation History and Amyotrophic Lateral Sclerosis
- Meeting the Assumptions of Inverse-Intensity Weighting for Longitudinal Data Subject to Irregular Follow-Up: Suggestions for the Design and Analysis of Clinic-Based Cohort Studies