Abstract
Incomplete data are a prevalent complication in longitudinal studies because individuals drop out before the intended completion time. Methods currently available in commercial software for analyzing incomplete longitudinal data at best rely on the ignorability of the drop-outs. If the underlying missing-data mechanism is non-ignorable, statistical inferences may be biased. To remove this bias, joint models for the complete data and the drop-out process have been proposed, but these involve computational difficulties and untestable assumptions. Since the critical ignorability assumption cannot be verified from the observed part of the sample, several local sensitivity indices have been proposed in the literature. Specifically, Eftekhari Mahabadi (Second-order local sensitivity to non-ignorability in Bayesian inferences. Stat Med 2018;59:55–95) proposed a second-order local sensitivity tool for Bayesian analysis of cross-sectional studies and showed that it handles bias better than the first-order indices. In this paper, we extend this index to the Bayesian sensitivity analysis of normal longitudinal studies with drop-outs. The index is derived from a selection model for the drop-out mechanism and a Bayesian linear mixed-effects complete-data model. The presented formulas are calculated using posterior estimates and draws from the simpler ignorable model. The method is illustrated via simulation studies and a sensitivity analysis of real antidepressant clinical trial data. Overall, the numerical analysis showed that when repeated outcomes are subject to missingness, regression coefficient estimates are well approximated by a linear function in the neighbourhood of the MAR model, but there is a considerable amount of second-order sensitivity for the error-term and random-effect variances in the Bayesian linear mixed-effects model framework.
- Research ethics: Not applicable.
- Code availability: The simulation and data analysis codes for this study are available on Code Ocean.
- Author contributions: The authors have accepted responsibility for the entire content of this manuscript and approved its submission.
- Competing interests: The authors state no conflict of interest.
- Research funding: None declared.
- Data availability: The raw data can be obtained on request from the corresponding author.
Here, the details of the statistical inference and Bayesian sensitivity analysis for the bivariate longitudinal simulation study are given. Based on the simulation setup in Subsection 4.1, the logarithm of the joint probability function of the observed response variables and the missing indicator, needed for estimating the non-ignorable model, is
where the data are assumed to be sorted by missing pattern, so that the first N_1 (≤ N) cases are completely observed. In addition,
The above expectation can be calculated based on the following lemma [39]:
Lemma 1
Let U ∼ N(0, σ^2); then for all
Applying Lemma 1, the explicit form of the expectation in (13) is:
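As a numerical illustration of the kind of result Lemma 1 provides, the sketch below checks the standard normal-probit identity E[Φ(a + bU)] = Φ(a / √(1 + b^2 σ^2)) for U ∼ N(0, σ^2). That the lemma takes exactly this form is an assumption made here; the function names and the values of a, b and σ are hypothetical and chosen only for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def expected_probit(a, b, sigma, n=20001, width=10.0):
    """Trapezoid-rule value of E[Phi(a + b*U)] for U ~ N(0, sigma^2)."""
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        u = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * norm_cdf(a + b * u) * norm_pdf(u / sigma) / sigma
    return total * h

def closed_form(a, b, sigma):
    """Assumed closed form: Phi(a / sqrt(1 + b^2 sigma^2))."""
    return norm_cdf(a / math.sqrt(1.0 + (b * sigma) ** 2))

# Hypothetical values, not taken from the simulation study.
a, b, sigma = 0.4, -0.8, 1.5
numeric = expected_probit(a, b, sigma)
exact = closed_form(a, b, sigma)
print(numeric, exact)
```

The two printed numbers should agree to several decimal places, which is what allows an expectation over a normal random effect to be replaced by a single probit term.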
Hence, the logarithm of the non-ignorable joint pdf needed for the global sensitivity analysis plotted in Figures 1 and 2 can be rewritten as follows:
To calculate the Bayesian ISNI and ISSNI, only the ignorable model, obtained by setting ψ_1 = 0 in (16), needs to be fitted. Ignorable posterior inference for the vector of transformed parameters
is performed based on the following ignorable hierarchical posterior distributions:
where
and
We used R to draw posterior samples of the ignorable model parameters ϕ from the conditional posterior distributions (17)–(21). For this model, sampling was continued until 12,000 draws were available in total and, to obtain stable results, each missingness scenario was repeated 100 times. In addition, OpenBUGS was used to obtain posterior samples of ψ_00 and ψ_01 based on the probit model (7) and prior (8). The number of draws and repetitions was the same as in the previous sampling procedure. All posterior samples were checked for autocorrelation and for acceptably small Monte Carlo errors.
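The two checks mentioned above can be sketched as follows. This is a minimal illustration, not the authors' R or OpenBUGS code: it computes a lag-1 autocorrelation and a batch-means Monte Carlo standard error for a chain of 12,000 draws. The chain here is simulated independent noise standing in for real posterior draws, and the batch count of 30 is an arbitrary assumption.

```python
import math
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence of draws."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def batch_means_mcse(x, n_batches=30):
    """Monte Carlo standard error of the mean via non-overlapping batch means."""
    size = len(x) // n_batches
    means = [sum(x[i * size:(i + 1) * size]) / size for i in range(n_batches)]
    grand = sum(means) / n_batches
    var_between = sum((m - grand) ** 2 for m in means) / (n_batches - 1)
    return math.sqrt(var_between / n_batches)

# Stand-in for posterior draws of one parameter (hypothetical data).
rng = random.Random(1)
draws = [rng.gauss(0.0, 1.0) for _ in range(12000)]

rho1 = lag1_autocorr(draws)
mcse = batch_means_mcse(draws)
print(rho1, mcse)
```

A lag-1 autocorrelation near zero and a Monte Carlo error that is small relative to the posterior standard deviation are the usual signs that the retained draws can be treated as roughly independent.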
It should be noted that ϕ is a one-to-one function of the vector of parameters of interest
where
where
In the three-time longitudinal simulation setting of Section 4.2, let
Completers: M_i = (0, 0) for i = 1, …, N_1.
Subjects who drop out at the third visit: M_i = (0, 1) for i = N_1 + 1, …, N_2.
Subjects who drop out at the second visit: M_i = (1, 1) for i = N_2 + 1, …, N.
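The ordering of subjects by drop-out pattern can be sketched as below. The sketch assumes that the first visit is always observed and that M_i records missingness at the second and third visits; the data values and the helper name `dropout_pattern` are hypothetical.

```python
def dropout_pattern(y):
    """Return M = (M2, M3) for responses (y1, y2, y3), with None marking a missing value."""
    y1, y2, y3 = y
    if y1 is None:
        raise ValueError("the first visit is assumed to be observed")
    m = (int(y2 is None), int(y3 is None))
    if m == (1, 0):
        raise ValueError("non-monotone missingness is not a drop-out pattern")
    return m

# Hypothetical responses for five subjects over three visits.
responses = [
    (1.2, None, None),
    (0.7, 0.9, 1.4),
    (2.1, 1.8, None),
    (0.3, 0.5, 0.2),
    (1.0, None, None),
]

# Tuples sort lexicographically, so (0, 0) < (0, 1) < (1, 1):
# completers first, then third-visit drop-outs, then second-visit drop-outs.
ordered = sorted(responses, key=dropout_pattern)
patterns = [dropout_pattern(y) for y in ordered]
n1 = patterns.count((0, 0))        # number of completers, N_1
n2 = n1 + patterns.count((0, 1))   # N_2: completers plus third-visit drop-outs
print(patterns, n1, n2)
```

With the data sorted this way, the log-likelihood can be accumulated over the three index ranges 1..N_1, N_1+1..N_2 and N_2+1..N separately.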
Based on these assumptions and the drop-out mechanism described above, the logarithm of the non-ignorable joint pdf of the observed response variables and the missing indicators can be decomposed as follows:
where
and
The integral parts of (24) have a logistic-normal form [39], which can be rewritten as
and
To obtain explicit forms for the above expectations, the one-probit approximation [39] is applied:
where m = 1.7 is chosen according to the minimax criterion. Therefore, applying the one-probit approximation together with Lemma 1, the explicit forms of the expectations are:
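To give a feel for the accuracy of this step, the sketch below assumes the one-probit approximation takes the common form expit(x) ≈ Φ(x/m) with m = 1.7, and that combining it with a normal-probit identity gives E[expit(a + U)] ≈ Φ(a / √(m^2 + σ^2)) for U ∼ N(0, σ^2). Both forms, the function names and the values of a and σ are assumptions for illustration only.

```python
import math

M = 1.7  # scaling constant quoted in the text

def expit(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Largest pointwise gap between expit(x) and Phi(x / M) on a grid over [-10, 10].
grid = [i / 100.0 for i in range(-1000, 1001)]
max_err = max(abs(expit(x) - norm_cdf(x / M)) for x in grid)

def logistic_normal_mean(a, sigma, n=20001, width=10.0):
    """Trapezoid-rule value of E[expit(a + U)] for U ~ N(0, sigma^2)."""
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        u = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * expit(a + u) * norm_pdf(u / sigma) / sigma
    return total * h

def probit_approx(a, sigma):
    """Assumed closed-form approximation Phi(a / sqrt(M^2 + sigma^2))."""
    return norm_cdf(a / math.sqrt(M ** 2 + sigma ** 2))

a, sigma = 0.8, 1.2  # hypothetical values
numeric = logistic_normal_mean(a, sigma)
approx = probit_approx(a, sigma)
print(max_err, numeric, approx)
```

Because the pointwise gap between the two link functions is small, the closed-form value stays close to the numerically integrated logistic-normal expectation.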
and
where a = ψ_00 + 2ψ_01 + ψ_02 y_i1 and b = ψ_00 + 3ψ_01 + ψ_02 y_i2. Substituting the above formulas into (24), the logarithm of the non-ignorable joint pdf can be rewritten as follows:
As mentioned before, to calculate the Bayesian ISNI and ISSNI, ignorable posterior samples of the parameters β_1, σ (the square root of σ^2) and
Based on the ignorable posterior samples, ISNI and ISSNI are calculated as
and
where h(.) = expit(.),
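One way to read first- and second-order indices of this kind is as the slope and curvature, at the ignorable point, of an estimate viewed as a function of the non-ignorability parameter. The sketch below illustrates that reading with central finite differences on a toy function. It is not the posterior-draw formula used in this paper: the finite-difference technique, the function `toy_estimate` and its coefficients are all hypothetical stand-ins.

```python
def local_indices(estimate, h=1e-3):
    """Value, first derivative and second derivative of estimate(psi1) at psi1 = 0,
    obtained by central finite differences."""
    e0 = estimate(0.0)
    first = (estimate(h) - estimate(-h)) / (2.0 * h)
    second = (estimate(h) - 2.0 * e0 + estimate(-h)) / h ** 2
    return e0, first, second

def toy_estimate(psi1):
    """Hypothetical estimate as a function of the non-ignorability parameter."""
    return 1.0 + 0.5 * psi1 - 0.3 * psi1 ** 2

e0, first_order, second_order = local_indices(toy_estimate)

def quadratic_approx(psi1):
    """Second-order Taylor approximation around the ignorable model."""
    return e0 + first_order * psi1 + 0.5 * second_order * psi1 ** 2

print(first_order, second_order, quadratic_approx(0.2), toy_estimate(0.2))
```

When the second-order term is negligible, a linear approximation around the ignorable model is adequate; a sizeable second-order term signals that the linear approximation alone may understate or overstate the sensitivity.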
References
1. Rubin, DB. Inference and missing data (with Discussion). Biometrika 1976;63:581–92. https://doi.org/10.1093/biomet/63.3.581.
2. Little, RJA, Rubin, DB. Statistical analysis with missing data, 2nd ed. New York: Wiley; 2002. https://doi.org/10.1002/9781119013563.
3. Diggle, PJ, Kenward, MG. Informative drop-out in longitudinal data analysis. Appl Statist 1994;43:49–94. https://doi.org/10.2307/2986113.
4. Little, RJA. Modeling the drop-out mechanism in longitudinal studies. J Am Stat Assoc 1995;90:1112–21. https://doi.org/10.1080/01621459.1995.10476615.
5. Copas, JB, Li, HG. Inference for non-random samples (with Discussion). J R Stat Soc B 1997;59:55–95. https://doi.org/10.1111/1467-9868.00055.
6. Little, RJA, Rubin, DB. Statistical analysis with missing data. New York: Wiley; 1987.
7. Little, RJA. Pattern-mixture models for multivariate incomplete data. J Am Stat Assoc 1993;88:125–34. https://doi.org/10.1080/01621459.1993.10594302.
8. Troxel, AB, Harrington, DP, Lipsitz, SR. Analysis of longitudinal data with non-ignorable non-monotone missing values. J R Stat Soc Ser C Appl Stat 1998;47:425–38. https://doi.org/10.1111/1467-9876.00119.
9. Copas, JB, Eguchi, S. Local sensitivity approximations for selectivity bias. J R Stat Soc Series B Stat Methodol 2001;63:871–95. https://doi.org/10.1111/1467-9868.00318.
10. Hirano, K, Imbens, GW, Ridder, G, Rubin, DB. Combining panels with attrition and refreshment samples. Econometrica 2001;69:1645–59. https://doi.org/10.1111/1468-0262.00260.
11. Deng, Y, Hillygus, DS, Reiter, JP, Si, Y, Zheng, S. Handling attrition in longitudinal studies: the case for refreshment samples. Stat Sci 2013;28:238–56. https://doi.org/10.1214/13-sts414.
12. Daniels, MJ, Hogan, JW. Missing data in longitudinal studies: strategies for Bayesian modeling and sensitivity analysis. Boca Raton: Chapman & Hall/CRC; 2008. https://doi.org/10.1201/9781420011180.
13. Zhang, J, Heitjan, DF. Impact of nonignorable coarsening on Bayesian inference. Biostatistics 2007;8:722–43. https://doi.org/10.1093/biostatistics/kxm001.
14. Troxel, AB, Ma, G, Heitjan, DF. An index of local sensitivity to non-ignorability. Stat Sinica 2004;14:1221–37.
15. Xie, H, Heitjan, DF. Sensitivity analysis of causal inference in a clinical trial subject to crossover. Clin Trials 2004;1:21–30. https://doi.org/10.1191/1740774504cn005oa.
16. Zhang, J, Heitjan, DF. Nonignorable censoring in randomized clinical trials. Clin Trials 2005;2:488–96. https://doi.org/10.1191/1740774505cn128oa.
17. Zhang, J, Heitjan, DF. A simple local sensitivity analysis tool for nonignorable coarsening: application to dependent censoring. Biometrics 2006;62:1260–8. https://doi.org/10.1111/j.1541-0420.2006.00580.x.
18. Ma, G, Troxel, AB, Heitjan, DF. An index of local sensitivity to nonignorable dropout in longitudinal modeling. Stat Med 2005;24:2129–50. https://doi.org/10.1002/sim.2107.
19. Xie, H. A local sensitivity analysis approach to longitudinal non-Gaussian data with non-ignorable dropout. Stat Med 2008;27:3155–77. https://doi.org/10.1002/sim.3117.
20. Qian, Y, Xie, H. Measuring the impact of nonignorability in panel data with non-monotone nonresponse. J Appl Econom 2010;27:129–59. https://doi.org/10.1002/jae.1157.
21. Eftekhari Mahabadi, S, Ganjali, M. An index of local sensitivity to non-ignorability for multivariate longitudinal mixed-effect data with potential non-random dropout. Stat Med 2010;29:1779–92. https://doi.org/10.1002/sim.3948.
22. Xie, H. Analyzing longitudinal clinical trial data with nonignorable missingness and unknown missingness reasons. Comput Stat Data Anal 2012;56:1287–300. https://doi.org/10.1016/j.csda.2010.11.021.
23. Spagnoli, A, Marino, MF, Alfo, M. A bidimensional finite mixture model for longitudinal data subject to dropout. Stat Med 2018;37:2998–3011. https://doi.org/10.1002/sim.7698.
24. Eftekhari Mahabadi, S, Ganjali, M. An index of local sensitivity to non-ignorability for parametric survival models with potential non-random missing covariate: an application to the SEER cancer registry data. J Appl Stat 2012;39:2327–48. https://doi.org/10.1080/02664763.2012.710196.
25. Xie, H. Adjusting for nonignorable missingness when estimating generalized additive models. Biom J 2010;52:186–200. https://doi.org/10.1002/bimj.200900202.
26. Xie, H, Qian, Y, Qu, L. A semiparametric approach for analyzing nonignorable missing data. Stat Sin 2011;21:1881–99. https://doi.org/10.5705/ss.2009.252.
27. Xie, H. Bayesian inference from incomplete longitudinal data: a simple method to quantify sensitivity to nonignorable dropout. Stat Med 2009;28:2725–45. https://doi.org/10.1002/sim.3655.
28. Eftekhari Mahabadi, S, Ganjali, M. A Bayesian approach for sensitivity analysis of incomplete multivariate longitudinal data with potential nonrandom dropout. Metron 2015;73:397–417. https://doi.org/10.1007/s40300-015-0063-6.
29. Gao, W, Hedeker, D, Mermelstein, R, Xie, H. A scalable approach to measuring the impact of nonignorable nonresponse with an EMA application. Stat Med 2016;35:5579–602. https://doi.org/10.1002/sim.7078.
30. Yuan, C, Hedeker, D, Mermelstein, R, Xie, H. A tractable method to account for high-dimensional nonignorable missing data in intensive longitudinal data. Stat Med 2020;39:2589–605. https://doi.org/10.1002/sim.8560.
31. Eftekhari Mahabadi, S. Second-order local sensitivity to non-ignorability in Bayesian inferences. Stat Med 2018;59:55–95.
32. Lesaffre, E, Lawson, AB. Bayesian Biostatistics. New Jersey: Wiley; 2012. https://doi.org/10.1002/9781119942412.
33. Heckman, JJ. Sample selection bias as a specification error. Econometrica 1979;47:153–61. https://doi.org/10.2307/1912352.
34. Laird, NM, Ware, JH. Random effects models for longitudinal data. Biometrics 1982;38:963–74. https://doi.org/10.2307/2529876.
35. Malik, HJ. Logistic distribution. Encyclopedia of statistical sciences, vol. 5. New York: Wiley; 1995.
36. Liu, C. Robit regression: a simple robust alternative to logistic and probit regression. In: Applied Bayesian modeling and causal inference from incomplete-data perspectives. London: Wiley; 2004. https://doi.org/10.1002/0470090456.ch21.
37. Goldstein, DJ, Lu, Y, Detke, MJ, Wiltse, C, Mallinckrodt, C, Demitrack, MA. Duloxetine in the treatment of depression: a double-blind placebo-controlled comparison with paroxetine. J Clin Psychopharmacol 2004;24:389–99. https://doi.org/10.1097/01.jcp.0000132448.65972.d9.
38. Gelman, A, Jakulin, A, Pittau, GM, Su, YS. A weakly informative default prior distribution for logistic and other regression models. Ann Appl Stat 2008;2:1360–83. https://doi.org/10.1214/08-aoas191.
39. Demidenko, E. Mixed models: theory and applications with R, 2nd ed. New York: Wiley; 2013.
© 2023 Walter de Gruyter GmbH, Berlin/Boston