Abstract
Traditional survival analysis typically assumes that every subject will eventually experience the event of interest given a sufficiently long follow-up period. With advances in medical technology, however, researchers now frequently observe that some subjects never experience the event and can be regarded as cured. Traditional survival analysis also assumes that the failure time and the censoring time are independent, yet in practice the two are often dependent. Ignoring the cured subgroup and this dependence structure can bias model estimates. Among the methods for handling dependent censoring, frailty models require complex numerical integration and are sensitive to the assumed distribution of the latent variable. The copula method, in contrast, models the dependence between the variables flexibly and avoids strong assumptions about the latent variable structure, offering greater robustness and computational feasibility. This paper therefore proposes a copula-based method for dependent current status data with a cure fraction. In the modeling step, we use a logistic model for the susceptible rate and Cox proportional hazards models for the failure time and the censoring time. In the estimation step, we estimate the parameters by sieve maximum likelihood based on Bernstein polynomials. Extensive simulation experiments show that the proposed method is consistent and asymptotically efficient under various settings. Finally, we apply the method to lymph follicle cell data, verifying its effectiveness in practical data analysis.
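The data structure described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the simulation design used in the paper: the Clayton copula (applied to the two survival probabilities), the unit exponential baseline hazards, the single binary covariate, and the function name `simulate_cure_current_status` are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_cure_current_status(n, alpha, beta, gamma, theta, rng):
    """Simulate current status data with a cured subgroup and dependent censoring.

    Assumptions (not taken from the paper): one binary covariate, unit
    exponential baselines, and a Clayton copula with parameter theta > 0.
    """
    x = rng.binomial(1, 0.5, size=n).astype(float)
    # Logistic model for the probability of being susceptible (not cured).
    p_susceptible = 1.0 / (1.0 + np.exp(-(alpha[0] + alpha[1] * x)))
    susceptible = rng.random(n) < p_susceptible
    # Clayton copula draw (u, v) by conditional inversion.
    u = rng.random(n)
    w = rng.random(n)
    v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    # Cox models with unit exponential baselines: S(t | x) = exp(-t * exp(coef * x)).
    t = -np.log(u) / np.exp(beta * x)    # failure time
    c = -np.log(v) / np.exp(gamma * x)   # censoring (observation) time
    t[~susceptible] = np.inf             # cured subjects never fail
    # Only the observation time and the current status indicator are observed.
    delta = (t <= c).astype(int)
    return c, delta, x, susceptible
```

For example, `simulate_cure_current_status(500, (0.5, 1.0), 0.5, -0.3, 2.0, np.random.default_rng(0))` returns the observation times, the indicators I(T ≤ C), the covariate, and the latent susceptibility status, which would not be available in real data.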
Funding source: Youth Postdoctoral Program of Changbai Talent Program in Jilin Province
Award Identifier / Grant number: 202442110
Acknowledgements
This work was partly supported by the National Natural Science Foundation of China (No. 12271060) and the Youth Postdoctoral Program of Changbai Talent Program in Jilin Province (No. 202442110).
- Research ethics: Not applicable.
- Informed consent: Informed consent was obtained from all individuals included in this study, or their legal guardians or wards.
- Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
- Use of Large Language Models, AI and Machine Learning Tools: None declared.
- Conflict of interest: The authors state no conflict of interest.
- Research funding: None declared.
- Data availability: Not applicable.
To prove Theorems 1, 2, and 3, we first introduce some notation together with Lemmas 1 and 2. First, define the covering number of the class
For all parameters, there exists $\theta \in \Theta_n$ and
If such a κ does not exist, then
Lemma 1 (Calculation of Covering Number) Suppose conditions (A1), (A2), and (A4) hold. Then, the covering number of the class
where $m = o(n^{\nu})$, $0 < \nu < 1$, represents the degrees of freedom of the Bernstein polynomial, and $M_n = O(n^{c})$, $c > 0$, denotes the size of the sieve space $\Theta_n$.
Proof of Lemma 1:
Define two parameters
Additionally, by the calculation in [40] (p. 94), we can derive
The proof of Lemma 1 is complete.
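To make the sieve space in Lemma 1 concrete, the following sketch builds a Bernstein polynomial approximation of a nondecreasing function on an interval. It is an illustration under assumptions rather than the authors' implementation: `m` is treated here as the polynomial degree, the function names are hypothetical, and monotonicity is enforced through cumulative sums of exponentials, which is one common reparameterization.

```python
import numpy as np
from math import comb

def bernstein_basis(t, m, lower, upper):
    """Evaluate the m + 1 Bernstein basis polynomials of degree m on [lower, upper]."""
    s = (np.atleast_1d(np.asarray(t, dtype=float)) - lower) / (upper - lower)
    k = np.arange(m + 1)
    binom = np.array([comb(m, j) for j in k], dtype=float)
    # Row i holds B_0(t_i), ..., B_m(t_i); each row sums to one.
    return binom * s[:, None] ** k * (1.0 - s[:, None]) ** (m - k)

def monotone_bernstein(t, raw, lower, upper):
    """Nondecreasing Bernstein polynomial with coefficients cumsum(exp(raw))."""
    phi = np.cumsum(np.exp(np.asarray(raw, dtype=float)))  # 0 < phi_0 <= ... <= phi_m
    return bernstein_basis(t, len(phi) - 1, lower, upper) @ phi
```

Because the coefficients are increasing, the resulting polynomial is nondecreasing on the interval, so the unconstrained vector `raw` can be passed directly to a generic optimizer. A bound such as $M_n$ on the coefficients would have to be imposed separately.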
Lemma 2 (Uniform Convergence) Suppose conditions (A1), (A2), and (A4) hold. Then, we have
almost surely, where
Proof of Lemma 2:
We note that when the regularity conditions (A1), (A2), and (A4) are satisfied, $|l(\theta, X)|$ is bounded. Therefore, without loss of generality, $\sup_{\theta\in\Theta}|l(\theta, O)| \le 1$, so we have
Let $P_n^{\circ}$ denote a signed measure that places mass $\pm n^{-1}$ at each observation in the sample $\{O_1, O_2, \ldots, O_n\}$, with the random signs $\pm$ chosen independently of the $O_i$.
Therefore, based on the above inequality (B1) and Pollard [41], we can obtain the following inequality:
Given the sample observation data
Thus, we have
According to the definition of the covering number
Therefore, based on Hoeffding’s inequality [41], we can obtain the conclusion:
Based on inequalities (B2), (B3), and (B4), we can obtain the following inequality:
The right-hand side of inequality (B5) does not depend on the observation data
Combining inequality (B6) and the symmetry of the inequalities, we get the following inequality:
Therefore, we can conclude that
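The symmetrization device used in the proof of Lemma 2 can be checked numerically. The sketch below is only an illustration: it fixes a set of bounded values standing in for $l(\theta, O_i)$, attaches independent random signs as in the signed measure $P_n^{\circ}$, and compares the simulated tail probability with the Hoeffding bound $2\exp(-n\epsilon^2/2)$, which holds when every value lies in $[-1, 1]$. The function names are hypothetical, and the code does not reproduce inequalities (B1) to (B6) themselves.

```python
import numpy as np

def signed_measure_tail(values, eps, n_draws, rng):
    """Monte Carlo estimate of P(|n^{-1} sum_i (+/-) values_i| > eps) for fixed values."""
    values = np.asarray(values, dtype=float)
    n = values.size
    # Independent random signs, drawn without reference to the values.
    signs = rng.choice([-1.0, 1.0], size=(n_draws, n))
    signed_means = signs @ values / n
    return float(np.mean(np.abs(signed_means) > eps))

def hoeffding_bound(n, eps):
    """Hoeffding bound for the signed mean, valid when all |values_i| <= 1."""
    return 2.0 * np.exp(-n * eps ** 2 / 2.0)
```

With `n = 200` values in $[-1, 1]$ and `eps = 0.2`, the bound is roughly 0.037, and the simulated frequency should fall below it up to Monte Carlo error.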
Proof of Theorem 1:
Based on the conclusions of Lemma 1 and Lemma 2, it is known that the covering number of the class
For a given ϵ > 0, we define the following:
Therefore, we can prove that
If the parameter
Combining inequalities (B8) and (B9), we get
Therefore, we have $\xi_n \ge \delta_{\epsilon}$, which implies that
Proof of Theorem 2:
We use Theorem 3.4.1 of [40] to establish the convergence rate of
where
Proof of Theorem 3:
Define the linear span of $\Theta - \theta_0$ as $V$, where
We define the Fisher inner product on the space V as
Next, we define the linear function of θ as
We can easily derive that
in distribution. First, we need to prove that
First, by condition (A4) and Theorem 1.6.2 of [42], we know that for any $v^* \in \Theta_0$, there exists $\Pi_n v^* \in \Theta_n$ such that
For $I_1$, by condition (A1), condition (A2), Chebyshev's inequality, and $\|\Pi_n v^* - v^*\| = o(1)$, we can obtain $I_1 = \epsilon_n o_p(n^{-1/2})$. For $I_2$, we have
where
For $I_3$, we have
where
where the last inequality follows from the Cauchy-Schwarz inequality, along with
Therefore, we obtain
We now proceed to prove that $\|v^*\|^2 = h^{\top}\Sigma h$. Consider the parameter vector
where
Therefore, we can prove that
From the minimization approach defined in Equation (14) and following the discussion in Section 3.2 of [44], we can obtain
Therefore, by utilizing the following equation and the Cramér-Wold theorem, we complete the proof of the first part of Theorem 3.
As for the semiparametric efficiency of the parameter estimators, it can be directly established by applying Theorem 4 of [45].
References
1. Sun, J. The statistical analysis of interval-censored failure time data. New York: Springer; 2006, vol 3.
2. Farewell, VT. The use of mixture models for the analysis of survival data with long-term survivors. Biometrics 1982;38:1041–6. https://doi.org/10.2307/2529885.
3. Ghitany, M, Maller, RA, Zhou, S. Exponential mixture models with long-term survivors and covariates. J Multivariate Anal 1994;49:218–41. https://doi.org/10.1006/jmva.1994.1023.
4. Lam, K, Xue, H. A semiparametric regression cure model with current status data. Biometrika 2005;92:573–86. https://doi.org/10.1093/biomet/92.3.573.
5. Hu, T, Xiang, L. Efficient estimation for semiparametric cure models with interval-censored data. J Multivariate Anal 2013;121:139–51. https://doi.org/10.1016/j.jmva.2013.06.006.
6. Shao, F, Li, J, Ma, S, Lee, MLT. Semiparametric varying-coefficient model for interval censored data with a cured proportion. Stat Med 2014;33:1700–12. https://doi.org/10.1002/sim.6054.
7. Zhou, J, Zhang, J, Lu, W. Computationally efficient estimation for the generalized odds rate mixture cure model with interval-censored data. J Comput Graph Stat 2018;27:48–58. https://doi.org/10.1080/10618600.2017.1349665.
8. Mazucheli, J, Coelho-Barros, EA, Achcar, JA. The exponentiated exponential mixture and non-mixture cure rate model in the presence of covariates. Comput Methods Progr Biomed 2013;112:114–24. https://doi.org/10.1016/j.cmpb.2013.06.015.
9. Chen, CM, Shen, PS, Wei, JCC, Lin, L. A semiparametric mixture cure survival model for left-truncated and right-censored data. Biom J 2017;59:270–90. https://doi.org/10.1002/bimj.201500267.
10. Usman, U, Shamsuddeen, S, Arkilla, BM, Yakubu, A. Mixture cure model for right censored survival data with Weibull exponentiated exponential distribution. Pak J Statistics 2022;38.
11. Fang, H, Li, G, Sun, J. Maximum likelihood estimation in a semiparametric logistic/proportional-hazards mixture model. Scand J Stat 2005;32:59–75. https://doi.org/10.1111/j.1467-9469.2005.00415.x.
12. Lu, W. Efficient estimation for an accelerated failure time model with a cure fraction. Stat Sin 2010;20:661.
13. Wang, Z, Wang, X. Evaluating the time-dependent predictive accuracy for event-to-time outcome with a cure fraction. Pharm Stat 2020;19:955–74. https://doi.org/10.1002/pst.2048.
14. Chen, CM, Shen, P, Huang, WL. Semiparametric transformation models for interval-censored data in the presence of a cure fraction. Biom J 2019;61:203–15. https://doi.org/10.1002/bimj.201700304.
15. Liu, X, Xiang, L. Generalized accelerated hazards mixture cure models with interval-censored data. Comput Stat Data Anal 2021;161:107248. https://doi.org/10.1016/j.csda.2021.107248.
16. Wang, X, Wang, Z. EM algorithm for the additive risk mixture cure model with interval-censored data. Lifetime Data Anal 2021;27:91–130. https://doi.org/10.1007/s10985-020-09507-z.
17. Pal, S, Peng, Y, Aselisewine, W. A new approach to modeling the cure rate in the presence of interval censored data. Comput Stat 2024;39:2743–69. https://doi.org/10.1007/s00180-023-01389-7.
18. Zhou, J, Zhang, J, McLain, AC, Cai, B. A multiple imputation approach for semiparametric cure model with interval censored data. Comput Stat Data Anal 2016;99:105–14. https://doi.org/10.1016/j.csda.2016.01.013.
19. Diao, G, Yuan, A. A class of semiparametric cure models with current status data. Lifetime Data Anal 2019;25:26–51. https://doi.org/10.1007/s10985-018-9420-0.
20. Chen, CM, Lu, TFC, Chen, MH, Hsu, CM. Semiparametric transformation models for current status data with informative censoring. Biom J 2012;54:641–56. https://doi.org/10.1002/bimj.201100131.
21. Wang, P, Zhao, H, Du, M, Sun, J. Inference on semiparametric transformation model with general interval-censored failure time data. J Nonparametric Statistics 2018;30:758–73. https://doi.org/10.1080/10485252.2018.1478091.
22. Wang, S, Wang, C, Wang, P, Sun, J. Estimation of the additive hazards model with case k interval-censored failure time data in the presence of informative censoring. Comput Stat Data Anal 2020;144:106891. https://doi.org/10.1016/j.csda.2019.106891.
23. Zhao, B, Wang, S, Wang, C, Sun, J. New methods for the additive hazards model with the informatively interval-censored failure time data. Biom J 2021;63:1507–25. https://doi.org/10.1002/bimj.202000288.
24. Zhao, B, Wang, S, Wang, C. A new frailty-based GEE approach of the informatively case k interval-censored failure time data. Commun Stat Theor Methods 2024;53:6527–43. https://doi.org/10.1080/03610926.2023.2247505.
25. Ma, L, Hu, T, Sun, J. Sieve maximum likelihood regression analysis of dependent current status data. Biometrika 2015;102:731–8. https://doi.org/10.1093/biomet/asv020.
26. Zhao, S, Hu, T, Ma, L, Wang, P, Sun, J. Regression analysis of informative current status data with the additive hazards model. Lifetime Data Anal 2015;21:241–58. https://doi.org/10.1007/s10985-014-9303-y.
27. Xu, D, Zhao, S, Hu, T, Yu, M, Sun, J. Regression analysis of informative current status data with the semiparametric linear transformation model. J Appl Stat 2019;46:187–202. https://doi.org/10.1080/02664763.2018.1466870.
28. Zhao, S, Dong, L, Sun, J. Regression analysis of interval-censored data with informative observation times under the accelerated failure time model. J Syst Sci Complex 2022;35:1520–34. https://doi.org/10.1007/s11424-021-0209-y.
29. Ding, Y, Sun, T. Copula models and diagnostics for multivariate interval-censored data. In: Emerging topics in modeling interval-censored survival data. New York: Springer International Publishing; 2022:141–65 pp. https://doi.org/10.1007/978-3-031-12366-5_8.
30. Liu, Y, Hu, T, Sun, J. Regression analysis of current status data in the presence of a cured subgroup and dependent censoring. Lifetime Data Anal 2017;23:626–50. https://doi.org/10.1007/s10985-016-9382-z.
31. Wang, S, Wang, C, Sun, J. An additive hazards cure model with informative interval censoring. Lifetime Data Anal 2021;27:244–68. https://doi.org/10.1007/s10985-021-09515-7.
32. Wang, S, Xu, D, Wang, C, Sun, J. Estimation of linear transformation cure models with informatively interval-censored failure time data. J Nonparametric Statistics 2023;35:283–301. https://doi.org/10.1080/10485252.2022.2148667.
33. Deresa, NW, Van Keilegom, I. Copula based Cox proportional hazards models for dependent censoring. J Am Stat Assoc 2024;119:1044–54. https://doi.org/10.1080/01621459.2022.2161387.
34. Delhelle, M, Van Keilegom, I. Copula based dependent censoring in cure models. Test 2025:1–22. https://doi.org/10.1007/s11749-024-00961-7.
35. Nelsen, RB. An introduction to copulas. New York: Springer; 2006.
36. Huang, J, Rossini, A. Sieve estimation for the proportional-odds failure-time regression model with interval censoring. J Am Stat Assoc 1997;92:960–7. https://doi.org/10.1080/01621459.1997.10474050.
37. Efron, B, Tibshirani, RJ. An introduction to the bootstrap. Boca Raton: Chapman and Hall/CRC; 1994. https://doi.org/10.1201/9780429246593.
38. Pintilie, M. Competing risks: a practical perspective. Chichester: John Wiley & Sons; 2006. https://doi.org/10.1002/9780470870709.
39. Deresa, NW, Van Keilegom, I. Semiparametric transformation models for survival data with dependent censoring. Ann Inst Stat Math 2025;77:425–57. https://doi.org/10.1007/s10463-024-00921-w.
40. Van Der Vaart, AW, Wellner, JA. Weak convergence. New York: Springer; 1996. https://doi.org/10.1007/978-1-4757-2545-2_3.
41. Pollard, D. Convergence of stochastic processes. New York: Springer Science & Business Media; 2012.
42. Lorentz, GG. Bernstein polynomials. New York: American Mathematical Soc; 1986.
43. Shen, X, Wong, WH. Convergence rate of sieve estimates. Ann Stat 1994;22:580–615. https://doi.org/10.1214/aos/1176325486.
44. Chen, X, Fan, Y, Tsyrennikov, V. Efficient estimation of semiparametric multivariate copula models. J Am Stat Assoc 2006;101:1228–40. https://doi.org/10.1198/016214506000000311.
45. Shen, X. On methods of sieves and penalization. Ann Stat 1997;25:2555–91. https://doi.org/10.1214/aos/1030741085.
© 2025 Walter de Gruyter GmbH, Berlin/Boston