Copula-based Cox models for dependent current status data with a cure fraction

Shuying Wang; Danping Zhou; Yunfei Yang; Bo Zhao

doi:10.1515/ijb-2025-0038


Published/Copyright: October 9, 2025


From the journal The International Journal of Biostatistics, Volume 21, Issue 2

Abstract

Traditional survival analysis typically assumes that all subjects will eventually experience the event of interest given a sufficiently long follow-up period. Nevertheless, due to advancements in medical technology, researchers now frequently observe that some subjects never experience the event and are considered cured. Furthermore, traditional survival analysis assumes independence between failure time and censoring time. However, practical applications often reveal dependence between them. Ignoring both the cured subgroup and this dependence structure can introduce bias in model estimates. Among the methods for handling dependent censoring data, the numerical integration process of frailty models is complex and sensitive to the assumptions about the latent variable distribution. In contrast, the copula method, by flexibly modeling the dependence between variables, avoids strong assumptions about the latent variable structure, offering greater robustness and computational feasibility. Therefore, this paper proposes a copula-based method to handle dependent current status data involving a cure fraction. In the modeling process, we establish a logistic model to describe the susceptible rate and a Cox proportional hazards model to describe the failure time and censoring time. In the estimation process, we employ a sieve maximum likelihood estimation method based on Bernstein polynomials for parameter estimation. Extensive simulation experiments show that the proposed method demonstrates consistency and asymptotic efficiency under various settings. Finally, this paper applies the method to lymph follicle cell data, verifying its effectiveness in practical data analysis.

Keywords: proportional hazards mixture cure model; copula model; current status data; dependent censoring

Corresponding author: Danping Zhou, School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China, E-mail: 13227940662@163.com

Funding source: Youth Postdoctoral Program of Changbai Talent Program in Jilin Province

Award Identifier / Grant number: 202442110

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (No. 12271060) and Youth Postdoctoral Program of Changbai Talent Program in Jilin Province (No. 202442110).

Research ethics: Not applicable.
Informed consent: Informed consent was obtained from all individuals included in this study, or their legal guardians or wards.
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Use of Large Language Models, AI and Machine Learning Tools: None declared.
Conflict of interest: The authors state no conflict of interest.
Research funding: None declared.
Data availability: Not applicable.

Appendix

To prove Theorems 1, 2, and 3, we first introduce some notation together with Lemma 1 and Lemma 2. Let $O = (O_1, \ldots, O_n)$ denote the observed sample, and let $l(\theta, O)$ be the log-likelihood function based on a single observation $O$, as given in Section 2. We begin by defining the covering number of the class $\mathcal{L}_n = \{ l(\theta, O) : \theta \in \Theta_n \}$.

For any $\epsilon > 0$, the covering number $N(\epsilon, \mathcal{L}_n, L_1(P_n))$ is the smallest positive integer $\kappa$ for which there exist $\theta^{(j)} = (\beta^{(j)\top}, \gamma^{(j)\top}, \eta^{(j)\top}, \Lambda_T^{(j)}, \Lambda_C^{(j)}) \in \Theta_n$, $j = 1, \ldots, \kappa$, such that for every $\theta \in \Theta_n$ the following condition holds:

$$\min_{j \in \{1, \ldots, \kappa\}} \frac{1}{n} \sum_{i=1}^{n} \left| l(\theta, O_i) - l(\theta^{(j)}, O_i) \right| < \epsilon.$$

If such a $\kappa$ does not exist, then $N(\epsilon, \mathcal{L}_n, L_1(P_n)) = \infty$.

Lemma 1 (Calculation of Covering Number) Suppose conditions (A1), (A2), and (A4) hold. Then, the covering number of the class $\mathcal{L}_n = \{ l(\theta, O) : \theta \in \Theta_n \}$ satisfies

$$N(\epsilon, \mathcal{L}_n, L_1(P_n)) \le K M^{p+2q} M_n^{2(m+1)} \epsilon^{-(p+2(q+m+1))},$$

where $m = o(n^{\nu})$, $0 < \nu < 1$, is the degree of the Bernstein polynomials, and $M_n = O(n^{c})$, $c > 0$, denotes the size of the sieve space $\Theta_n$.

Proof of Lemma 1:

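Define two parameters $\theta^{(1)} = (\beta^{(1)\top}, \gamma^{(1)\top}, \eta^{(1)\top}, \Lambda_T^{(1)}, \Lambda_C^{(1)})$ and $\theta^{(2)} = (\beta^{(2)\top}, \gamma^{(2)\top}, \eta^{(2)\top}, \Lambda_T^{(2)}, \Lambda_C^{(2)})$, where $\theta^{(1)}, \theta^{(2)} \in \Theta$. Under the regularity conditions (A1), (A2), and (A4), by applying a Taylor expansion, we obtain

$$\left| l(\theta^{(1)}, O) - l(\theta^{(2)}, O) \right| \le K \left\{ \|\beta^{(1)} - \beta^{(2)}\| + \|\gamma^{(1)} - \gamma^{(2)}\| + \|\eta^{(1)} - \eta^{(2)}\| + \|\Lambda_T^{(1)} - \Lambda_T^{(2)}\|_\infty + \|\Lambda_C^{(1)} - \Lambda_C^{(2)}\|_\infty \right\}.$$

Additionally, based on the calculation in [40] (p. 94), we can derive

$$
\begin{aligned}
N(\epsilon, \mathcal{L}_n, L_1(P_n)) &\le N\!\left(\frac{\epsilon}{5M}, \mathcal{B}, \|\cdot\|\right) \cdot N\!\left(\frac{\epsilon}{5M_n}, \mathcal{M}_n^{1}, L_\infty\right) \cdot N\!\left(\frac{\epsilon}{5M_n}, \mathcal{M}_n^{2}, L_\infty\right) \\
&\le \left(\frac{15M}{\epsilon}\right)^{p+2q} \cdot \left(\frac{15M_n}{\epsilon}\right)^{m+1} \cdot \left(\frac{15M_n}{\epsilon}\right)^{m+1} \\
&\le K M^{p+2q} M_n^{2(m+1)} \epsilon^{-(p+2(q+m+1))}.
\end{aligned}
$$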

The proof of Lemma 1 is complete.
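The exponent $m + 1$ in Lemma 1 is the number of Bernstein coefficients used for each cumulative hazard. The following minimal Python sketch illustrates such an approximation. The sieve's exact parametrization and constraints are defined in the main text and are not reproduced in this appendix, so the interval endpoints `lo` and `hi`, the function names, and the convention of nondecreasing nonnegative coefficients are assumptions made only for illustration.

```python
from math import comb

def bernstein_basis(t, m, lo, hi):
    """Values of the m + 1 Bernstein basis polynomials of degree m at t,
    after rescaling the interval [lo, hi] to [0, 1]."""
    x = (t - lo) / (hi - lo)
    return [comb(m, k) * x**k * (1 - x)**(m - k) for k in range(m + 1)]

def bernstein_cumhaz(t, coeffs, lo, hi):
    """Bernstein-polynomial approximation of a cumulative hazard at t.
    Assumed convention: coeffs are nonnegative and nondecreasing, which
    makes the resulting function nonnegative and nondecreasing on [lo, hi]."""
    m = len(coeffs) - 1
    basis = bernstein_basis(t, m, lo, hi)
    return sum(c * b for c, b in zip(coeffs, basis))

# Illustrative (hypothetical) coefficients for degree m = 3 on [0, 5].
coeffs = [0.0, 0.5, 1.2, 2.0]
grid = [0.5 * i for i in range(11)]
values = [bernstein_cumhaz(t, coeffs, 0.0, 5.0) for t in grid]
```

With this parametrization each cumulative hazard is described by $m + 1$ real numbers, which is why the two hazards together contribute the factor $M_n^{2(m+1)}$ to the covering-number bound.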

Lemma 2 (Uniform Convergence) Suppose conditions (A1), (A2), and (A4) hold. Then, we have

$$\sup_{\theta \in \Theta_n} \left| P_n l(\theta, O) - P l(\theta, O) \right| \to 0$$

almost surely, where $P_n l(\theta, O) = \frac{1}{n} \sum_{i=1}^{n} l(\theta, O_i)$ and $P l(\theta, O) = \int l(\theta, O)\, dP(O)$.

Proof of Lemma 2:

We note that when the regularity conditions (A1), (A2), and (A4) are satisfied, $|l(\theta, O)|$ is bounded. Therefore, without loss of generality, we assume $\sup_{\theta \in \Theta} |l(\theta, O)| \le 1$, so that $P l^2(\theta, O) \le P \sup_{\theta \in \Theta} |l(\theta, O)|^2 \le 1$. Let $a_n = n^{-1/2 + \phi_1} (\log n)^{1/2}$, where $\nu/2 < \phi_1 < 1/2$, so that $\{a_n\}$ is a non-increasing positive sequence. For any given $\epsilon > 0$, let $\epsilon_n = \epsilon a_n$. Then, for any $\theta \in \Theta_n$ and sufficiently large $n$, we obtain

$$\frac{\operatorname{var}(P_n l(\theta, O))}{(4\epsilon_n)^2} \le \frac{(1/n) P l^2(\theta, O)}{16 \epsilon^2 a_n^2} \le \frac{1}{16 \epsilon^2 n a_n^2} = \frac{1}{16 \epsilon^2 n^{2\phi_1} \log n} \le \frac{1}{2}. \tag{B1}$$

Let $P_n^{\circ}$ denote the signed measure that places mass $\pm n^{-1}$ at each of the observations $O_1, O_2, \ldots, O_n$, with the random signs chosen independently of the $O_i$.

Therefore, based on inequality (B1) and Pollard [41], we obtain the following symmetrization inequality:

$$P\left\{ \sup_{\theta \in \Theta_n} \left| P_n l(\theta, O) - P l(\theta, O) \right| > 8\epsilon_n \right\} \le 4 P\left\{ \sup_{\theta \in \Theta_n} \left| P_n^{\circ} l(\theta, O) \right| > 2\epsilon_n \right\}.$$

Given the sample observation data $H = \{O_1, O_2, \ldots, O_n\}$, for any $\theta \in \Theta_n$, choose $\theta^{(1)}, \ldots, \theta^{(\kappa)}$, where $\kappa = N(\epsilon, \mathcal{L}_n, L_1(P_n))$. Let $\theta^*$ denote the element of $\{\theta^{(1)}, \ldots, \theta^{(\kappa)}\}$ that best approximates $l(\theta, O)$. Then we have

$$P_n^{\circ}\left( l(\theta, O) - l(\theta^*, O) \right) = n^{-1} \sum_{i=1}^{n} \pm \left( l(\theta, O_i) - l(\theta^*, O_i) \right) \le n^{-1} \sum_{i=1}^{n} \left| l(\theta, O_i) - l(\theta^*, O_i) \right| = P_n \left| l(\theta, O) - l(\theta^*, O) \right|.$$

Thus, we have.

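According to the definition of the covering number $N(\epsilon_n/2, \mathcal{L}_n, L_1(P_n))$, for each $\theta^{(j)}$ there exists $\tilde{\theta}^{(j)} \in \Theta_n$ such that $P_n \left| l(\tilde{\theta}^{(j)}, O) - l(\theta^{(j)}, O) \right| < \epsilon_n/2$. Therefore, we obtain the following inequality:

$$
\begin{aligned}
P\left( \left| P_n^{\circ} l(\theta^{(j)}, O) \right| > \frac{3\epsilon_n}{2} \,\Big|\, H \right) &\le P\left( P_n \left| l(\theta^{(j)}, O) - l(\tilde{\theta}^{(j)}, O) \right| + \left| P_n^{\circ} l(\tilde{\theta}^{(j)}, O) \right| > \frac{3\epsilon_n}{2} \,\Big|\, H \right) \\
&\le P\left( \left| P_n^{\circ} l(\tilde{\theta}^{(j)}, O) \right| > \epsilon_n \,\Big|\, H \right).
\end{aligned} \tag{B3}
$$

Therefore, based on Hoeffding's inequality [41], we obtain:

$$
\begin{aligned}
P\left( \left| P_n^{\circ} l(\tilde{\theta}^{(j)}, O) \right| > \epsilon_n \,\Big|\, H \right) &= P\left( \left| \sum_{i=1}^{n} \pm l(\tilde{\theta}^{(j)}, O_i) \right| > n\epsilon_n \,\Big|\, H \right) \\
&\le 2 \exp\left\{ -2 (n\epsilon_n)^2 \Big/ \sum_{i=1}^{n} \left( 2 l(\tilde{\theta}^{(j)}, O_i) \right)^2 \right\} \\
&\le 2 \exp\left( -n\epsilon_n^2/2 \right) \quad \left(\text{because } |l(\tilde{\theta}^{(j)}, O)| \le 1\right).
\end{aligned} \tag{B4}
$$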
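The Hoeffding step in (B4) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the fixed values standing in for $l(\tilde{\theta}^{(j)}, O_i)$, and the simulation settings are hypothetical and not part of the proof. It attaches independent random signs to fixed values bounded by 1 in absolute value and compares the simulated tail frequency with the bound $2\exp(-n\epsilon_n^2/2)$.

```python
import math
import random

def hoeffding_bound(n, eps):
    """Right-hand side of (B4): 2 * exp(-n * eps^2 / 2)."""
    return 2.0 * math.exp(-n * eps**2 / 2.0)

def signed_sum_tail_frequency(values, eps, reps=2000, seed=0):
    """Monte Carlo estimate of P(|sum_i (+/-) values[i]| > n * eps),
    where the signs are independent and equally likely to be + or -.
    The values are held fixed, mimicking conditioning on the data H."""
    rng = random.Random(seed)
    n = len(values)
    hits = 0
    for _ in range(reps):
        s = sum(v if rng.random() < 0.5 else -v for v in values)
        if abs(s) > n * eps:
            hits += 1
    return hits / reps

# Hypothetical worst case |l| = 1 for every observation, with n = 200.
values = [1.0] * 200
freq = signed_sum_tail_frequency(values, eps=0.2)
bound = hoeffding_bound(len(values), 0.2)
print(freq, bound)
```

Because the bound holds for any fixed values with $|l| \le 1$, it does not depend on the data, which is the property used below when the conditioning on $H$ is removed.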

Based on inequalities (B2), (B3), and (B4), we obtain the following inequality:

$$
\begin{aligned}
P\left( \sup_{\theta \in \Theta_n} \left| P_n^{\circ} l(\theta, O) \right| > 2\epsilon_n \,\Big|\, H \right) &\le 2 N(\epsilon_n/2, \mathcal{L}_n, L_1(P_n)) \exp\left( -n\epsilon_n^2/2 \right) \\
&\le K M^{p+2q} M_n^{2(m+1)} (\epsilon_n/2)^{-(p+2(q+m+1))} \exp\left( -n\epsilon_n^2/2 \right).
\end{aligned} \tag{B5}
$$

The right-hand side of inequality (B5) does not depend on the observation data $H$. Therefore, by taking the expectation over $H = \{O_1, O_2, \ldots, O_n\}$, we obtain

$$P\left( \sup_{\theta \in \Theta_n} \left| P_n^{\circ} l(\theta, O) \right| > 2\epsilon_n \right) \le K M^{p+2q} M_n^{2(m+1)} (\epsilon_n/2)^{-(p+2(q+m+1))} \exp\left( -n\epsilon_n^2/2 \right). \tag{B6}$$

Combining inequality (B6) with the symmetrization inequality that follows (B1), we get:

$$
\begin{aligned}
P\left\{ \sup_{\theta \in \Theta_n} \left| P_n l(\theta, O) - P l(\theta, O) \right| > 8\epsilon_n \right\} &\le 4 P\left\{ \sup_{\theta \in \Theta_n} \left| P_n^{\circ} l(\theta, O) \right| > 2\epsilon_n \right\} \\
&\le K M^{p+2q} M_n^{2(m+1)} (\epsilon_n/2)^{-(p+2(q+m+1))} \exp\left( -n\epsilon_n^2/2 \right) \\
&\le K \exp\left( -K n^{2\phi_1} \log n \right).
\end{aligned} \tag{B7}
$$

Therefore, we conclude that $\sum_{n=1}^{\infty} P\left\{ \sup_{\theta \in \Theta_n} \left| P_n l(\theta, O) - P l(\theta, O) \right| > 8\epsilon_n \right\} < \infty$. By the Borel-Cantelli lemma, we have $\sup_{\theta \in \Theta_n} \left| P_n l(\theta, O) - P l(\theta, O) \right| \to 0$ almost surely. The proof of Lemma 2 is complete.
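The final bound in (B7) rests on the exponential factor $\exp(-n\epsilon_n^2/2)$ dominating the covering-number factor: the logarithm of the latter grows like $m \log n$ with $m = o(n^{\nu})$, while $n\epsilon_n^2 = \epsilon^2 n^{2\phi_1} \log n$ and $\nu < 2\phi_1$. The following rough numerical sketch compares the two quantities. All concrete values ($K = 1$, $M = 1$, $c = 1$, $p = q = 2$, $\phi_1 = 0.4$, $m = \lfloor n^{0.2} \rfloor$) and the function names are illustrative assumptions, not quantities taken from the paper.

```python
import math

def epsilon_n(n, eps, phi1):
    """eps_n = eps * a_n with a_n = n^(-1/2 + phi1) * (log n)^(1/2)."""
    return eps * n ** (-0.5 + phi1) * math.sqrt(math.log(n))

def log_covering_factor(n, eps_n, p, q, m, M=1.0, c=1.0):
    """Log of M^(p+2q) * M_n^(2(m+1)) * (eps_n/2)^(-(p+2(q+m+1))),
    taking M_n = n^c and the generic constant K = 1 (assumptions)."""
    M_n = n ** c
    return ((p + 2 * q) * math.log(M)
            + 2 * (m + 1) * math.log(M_n)
            + (p + 2 * (q + m + 1)) * math.log(2.0 / eps_n))

def exponent(n, eps_n):
    """The quantity n * eps_n^2 / 2 appearing inside exp(-...) in (B5)-(B7)."""
    return n * eps_n ** 2 / 2.0

# Compare the two terms at a few sample sizes.
gaps = []
for n in (10**4, 10**5, 10**6):
    e = epsilon_n(n, eps=1.0, phi1=0.4)
    m = int(n ** 0.2)  # Bernstein degree, growing slower than n^(2 * phi1)
    gaps.append(exponent(n, e) - log_covering_factor(n, e, p=2, q=2, m=m))
print(gaps)
```

A positive and growing gap means the product in (B7) decays rapidly enough for the series in the Borel-Cantelli argument to converge.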

Proof of Theorem 1:

Based on the conclusions of Lemma 1 and Lemma 2, the covering number of the class $\mathcal{L}_n = \{ l(\theta, O) : \theta \in \Theta_n \}$ satisfies $N(\epsilon, \mathcal{L}_n, L_1(P_n)) \le K M^{p+2q} M_n^{2(m+1)} \epsilon^{-(p+2(q+m+1))}$, and

$$\sup_{\theta \in \Theta_n} \left| P_n l(\theta, O) - P l(\theta, O) \right| \to 0 \quad \text{a.s.}$$

For a given $\epsilon > 0$, we define the following:

$$
\begin{aligned}
K_\epsilon &= \{ \theta : d(\theta, \theta_0) > \epsilon, \ \theta \in \Theta_n \}, \qquad M(\theta, O) = -l(\theta, O), \\
\xi_{1n} &= \sup_{\theta \in \Theta_n} \left| P_n M(\theta, O) - P M(\theta, O) \right|, \qquad \xi_{2n} = P_n M(\theta_0, O) - P M(\theta_0, O).
\end{aligned}
$$

Therefore, we can prove that

$$\inf_{K_\epsilon} P M(\theta, O) = \inf_{K_\epsilon} \left\{ P M(\theta, O) - P_n M(\theta, O) + P_n M(\theta, O) \right\} \le \xi_{1n} + \inf_{K_\epsilon} P_n M(\theta, O). \tag{B8}$$

If the estimator $\check{\theta}_n \in K_\epsilon$, we obtain

$$\inf_{K_\epsilon} P_n M(\theta, O) = P_n M(\check{\theta}_n, O) \le P_n M(\theta_0, O) = \xi_{2n} + P M(\theta_0, O). \tag{B9}$$

Combining inequalities (B8) and (B9), we get $\inf_{K_\epsilon} P M(\theta, O) \le \xi_n + P M(\theta_0, O)$, where $\xi_n = \xi_{1n} + \xi_{2n}$. According to condition (A3), we have

$$\inf_{K_\epsilon} P M(\theta; O) - P M(\theta_0; O) = \delta_\epsilon > 0. \tag{B10}$$

Therefore, we have $\xi_n \ge \delta_\epsilon$, which implies that $\{ \check{\theta}_n \in K_\epsilon \} \subseteq \{ \xi_n \ge \delta_\epsilon \}$. According to Lemma 2 and the law of large numbers, we have $\xi_{1n} \to 0$ and $\xi_{2n} \to 0$ almost surely. Hence, $\bigcup_{k=1}^{\infty} \bigcap_{n=k}^{\infty} \{ \check{\theta}_n \in K_\epsilon \} \subseteq \bigcup_{k=1}^{\infty} \bigcap_{n=k}^{\infty} \{ \xi_n \ge \delta_\epsilon \}$, and the event on the right-hand side has probability zero. This establishes that $d(\check{\theta}_n, \theta_0) \to 0$ almost surely. The proof of Theorem 1 is complete.

Proof of Theorem 2:

Next, we use Theorem 3.4.1 of [40] to derive the convergence rate of $\hat{\theta}_n$. First, by Theorem 1.6.2 of [42], there exist Bernstein polynomials $\Lambda_{Tn0}$ and $\Lambda_{Cn0}$ such that $\|\Lambda_{T0} - \Lambda_{Tn0}\|_\infty = O(n^{-r\nu/2})$ and $\|\Lambda_{C0} - \Lambda_{Cn0}\|_\infty = O(n^{-r\nu/2})$. For any $\rho > 0$, define the class $\mathcal{F}_\rho = \{ l(\theta, O) - l(\theta_{n0}, O) : \theta \in \Theta_n, \ d(\theta, \theta_{n0}) \le \rho \}$, where $\theta_{n0} = (\beta_{n0}, \gamma_{n0}, \eta_{n0}, \Lambda_{Tn0}, \Lambda_{Cn0})$. Following the calculation in [43] (p. 597), we get $\log N_{[\,]}(\epsilon, \mathcal{F}_\rho, \|\cdot\|) \le K N \log(\rho/\epsilon)$ with $N = 2(m+1)$, where $N_{[\,]}(\epsilon, \mathcal{F}_\rho, \|\cdot\|)$ denotes the bracketing number of the function class $\mathcal{F}_\rho$ with respect to $\|\cdot\|$. Furthermore, for any $l(\theta, O) - l(\theta_{n0}, O) \in \mathcal{F}_\rho$, we have $\| l(\theta, O) - l(\theta_{n0}, O) \|_2^2 \le K \rho^2$. Thus, by Lemma 3.4.2 of [40], we obtain

$$
\begin{aligned}
E_P \left\| n^{1/2} (P_n - P) \right\|_{\mathcal{F}_\rho} &\le K H_\rho(\epsilon, \mathcal{F}_\rho, \|\cdot\|_2) \left\{ 1 + \frac{H_\rho(\epsilon, \mathcal{F}_\rho, \|\cdot\|_2)}{\rho^2 n^{1/2}} \right\} \\
&\le K \cdot K N^{1/2} \rho \left\{ 1 + \frac{K N^{1/2} \rho}{\rho^2 n^{1/2}} \right\} = K \left( N^{1/2} \rho + N / n^{1/2} \right) := X_n(\rho),
\end{aligned} \tag{B11}
$$

where $H_\rho(\epsilon, \mathcal{F}_\rho, \|\cdot\|_2) = \int_0^{\rho} \left\{ 1 + \log N_{[\,]}(\epsilon, \mathcal{F}_\rho, \|\cdot\|_2) \right\}^{1/2} d\epsilon$. We can see that $X_n(\rho)/\rho$ is a decreasing function of $\rho$ and satisfies $r_n^2 X_n(1/r_n) = r_n N^{1/2} + r_n^2 N / n^{1/2} \le 2 n^{1/2}$, where $r_n = N^{-1/2} n^{1/2} = n^{(1-\nu)/2}$, $0 < \nu < 0.5$. By Theorem 3.2.5 of [40], we obtain $n^{(1-\nu)/2} d(\hat{\theta}, \theta_{n0}) = O_p(1)$. Also, by Theorem 1.6.2 of [42], we get $d(\theta_{n0}, \theta_0) = O_p(n^{-r\nu/2})$. Finally, we obtain the convergence rate $d(\hat{\theta}, \theta_0) = O_p\left( n^{-(1-\nu)/2} + n^{-r\nu/2} \right)$, which implies that $\|\hat{\Lambda}_{Tn} - \Lambda_{T0}\|_2 + \|\hat{\Lambda}_{Cn} - \Lambda_{C0}\|_2 = O_p\left( n^{-(1-\nu)/2} + n^{-r\nu/2} \right)$.
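The two terms in the rate move in opposite directions as $\nu$ varies: a larger $\nu$ (more Bernstein terms) reduces the approximation error $n^{-r\nu/2}$ but inflates the estimation error $n^{-(1-\nu)/2}$. The short sketch below, whose function names are hypothetical and which assumes only the rate $n^{-(1-\nu)/2} + n^{-r\nu/2}$, computes the exponent of the slower term and the value of $\nu$ at which the two exponents coincide.

```python
import math

def rate_exponent(nu, r):
    """Exponent a such that n^(-(1 - nu)/2) + n^(-r * nu / 2) is of order n^(-a);
    the slower (smaller-exponent) term determines the overall rate."""
    return min((1.0 - nu) / 2.0, r * nu / 2.0)

def balancing_nu(r):
    """Value of nu solving (1 - nu)/2 = r * nu / 2, i.e. nu = 1 / (1 + r)."""
    return 1.0 / (1.0 + r)

# Example with smoothness index r = 3 (an illustrative choice).
r = 3.0
nu_star = balancing_nu(r)
best = rate_exponent(nu_star, r)
print(nu_star, best)
```

At the balancing value the exponent equals $r / (2(1 + r))$. Since the proof restricts $0 < \nu < 0.5$, this balance can only be reached when $r > 1$.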

Proof of Theorem 3:

Define $V$ as the linear span of $\Theta - \theta_0$, where $\theta_0 = (\beta_0^\top, \gamma_0^\top, \eta_0^\top, \Lambda_{T0}, \Lambda_{C0})$ is the true value of $\theta$, and let $\Theta_0$ represent the true parameter space. Let $l(\theta; O)$ denote the log-likelihood function, and define $\delta_n = n^{-r\nu/2} + n^{-(1-\nu)/2}$. For any $\theta \in \{ \theta \in \Theta_0 : d(\theta, \theta_0) = O(\delta_n) \}$, define the first and second directional derivatives of $l(\theta; O)$ in the directions $v, \tilde{v} \in V$ as follows:

$$\dot{l}(\theta; O)[v] = \left. \frac{d\, l(\theta + s v; O)}{d s} \right|_{s=0}, \tag{B12}$$

$$\ddot{l}(\theta; O)[v, \tilde{v}] = \left. \frac{d^2 l(\theta + s v + \tilde{s} \tilde{v}; O)}{d s \, d \tilde{s}} \right|_{s=0,\, \tilde{s}=0} = \left. \frac{d\, \dot{l}(\theta + \tilde{s} \tilde{v}; O)[v]}{d \tilde{s}} \right|_{\tilde{s}=0}. \tag{B13}$$

We define the Fisher inner product on the space $V$ as $\langle v, \tilde{v} \rangle = P\left\{ \dot{l}(\theta; O)[v] \, \dot{l}(\theta; O)[\tilde{v}] \right\}$, and for $v \in V$, the Fisher norm is given by $\|v\|^2 = \langle v, v \rangle$. Let $\bar{V}$ denote the closed linear span of the space $V$ under the Fisher norm; it can be shown that $(\bar{V}, \|\cdot\|)$ is a Hilbert space.

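Next, we define the linear functional of $\theta$ as $\gamma(\theta) = h_1^\top \beta + h_2^\top \gamma + h_3^\top \eta$, where $h = (h_1^\top, h_2^\top, h_3^\top)^\top$ is an arbitrary vector of dimension $p + 2q$ with $\|h\| \le 1$. Additionally, let $\vartheta = (\beta^\top, \gamma^\top, \eta^\top)^\top$, and for any $v = (v_\vartheta, \varphi_T, \varphi_C) \in V$, we define

$$\dot{\gamma}(\theta_0)[v] = \left. \frac{d\, \gamma(\theta_0 + s v)}{d s} \right|_{s=0} = h^\top v_\vartheta.$$

We can easily derive that $\gamma(\theta) - \gamma(\theta_0) = \dot{\gamma}(\theta_0)[\theta - \theta_0]$. Also, by the Riesz representation theorem, there exists $v^* \in \bar{V}$ such that $\dot{\gamma}(\theta_0)[v] = \langle v^*, v \rangle$ for all $v \in \bar{V}$, and $\|v^*\|^2 = \|\dot{\gamma}(\theta_0)\|^2$. Since $h^\top \left( (\hat{\beta} - \beta_0)^\top, (\hat{\gamma} - \gamma_0)^\top, (\hat{\eta} - \eta_0)^\top \right)^\top = \gamma(\hat{\theta}_n) - \gamma(\theta_0) = \dot{\gamma}(\theta_0)[\hat{\theta}_n - \theta_0] = \langle \hat{\theta}_n - \theta_0, v^* \rangle$, to prove Theorem 3 by the Cramér-Wold theorem, we need to show

$$\sqrt{n} \, \langle \hat{\theta}_n - \theta_0, v^* \rangle \to N(0, h^\top \Sigma h)$$

in distribution. First, we need to prove that $\sqrt{n} \, \langle \hat{\theta}_n - \theta_0, v^* \rangle \to N(0, \|v^*\|^2)$ in distribution, and then prove that $\|v^*\|^2 = h^\top \Sigma h$.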

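First, by condition (A4) and Theorem 1.6.2 of [42], for any $v^* \in \Theta_0$ there exists $\Pi_n v^* \in \Theta_n$ such that $\|\Pi_n v^* - v^*\| = o(1)$ and $\delta_n \|\Pi_n v^* - v^*\| = o(n^{-1/2})$. At the same time, we define $r[\theta - \theta_0, O] = l(\theta; O) - l(\theta_0; O) - \dot{l}(\theta_0; O)[\theta - \theta_0]$, and from the definition of $\hat{\theta}_n$, we obtain

$$
\begin{aligned}
0 &\le P_n \left\{ l(\hat{\theta}_n, O) - l(\hat{\theta}_n \pm \epsilon_n \Pi_n v^*, O) \right\} \\
&= (P_n - P) \left\{ l(\hat{\theta}_n, O) - l(\hat{\theta}_n \pm \epsilon_n \Pi_n v^*, O) \right\} + P \left\{ l(\hat{\theta}_n, O) - l(\hat{\theta}_n \pm \epsilon_n \Pi_n v^*, O) \right\} \\
&= \mp \epsilon_n P_n \dot{l}(\theta_0, O)[\Pi_n v^*] + (P_n - P) \left\{ r[\hat{\theta}_n - \theta_0, O] - r[\hat{\theta}_n \pm \epsilon_n \Pi_n v^* - \theta_0, O] \right\} \\
&\quad + P \left\{ r[\hat{\theta}_n - \theta_0, O] - r[\hat{\theta}_n \pm \epsilon_n \Pi_n v^* - \theta_0, O] \right\} \\
&= \mp \epsilon_n P_n \dot{l}(\theta_0, O)[v^*] \mp \epsilon_n P_n \dot{l}(\theta_0, O)[\Pi_n v^* - v^*] + (P_n - P) \left\{ r[\hat{\theta}_n - \theta_0, O] - r[\hat{\theta}_n \pm \epsilon_n \Pi_n v^* - \theta_0, O] \right\} \\
&\quad + P \left\{ r[\hat{\theta}_n - \theta_0, O] - r[\hat{\theta}_n \pm \epsilon_n \Pi_n v^* - \theta_0, O] \right\} \\
&:= \mp \epsilon_n P_n \dot{l}(\theta_0, O)[v^*] + I_1 + I_2 + I_3.
\end{aligned}
$$

For $I_1$, by condition (A1), condition (A2), Chebyshev's inequality, and $\|\Pi_n v^* - v^*\| = o(1)$, we obtain $I_1 = \epsilon_n \times o_p(n^{-1/2})$. For $I_2$, we have

$$
\begin{aligned}
I_2 &= (P_n - P) \left\{ l(\hat{\theta}_n, O) - l(\hat{\theta}_n \pm \epsilon_n \Pi_n v^*, O) \pm \epsilon_n \dot{l}(\theta_0; O)[\Pi_n v^*] \right\} \\
&= \mp \epsilon_n (P_n - P) \left\{ \dot{l}(\tilde{\theta}, O)[\Pi_n v^*] - \dot{l}(\theta_0, O)[\Pi_n v^*] \right\},
\end{aligned}
$$

where $\tilde{\theta}$ lies between $\hat{\theta}_n$ and $\hat{\theta}_n \pm \epsilon_n \Pi_n v^*$. From Theorem 2.8.4 of [40], we know that $\left\{ \dot{l}(\theta, O)[\Pi_n v^*] - \dot{l}(\theta_0, O)[\Pi_n v^*] : \|\theta - \theta_0\| = O(\delta_n) \right\}$ is a Donsker class. Therefore, by Lemma 2.3.12 of [40], we get $I_2 = \epsilon_n \times o_p(n^{-1/2})$.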

For $I_3$, we have

$$
\begin{aligned}
P(r[\theta - \theta_0, O]) &= P \left\{ l(\theta, O) - l(\theta_0, O) - \dot{l}(\theta_0, O)[\theta - \theta_0] \right\} \\
&= 2^{-1} P \left\{ \ddot{l}(\tilde{\theta}, O)[\theta - \theta_0, \theta - \theta_0] - \ddot{l}(\theta_0, O)[\theta - \theta_0, \theta - \theta_0] \right\} + 2^{-1} P \left\{ \ddot{l}(\theta_0, O)[\theta - \theta_0, \theta - \theta_0] \right\} \\
&= 2^{-1} P \left\{ \ddot{l}(\theta_0, O)[\theta - \theta_0, \theta - \theta_0] \right\} + \epsilon_n \times o_p(n^{-1/2}),
\end{aligned}
$$

where $\tilde{\theta}$ lies between $\theta_0$ and $\theta$, and the last equality follows from the Taylor expansion and conditions (A1) and (A2). Therefore,

$$
\begin{aligned}
I_3 &= -2^{-1} \left\{ \|\hat{\theta}_n - \theta_0\|^2 - \|\hat{\theta}_n \pm \epsilon_n \Pi_n v^* - \theta_0\|^2 \right\} + \epsilon_n \times o_p(n^{-1/2}) \\
&= \pm \epsilon_n \langle \hat{\theta}_n - \theta_0, \Pi_n v^* \rangle + 2^{-1} \|\epsilon_n \Pi_n v^*\|^2 + \epsilon_n \times o_p(n^{-1/2}) \\
&= \pm \epsilon_n \langle \hat{\theta}_n - \theta_0, v^* \rangle + 2^{-1} \|\epsilon_n \Pi_n v^*\|^2 + \epsilon_n \times o_p(n^{-1/2}) \\
&= \pm \epsilon_n \langle \hat{\theta}_n - \theta_0, v^* \rangle + \epsilon_n \times o_p(n^{-1/2}),
\end{aligned}
$$

where the last two equalities in the above derivation are due to the Cauchy-Schwarz inequality, along with $\delta_n \|\Pi_n v^* - v^*\| = o(n^{-1/2})$ and $\|\Pi_n v^*\|^2 \to \|v^*\|^2$. Additionally, using $P \dot{l}(\theta_0, O)[v^*] = 0$, we obtain

$$
\begin{aligned}
0 &\le P_n \left\{ l(\hat{\theta}_n, O) - l(\hat{\theta}_n \pm \epsilon_n \Pi_n v^*, O) \right\} \\
&= \mp \epsilon_n P_n \dot{l}(\theta_0, O)[v^*] \pm \epsilon_n \langle \hat{\theta}_n - \theta_0, v^* \rangle + \epsilon_n \times o_p(n^{-1/2}) \\
&= \mp \epsilon_n (P_n - P) \dot{l}(\theta_0, O)[v^*] \pm \epsilon_n \langle \hat{\theta}_n - \theta_0, v^* \rangle + \epsilon_n \times o_p(n^{-1/2}).
\end{aligned}
$$

Therefore, we obtain $\sqrt{n} \, \langle \hat{\theta}_n - \theta_0, v^* \rangle = \sqrt{n} \, (P_n - P) \dot{l}(\theta_0, O)[v^*] + o_p(1) \to N(0, \|v^*\|^2)$, where the asymptotic normality is guaranteed by the central limit theorem, with asymptotic variance $\|v^*\|^2 = \|\dot{l}(\theta_0; O)[v^*]\|^2$. This implies $n^{1/2} (\gamma(\hat{\theta}_n) - \gamma(\theta_0)) = n^{1/2} \langle \hat{\theta}_n - \theta_0, v^* \rangle + o_p(1) \to N(0, \|v^*\|^2)$.

We now proceed to prove that $\|v^*\|^2 = h^\top \Sigma h$. Consider the parameter vector $\vartheta = (\beta^\top, \gamma^\top, \eta^\top)^\top$. For each component $\vartheta_k$, $k = 1, 2, \ldots, p + 2q$, we solve the following optimization problem

$$\inf_{\varphi_k^*} E \left\{ l_\vartheta \cdot e_k - l_{\varphi_T}[\varphi_{Tk}^*] - l_{\varphi_C}[\varphi_{Ck}^*] \right\}^2, \tag{B14}$$

where $\varphi_k^* = (\varphi_{Tk}^*, \varphi_{Ck}^*)$ denotes the solution to this minimization problem. Here $l_\vartheta = (l_\beta^\top, l_\gamma^\top, l_\eta^\top)^\top$ represents the derivative of $l(\theta, O)$ with respect to $\vartheta$, and $l_{\varphi_T}[\varphi_{Tk}^*]$ and $l_{\varphi_C}[\varphi_{Ck}^*]$ are the directional derivatives with respect to $\Lambda_{T0}$ and $\Lambda_{C0}$ along the directions $\varphi_{Tk}^*$ and $\varphi_{Ck}^*$, respectively. $e_k$ is a $(p + 2q)$-dimensional basis vector with 1 in the $k$-th position and 0 elsewhere. We then define the $k$-th element of the efficient score $S_\vartheta$ as $l_\vartheta \cdot e_k - l_{\varphi_T}[\varphi_{Tk}^*] - l_{\varphi_C}[\varphi_{Ck}^*]$, $k = 1, 2, \ldots, p + 2q$.

Therefore, we can prove that

$$\sup_{v \in \bar{V} : \|v\| > 0} \frac{\left| \dot{\gamma}(\theta_0)[v] \right|^2}{\|v\|^2} = \sup_{v \in \bar{V} : \|v\| > 0} \frac{|h^\top v_\vartheta|^2}{\|v\|^2} = \sup_{v \in \bar{V} : \|v\| > 0} \frac{|h^\top v_\vartheta|^2}{P \left\{ \dot{l}(\theta, O)[v] \right\}^2}.$$

From the minimization problem defined in (B14) and following the discussion in Section 3.2 of [44], we obtain

$$\sup_{v \in \bar{V} : \|v\| > 0} \frac{\left| \dot{\gamma}(\theta_0)[v] \right|^2}{\|v\|^2} = \sup_{(\varphi_T, \varphi_C) \neq 0} \frac{|h^\top v_\vartheta|^2}{E \left( l_\vartheta[v_\vartheta] + l_{\varphi_T}[\varphi_{Tk}] + l_{\varphi_C}[\varphi_{Ck}] \right)^2} = h^\top \left\{ E(S_\vartheta S_\vartheta^\top) \right\}^{-1} h = h^\top \Sigma h.$$

Therefore, by utilizing the following equation and the Cramér-Wold theorem, we complete the proof of the first part of Theorem 3.

$$h^\top \left( (\hat{\beta} - \beta_0)^\top, (\hat{\gamma} - \gamma_0)^\top, (\hat{\eta} - \eta_0)^\top \right)^\top = \gamma(\hat{\theta}_n) - \gamma(\theta_0) = \dot{\gamma}(\theta_0)[\hat{\theta}_n - \theta_0] = \langle \hat{\theta}_n - \theta_0, v^* \rangle \tag{B15}$$

As for the semiparametric efficiency of the parameter estimators, it can be directly established by applying Theorem 4 of [45].

References

1. Sun, J. The statistical analysis of interval-censored failure time data. New York: Springer; 2006, vol 3.

2. Farewell, VT. The use of mixture models for the analysis of survival data with long-term survivors. Biometrics 1982;38:1041–6. https://doi.org/10.2307/2529885.

3. Ghitany, M, Maller, RA, Zhou, S. Exponential mixture models with long-term survivors and covariates. J Multivariate Anal 1994;49:218–41. https://doi.org/10.1006/jmva.1994.1023.

4. Lam, K, Xue, H. A semiparametric regression cure model with current status data. Biometrika 2005;92:573–86. https://doi.org/10.1093/biomet/92.3.573.

5. Hu, T, Xiang, L. Efficient estimation for semiparametric cure models with interval-censored data. J Multivariate Anal 2013;121:139–51. https://doi.org/10.1016/j.jmva.2013.06.006.

6. Shao, F, Li, J, Ma, S, Lee, MLT. Semiparametric varying-coefficient model for interval censored data with a cured proportion. Stat Med 2014;33:1700–12. https://doi.org/10.1002/sim.6054.

7. Zhou, J, Zhang, J, Lu, W. Computationally efficient estimation for the generalized odds rate mixture cure model with interval-censored data. J Comput Graph Stat 2018;27:48–58. https://doi.org/10.1080/10618600.2017.1349665.

8. Mazucheli, J, Coelho-Barros, EA, Achcar, JA. The exponentiated exponential mixture and non-mixture cure rate model in the presence of covariates. Comput Methods Progr Biomed 2013;112:114–24. https://doi.org/10.1016/j.cmpb.2013.06.015.

9. Chen, CM, Shen, PS, Wei, JCC, Lin, L. A semiparametric mixture cure survival model for left-truncated and right-censored data. Biom J 2017;59:270–90. https://doi.org/10.1002/bimj.201500267.

10. Usman, U, Shamsuddeen, S, Arkilla, BM, Yakubu, A. Mixture cure model for right censored survival data with weibull exponentiated exponential distribution. Pak J Statistics 2022;38.

11. Fang, H, Li, G, Sun, J. Maximum likelihood estimation in a semiparametric logistic/proportional-hazards mixture model. Scand J Stat 2005;32:59–75. https://doi.org/10.1111/j.1467-9469.2005.00415.x.

12. Lu, W. Efficient estimation for an accelerated failure time model with a cure fraction. Stat Sin 2010;20:661.

13. Wang, Z, Wang, X. Evaluating the time-dependent predictive accuracy for event-to-time outcome with a cure fraction. Pharm Stat 2020;19:955–74. https://doi.org/10.1002/pst.2048.

14. Chen, CM, Shen, P, Huang, WL. Semiparametric transformation models for interval-censored data in the presence of a cure fraction. Biom J 2019;61:203–15. https://doi.org/10.1002/bimj.201700304.

15. Liu, X, Xiang, L. Generalized accelerated hazards mixture cure models with interval-censored data. Comput Stat Data Anal 2021;161:107248. https://doi.org/10.1016/j.csda.2021.107248.

16. Wang, X, Wang, Z. EM algorithm for the additive risk mixture cure model with interval-censored data. Lifetime Data Anal 2021;27:91–130. https://doi.org/10.1007/s10985-020-09507-z.

17. Pal, S, Peng, Y, Aselisewine, W. A new approach to modeling the cure rate in the presence of interval censored data. Comput Stat 2024;39:2743–69. https://doi.org/10.1007/s00180-023-01389-7.

18. Zhou, J, Zhang, J, McLain, AC, Cai, B. A multiple imputation approach for semiparametric cure model with interval censored data. Comput Stat Data Anal 2016;99:105–14. https://doi.org/10.1016/j.csda.2016.01.013.

19. Diao, G, Yuan, A. A class of semiparametric cure models with current status data. Lifetime Data Anal 2019;25:26–51. https://doi.org/10.1007/s10985-018-9420-0.

20. Chen, CM, Lu, TFC, Chen, MH, Hsu, CM. Semiparametric transformation models for current status data with informative censoring. Biom J 2012;54:641–56. https://doi.org/10.1002/bimj.201100131.

21. Wang, P, Zhao, H, Du, M, Sun, J. Inference on semiparametric transformation model with general interval-censored failure time data. J Nonparametric Statistics 2018;30:758–73. https://doi.org/10.1080/10485252.2018.1478091.

22. Wang, S, Wang, C, Wang, P, Sun, J. Estimation of the additive hazards model with case k interval-censored failure time data in the presence of informative censoring. Comput Stat Data Anal 2020;144:106891. https://doi.org/10.1016/j.csda.2019.106891.

23. Zhao, B, Wang, S, Wang, C, Sun, J. New methods for the additive hazards model with the informatively interval-censored failure time data. Biom J 2021;63:1507–25. https://doi.org/10.1002/bimj.202000288.

24. Zhao, B, Wang, S, Wang, C. A new frailty-based gee approach of the informatively case k interval-censored failure time data. Commun Stat Theor Methods 2024;53:6527–43. https://doi.org/10.1080/03610926.2023.2247505.

25. Ma, L, Hu, T, Sun, J. Sieve maximum likelihood regression analysis of dependent current status data. Biometrika 2015;102:731–8. https://doi.org/10.1093/biomet/asv020.

26. Zhao, S, Hu, T, Ma, L, Wang, P, Sun, J. Regression analysis of informative current status data with the additive hazards model. Lifetime Data Anal 2015;21:241–58. https://doi.org/10.1007/s10985-014-9303-y.

27. Xu, D, Zhao, S, Hu, T, Yu, M, Sun, J. Regression analysis of informative current status data with the semiparametric linear transformation model. J Appl Stat 2019;46:187–202. https://doi.org/10.1080/02664763.2018.1466870.

28. Zhao, S, Dong, L, Sun, J. Regression analysis of interval-censored data with informative observation times under the accelerated failure time model. J Syst Sci Complex 2022;35:1520–34. https://doi.org/10.1007/s11424-021-0209-y.

29. Ding, Y, Sun, T. Copula models and diagnostics for multivariate interval-censored data. In: Emerging topics in modeling interval-censored survival data. New York: Springer International Publishing; 2022:141–65. https://doi.org/10.1007/978-3-031-12366-5_8.

30. Liu, Y, Hu, T, Sun, J. Regression analysis of current status data in the presence of a cured subgroup and dependent censoring. Lifetime Data Anal 2017;23:626–50. https://doi.org/10.1007/s10985-016-9382-z.

31. Wang, S, Wang, C, Sun, J. An additive hazards cure model with informative interval censoring. Lifetime Data Anal 2021;27:244–68. https://doi.org/10.1007/s10985-021-09515-7.

32. Wang, S, Xu, D, Wang, C, Sun, J. Estimation of linear transformation cure models with informatively interval-censored failure time data. J Nonparametric Statistics 2023;35:283–301. https://doi.org/10.1080/10485252.2022.2148667.

33. Deresa, NW, Keilegom, IV. Copula based cox proportional hazards models for dependent censoring. J Am Stat Assoc 2024;119:1044–54. https://doi.org/10.1080/01621459.2022.2161387.

34. Delhelle, M, Van Keilegom, I. Copula based dependent censoring in cure models. Test 2025:1–22. https://doi.org/10.1007/s11749-024-00961-7.

35. Nelsen, RB. An introduction to copulas. New York: Springer; 2006.

36. Huang, J, Rossini, A. Sieve estimation for the proportional-odds failure-time regression model with interval censoring. J Am Stat Assoc 1997;92:960–7. https://doi.org/10.1080/01621459.1997.10474050.

37. Efron, B, Tibshirani, RJ. An introduction to the bootstrap. Boca Raton: Chapman and Hall/CRC; 1994. https://doi.org/10.1201/9780429246593.

38. Pintilie, M. Competing risks: a practical perspective. Chichester: John Wiley & Sons; 2006. https://doi.org/10.1002/9780470870709.

39. Deresa, NW, Van Keilegom, I. Semiparametric transformation models for survival data with dependent censoring. Ann Inst Stat Math 2025;77:425–57. https://doi.org/10.1007/s10463-024-00921-w.

40. Van Der Vaart, AW, Wellner, JA. Weak convergence. New York: Springer; 1996. https://doi.org/10.1007/978-1-4757-2545-2_3.

41. Pollard, D. Convergence of stochastic processes. New York: Springer Science & Business Media; 2012.

42. Lorentz, GG. Bernstein polynomials. New York: American Mathematical Soc; 1986.

43. Shen, X, Wong, WH. Convergence rate of sieve estimates. Ann Stat 1994;22:580–615. https://doi.org/10.1214/aos/1176325486.

44. Chen, X, Fan, Y, Tsyrennikov, V. Efficient estimation of semiparametric multivariate copula models. J Am Stat Assoc 2006;101:1228–40. https://doi.org/10.1198/016214506000000311.

45. Shen, X. On methods of sieves and penalization. Ann Stat 1997;25:2555–91. https://doi.org/10.1214/aos/1030741085.

Received: 2025-04-10

Accepted: 2025-08-30

Published Online: 2025-10-09
