
Modeling of Clinical Phenotypes Assessed at Discrete Study Visits

  • Emily Huang, Ravi Varadhan and Michelle Carlson
Published/Copyright: August 2, 2019

Abstract

In studies of clinical phenotypes, such as dementia, disability, and frailty, participants are typically assessed at in-person clinic visits. Thus, the precise time of onset for the phenotype is unknown. The discreteness of the clinic visits yields grouped event time data. We investigate how to perform a risk factor analysis in the case of grouped data. Since visits can be months to years apart, the number of ties can be large, causing the exact tie-handling method of the Cox model to be computationally infeasible. We propose two new, computationally efficient approximations to the exact method: a Laplace approximation and an analytic approximation. Through extensive simulation studies, we compare these new methods to the Prentice-Gloeckler model and the Cox model using Efron’s and Breslow’s tie-handling methods. In addition, we compare the methods in an application to a large cohort study (N = 3,605) on the development of clinical frailty in older adults. In our simulations, the Laplace approximation has low bias in all settings, and the analytic approximation has low bias in settings where the regression coefficient is not large in magnitude. Their corresponding confidence intervals also have approximately the nominal coverage probability. In the data application, the results from the approximations are nearly identical to those of the Prentice-Gloeckler model.

1 Introduction

Older adults are at increased risk of experiencing poor health outcomes, such as frailty, disability, and dementia. Identifying risk factors for these outcomes is a major goal of prospectively conducted epidemiological studies of aging. Many aging-related phenotypes can only be assessed at in-person study visits. For these phenotypes, a participant’s time of onset cannot be known precisely. It is known up to the time interval between the study visit in which the outcome was first observed and the previous visit. This type of data in which the true time of onset is only known to be in a certain group is called grouped event time data, and is the focus of our paper. Grouped data arises whenever the outcome of interest is assessed at discrete study visits, so it is common not only in aging research but also in many other fields. Some examples of epidemiological studies that have collected grouped event data include the Cardiovascular Health Study, the Multiethnic Study of Atherosclerosis, and the Women’s Health and Aging Study (Fried et al. 1991; Bild et al. 2002; Onder et al. 2005).

The structure of event times for grouped data differs from that of interval censored event times. In the grouped data setting, the intervals are fixed by the number and timing of visits given by the study design, while in the interval-censored data setting the intervals are random. In the grouped data setting, all the observed event times will cluster into a small number of groups, which would correspond to the study visits, whereas in the case of interval censored data the number of intervals can be quite large.

We are interested in examining how the grouped nature of the data affects risk factor analysis in prospective cohort studies. One modeling approach is the Prentice-Gloeckler (PG) model, which accounts for the grouped nature of the data (Prentice and Gloeckler 1978). A straightforward, but less principled approach is to use the Cox proportional hazards (PH) model, the workhorse of survival analysis, substituting the unknown precise onset of the outcome with the visit time at which the outcome is first observed. The Cox model estimates the impact of covariates on the hazard function, which is based on continuous event time, whereas the observed event times in the grouped data setting are discrete. Due to the grouped nature of the event time, there will be a substantial number of ties in event times, and this needs to be addressed while using the Cox PH model. There are three well-known tie-handling techniques: the exact method, the method by Breslow, and the method by Efron. A challenge of the exact method is that it can be computationally intractable when there is a large number of ties in the data. We propose two new approximations to the exact method that are computationally efficient, even for a very large number of tied failure times. The first approximation is the Laplace approximation to an integral (Erdelyi 1956, p. 36–37), and the second is an analytic approximation to the integral.

Through simulation, we compare our two approximations to the PG model, and to the Cox PH model using Breslow’s and Efron’s tie-handling schemes. In general, the proposed methods yield estimators with low bias, and confidence intervals with the desired coverage probability. Also, we apply the proposed methods to the Cardiovascular Health Study (Fried et al. 1991), a cohort study on the development of clinical frailty in older adults. In our data application, the methods give results comparable to the Prentice-Gloeckler model. The R code for the proposed methods is provided online; the link can be found in the Supplementary Materials section at the end of this paper. Issues of competing events, although relevant given the type of events considered in our motivating example, are not addressed in this paper.

Currently, Breslow’s approximation is the default method for handling tied data in most software packages, such as Stata and SAS. Hertz-Picciotto and Rockhill (1997) and Scheike and Sun (2007) found that Breslow’s approximation tends to underestimate the coefficients in the Cox model and that Efron’s approximation performs better than Breslow’s approximation. The simulation results of this paper are consistent with those findings.

This paper is organized as follows. The assumptions are stated in Section 2. The PG model and the Cox PH Model, including the exact tie-handling method and the tie-handling methods by Breslow and Efron, are introduced in Section 3. We present our proposed approximations to the exact method in Section 4. The data application to the Cardiovascular Health Study and the simulation studies are described in Sections 5 and 6, respectively. Concluding remarks are given in Section 7.

2 Context and assumptions

We consider a population whose members have not experienced the event of interest at baseline (t = 0), which is entry into the study. For each member, we assume her/his hazard function $\lambda_\beta(t \mid Z)$ satisfies:

(1) $\lambda_\beta(t \mid Z) = \lambda_0(t)\exp(Z_1\beta_1 + Z_2\beta_2 + \cdots + Z_p\beta_p),$

which is the Cox proportional hazards assumption. In (1), $Z = (Z_1, Z_2, \ldots, Z_p)$ is a vector of covariates collected at baseline, and $\beta = (\beta_1, \ldots, \beta_p)$ is a vector of unknown coefficients corresponding to $Z$. The function of time, $\lambda_0(t)$, is the hazard function corresponding to $Z = (0, \ldots, 0)$. The Cox model assumes that the event time $T$ is absolutely continuous. Our objective is to estimate the coefficient vector $\beta$, using data from a prospective cohort study where the participants are assessed at in-person visits. During the baseline visit (t = 0), the covariates $Z$ are collected for each participant. Let $Z_i$ denote the vector of baseline covariates $(Z_{i1}, Z_{i2}, \ldots, Z_{ip})$ for participant i. After the baseline visit, follow-up visits are used to determine whether or not participants have experienced the event of interest. We assume that the spacing between visits is uniform across participants, though the visits themselves need not occur at regular intervals. Let m denote the number of follow-up visits. We label the baseline visit as Visit 0, and the follow-up visits as Visits 1 through m. Let $t_k$ denote the length of time between Visit 0 and Visit k. Missing data can result due to early drop-out. However, we assume no intermittent missing data, i.e. no cases where a participant missed Visit j but attended Visit j', where j' > j.

For any given participant i, the observed data vector is $(Z_i, \delta_i, V_i)$, where $Z_i$ is the vector of covariates, $\delta_i$ is an indicator of censoring, and $V_i$ is an integer between 0 and m. A participant is censored if the study ended or the participant dropped out before she/he experienced the event. The censoring indicator $\delta_i$ takes the value 0 if participant i is censored, and 1 otherwise. If $\delta_i = 0$, $V_i$ is the last visit that participant i attended. If $\delta_i = 1$, $V_i$ is the first visit where participant i was observed to have experienced the event, and we refer to $V_i$ as the “observed event time.” From the observed data, the true (i.e. precise) event time is unknown, but it is known to occur between Visits $V_i - 1$ and $V_i$. We make two assumptions about the observed data. First, the data vectors $(Z_i, \delta_i, V_i)$, $i = 1, \ldots, n$, are independent and identically distributed from a true, unknown joint distribution P on $(Z, \delta, V)$. Second, for participants who are censored due to dropout, their dropout time is independent of their true event time, conditional on the observed covariates.

A common attribute of grouped data is heavy ties in the observed event times. This occurs because adjacent visits can be months to years apart. For example, in the Cardiovascular Health Study, the first follow-up assessment of the clinical frailty outcome occurred three years after the baseline visit. There were 307 participants whose observed event time was the first follow-up visit, and 187 participants whose observed event time was the second follow-up visit. Define $I_k$ as the set of participants who had the event at Visit k, i.e. $I_k = \{i : \delta_i = 1, V_i = k\}$. Let $d_k$ denote the number of participants tied at Visit k, i.e. the number of elements in the set $I_k$.

We compare the performance of various approaches to estimate β. Existing approaches include the Prentice-Gloeckler model and the Cox model with tie-handling using the exact method, Breslow’s method, or Efron’s method. We propose two new, computationally efficient approximations to the exact method.

3 Existing approaches

3.1 Prentice-Gloeckler model

Prentice and Gloeckler (1978) apply maximum likelihood to estimate $\beta$, while accounting for the grouped nature of the data. When deriving the likelihood function, they make the Cox proportional hazards assumption. Let $t_0 = 0$ and $t_{m+1} = \infty$. For any given $i = 1, \ldots, n$, let $T_i$ be the unobserved continuous event time of participant i. Let $T_i^\ast$ be the first visit where participant i would be observed to have experienced the event under no censoring, i.e.

$T_i^\ast = v$ if and only if $T_i \in (t_{v-1}, t_v]$.

We refer to $T_i^\ast$ as the “grouped event time.” The grouped event time $T_i^\ast$ is defined for every participant, but is only observed for uncensored participants. Throughout the rest of the paper, we use the dot symbol to represent dot products, i.e. $a \cdot b$ is the dot product of vectors a and b. The likelihood function derived by Prentice and Gloeckler (1978) is:

(2) $L(\beta) = \prod_{i=1}^{n} P(T_i^\ast = v_i \mid Z_i)^{\delta_i}\, P(T_i^\ast > v_i \mid Z_i)^{1 - \delta_i},$

where

(3) $P(T_i^\ast = v_i \mid Z_i) = P(t_{v_i - 1} < T_i \le t_{v_i} \mid Z_i) = \left(1 - \alpha_{v_i}^{\exp\{Z_i \cdot \beta\}}\right) \prod_{j=1}^{v_i - 1} \alpha_j^{\exp\{Z_i \cdot \beta\}},$
(4) $\alpha_j = \exp\left\{-\int_{t_{j-1}}^{t_j} \lambda_0(u)\, du\right\},$
(5) $P(T_i^\ast > v_i \mid Z_i) = P(T_i > t_{v_i} \mid Z_i) = \prod_{j=1}^{v_i} \alpha_j^{\exp\{Z_i \cdot \beta\}}.$

The PG model is equivalent to using a generalized linear model of the binomial family with a complementary log-log link (Allison 1982). The Cox model and the PG model both make the proportional hazards assumption, and neither requires the user to assume a specific form for the baseline hazard $\lambda_0$. A distinction between the models is that the former leaves the baseline hazard function unspecified, while the latter represents it through the $\alpha_j$ terms defined in (4); in the PG model, the $\alpha_j$ terms are estimated from the data without requiring assumptions about the functional form of $\lambda_0(t)$. The PG model is not appropriate if the number of parameters is close to the number of uncensored subjects, which occurs when the visits are frequent and the tie sizes are small (Therneau and Grambsch 2010).
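
To make the equivalence with a complementary log-log binomial GLM concrete, the following R sketch (ours, not the code released with the paper) expands a data set with columns V, delta, and baseline covariates into one record per participant-interval and fits the PG model with glm. The data layout, the column names, and the helper name pg_fit are assumptions for illustration.

pg_fit <- function(dat, covariates) {
  # Expand to one record per participant-interval: intervals 1, ..., V - 1
  # get outcome 0, and interval V gets outcome delta (participants with
  # V = 0 contribute nothing to the likelihood).
  long <- do.call(rbind, lapply(seq_len(nrow(dat)), function(i) {
    v <- dat$V[i]
    if (v == 0) return(NULL)
    covs <- dat[rep(i, v), covariates, drop = FALSE]
    rownames(covs) <- NULL
    data.frame(interval = factor(seq_len(v), levels = seq_len(max(dat$V))),
               event    = c(rep(0, v - 1), dat$delta[i]),
               covs)
  }))
  # The complementary log-log link recovers the grouped proportional hazards
  # coefficients; the interval terms encode the baseline alpha_j's of (4).
  glm(reformulate(c("interval", covariates), response = "event"),
      family = binomial(link = "cloglog"), data = long)
}

For example, a call such as pg_fit(chs, c("age", "education", "gender", "smoke", "crp")) would return PG-type coefficient estimates for the covariates (the chs object and its column names being hypothetical).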

3.2 Cox model with existing tie-handling methods

Let $R_i(\beta) = \exp(Z_i \cdot \beta)$ be the risk score of participant i. Let the indicator function $1(E)$ equal 1 if event E is true and 0 otherwise. In the case of no ties in the observed event times, the vector $\beta$ can be estimated as the maximizer of the partial likelihood function, which is

(6) $PL(\beta) = \prod_{i : \delta_i = 1} \frac{R_i(\beta)}{\sum_{j=1}^{n} 1(V_j \ge V_i)\, R_j(\beta)}.$

However, ties are common in our context of grouped data, as discussed in Section 2. We present the existing methods for tie-handling using a simple example. Suppose that participants 1 and 2 are tied at Visit k. Their contribution to the partial likelihood is either equal to:

(7) $\frac{R_1}{R_1 + R_2 + \sum_{j=3}^{n} 1(V_j \ge k)\, R_j} \times \frac{R_2}{R_2 + \sum_{j=3}^{n} 1(V_j \ge k)\, R_j},$

if participant 1 experienced the event before participant 2, or equal to:

(8) $\frac{R_2}{R_1 + R_2 + \sum_{j=3}^{n} 1(V_j \ge k)\, R_j} \times \frac{R_1}{R_1 + \sum_{j=3}^{n} 1(V_j \ge k)\, R_j},$

if participant 2 experienced the event before participant 1. Both (7) and (8) have the term $\sum_{j=3}^{n} 1(V_j \ge k)\, R_j$ in the first and second denominators; the subjects contributing to this term are those who are censored at or after Visit k or have an observed event time after Visit k. The difference between (7) and (8) arises from the presence of $R_1$ or $R_2$ in the second denominator. Since participants 1 and 2 have the same observed event times, we do not know which is correct.

In the exact approach, one takes the average of (7) and (8) (Kalbfleisch and Prentice 2002). The exact approach is computationally feasible when the number of tied subjects $d_k$ is small, as in this case. However, it is computationally prohibitive for even moderately large $d_k$, since the number of possibilities to average over, $d_k!$, becomes massive. For example, when the tie size $d_k$ is 10, the number of possibilities is 10! = 3,628,800. For the frailty outcome in the Cardiovascular Health Study, we encounter 307 and 187 ties at the two follow-up visits, making the exact method computationally infeasible. Motivated by this, we propose two computationally efficient approximations of the exact approach in Section 4.

Commonly used tie-handling methods in the Cox PH model are those proposed by Breslow and Efron (Therneau and Grambsch 2010). Rather than taking the average of the dk! possibilities, Breslow’s and Efron’s methods each involve taking a product of two terms, as in (7) and (8). In Breslow’s method, the sum of the risk scores of participants 1 and 2 appears in both denominators, i. e.

(9) $\frac{R_1}{R_1 + R_2 + \sum_{j=3}^{n} 1(V_j \ge k)\, R_j} \times \frac{R_2}{R_1 + R_2 + \sum_{j=3}^{n} 1(V_j \ge k)\, R_j}.$

In Efron’s method, the full sum is in the first denominator, while the second denominator uses the average of R1 and R2, i. e.

(10) $\frac{R_1}{R_1 + R_2 + \sum_{j=3}^{n} 1(V_j \ge k)\, R_j} \times \frac{R_2}{0.5(R_1 + R_2) + \sum_{j=3}^{n} 1(V_j \ge k)\, R_j}.$

More generally, if there are $d_k$ participants tied at Visit k, the exact method will take the average of all the $d_k!$ possibilities, whereas Breslow’s and Efron’s methods only involve taking a product of $d_k$ terms. Without loss of generality, assume that the $d_k$ participants tied at Visit k are participants $1, 2, \ldots, d_k$. Then the contribution to the partial likelihood at Visit k, for Breslow’s and Efron’s methods, respectively, can be written as follows:

(11) $\prod_{r=1}^{d_k} \frac{R_r}{\sum_{j=1}^{d_k} R_j + \sum_{j=d_k+1}^{n} 1(V_j \ge k)\, R_j},$
(12) $\prod_{r=1}^{d_k} \frac{R_r}{\frac{d_k - r + 1}{d_k} \sum_{j=1}^{d_k} R_j + \sum_{j=d_k+1}^{n} 1(V_j \ge k)\, R_j}.$
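
As an illustration of (11) and (12), the following R sketch (our own, with made-up argument names) evaluates the logarithm of the visit-k contribution under Breslow's and Efron's schemes, given the risk scores of the tied subjects and of the other subjects still at risk.

log_contrib_ties <- function(r_tied, r_later, method = c("breslow", "efron")) {
  # r_tied:  risk scores R of the d_k subjects tied at Visit k
  # r_later: risk scores of the remaining subjects still at risk at Visit k
  method <- match.arg(method)
  d <- length(r_tied)
  denom_rest <- sum(r_later)   # second sum in the denominators of (11)-(12)
  denom_tied <- sum(r_tied)    # sum over the d_k tied subjects
  sum(vapply(seq_len(d), function(r) {
    # Breslow keeps the full tied sum in every denominator; Efron down-weights it.
    w <- if (method == "breslow") 1 else (d - r + 1) / d
    log(r_tied[r]) - log(w * denom_tied + denom_rest)
  }, numeric(1)))
}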

4 Proposed approximations to the exact tie-handling method

As discussed in Section 3, computing the exact likelihood is difficult when the tie sizes are not small. For any given participant i and Visit k, where $k \in \{1, \ldots, m\}$, define

(13) $a_{i,k}(\beta) = \frac{R_i(\beta)}{\sum_{j \notin I_k} 1(V_j \ge k)\, R_j(\beta)}.$

For conciseness, we refer to this as $a_{i,k}$ throughout the rest of the paper. DeLong et al. (1994) showed that the contribution to the exact likelihood at Visit k (i.e. the average of the $d_k!$ possibilities for Visit k) can be represented as $(d_k!)^{-1}$ multiplied by

(14) $\int_{t=0}^{\infty} \prod_{i \in I_k} \left(1 - e^{-a_{i,k} t}\right) e^{-t}\, dt.$

Our proposed method is to approximate (14), using two new approaches: the Laplace approximation and a simple analytic approximation. We also considered a Gaussian quadrature approach, the Gauss-Laguerre quadrature (Davis and Rabinowitz 1984, p. 222–223). However, it did not work as well as the other approaches discussed here, mainly due to the difficulty in choosing the appropriate number of quadrature points (results not shown).
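
For intuition, the short R sketch below (ours, with arbitrary made-up risk scores) checks this representation numerically for a small tie set, comparing the average over all orderings with the integral in (14) evaluated by numerical quadrature.

perms <- function(v) {   # all permutations of a small vector
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) cbind(v[i], perms(v[-i]))))
}

r_tied <- c(1.2, 0.8, 1.5)   # made-up risk scores of d_k = 3 tied subjects
s_rest <- 10                 # made-up sum of risk scores of the other subjects at risk
a <- r_tied / s_rest         # the a_{i,k} of (13)

# Average of the d_k! ordered contributions (the exact tie-handling term).
exact_avg <- mean(apply(perms(seq_along(r_tied)), 1, function(p) {
  prod(sapply(seq_along(p), function(j)
    r_tied[p[j]] / (s_rest + sum(r_tied[p[j:length(p)]]))))
}))

# The integral (14) evaluated by quadrature; dividing it by d_k! should
# recover the average computed above.
integral <- integrate(function(t)
  exp(-t) * sapply(t, function(tt) prod(1 - exp(-a * tt))), 0, Inf)$value
c(exact_avg, integral / factorial(length(r_tied)))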

4.1 Laplace approximation

Our first approximation is based on the Laplace approximation to an integral (Erdelyi 1956) and will be referred to as “Laplace” throughout the paper. Here, we present the procedure for Laplace. The mathematical derivations are presented in Appendix A. Denote the integrand of (14) as P(t), i.e.

$P(t) = \prod_{i \in I_k} \left(1 - e^{-a_{i,k} t}\right) e^{-t}.$

Let $t_k^\ast$ be the value of t such that

$\sum_{i \in I_k} \frac{a_{i,k}}{e^{a_{i,k} t} - 1} = 1.$

Define

$\sigma_k = \sum_{i \in I_k} \frac{a_{i,k}^2\, e^{a_{i,k} t_k^\ast}}{\left(e^{a_{i,k} t_k^\ast} - 1\right)^2}.$

Let $X_k$ be a Normally distributed random variable with mean $t_k^\ast$ and variance $1/\sigma_k$. We estimate the unknown coefficient vector $\beta$ as the value of $\beta$ that maximizes the function

(15) $\prod_{k=1}^{m} P(t_k^\ast)\, \sqrt{2\pi/\sigma_k}\; \Pr(X_k > 0).$

This is equivalent to finding the maximizer of the function

$f(\beta) = \sum_{k=1}^{m} \left[\log P(t_k^\ast) - \tfrac{1}{2} \log \sigma_k + \log \Pr(X_k > 0)\right],$

which is obtained by taking the logarithm of (15) and omitting the terms that do not depend on $\beta$. We denote the Laplace estimator of $\beta$ as $\hat{\beta}_{\text{Laplace}}$. For any given matrix M, let $[M]_{ij}$ denote the element in the i-th row and j-th column of M, and let $(M)^{-1}$ denote the inverse of M. Let $H_f(\beta)$ denote the Hessian of $f(\beta)$, i.e. for any given i, j, $[H_f(\beta)]_{ij} = \partial^2 f(\beta) / \partial\beta_i \partial\beta_j$. The standard error of the i-th element in $\hat{\beta}_{\text{Laplace}}$ is estimated by $\sqrt{\left[\left(-H_f(\hat{\beta}_{\text{Laplace}})\right)^{-1}\right]_{ii}}$.
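
The following R sketch (our illustration for a single covariate, not the authors' released code) implements the steps above: it solves for $t_k^\ast$ with uniroot, assembles $f(\beta)$, maximizes it with optim, and takes standard errors from the numerically differentiated Hessian. The inputs Z, delta, and V follow the layout of Section 2.

laplace_loglik <- function(beta, Z, delta, V) {
  # f(beta): sum over visits of log P(t*_k) - (1/2) log sigma_k + log Pr(X_k > 0)
  R <- exp(Z * beta)
  sum(sapply(sort(unique(V[delta == 1])), function(k) {
    tied  <- delta == 1 & V == k
    a     <- R[tied] / (sum(R[V >= k]) - sum(R[tied]))   # a_{i,k} from (13)
    tstar <- uniroot(function(t) sum(a / (exp(a * t) - 1)) - 1,
                     c(1e-8, 1e6))$root                   # solves the t*_k equation
    sig   <- sum(a^2 * exp(a * tstar) / (exp(a * tstar) - 1)^2)
    logP  <- sum(log(1 - exp(-a * tstar))) - tstar
    # Pr(X_k > 0) for X_k ~ Normal(t*_k, 1/sigma_k), on the log scale
    logP - 0.5 * log(sig) + pnorm(tstar * sqrt(sig), log.p = TRUE)
  }))
}

fit_laplace <- function(Z, delta, V) {
  negf <- function(b) -laplace_loglik(b, Z, delta, V)
  opt  <- optim(0, negf, method = "BFGS", hessian = TRUE)
  # opt$hessian approximates -H_f at the maximizer, giving the standard error above
  list(coef = opt$par, se = sqrt(diag(solve(opt$hessian))))
}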

4.2 Analytic approximation

We propose another approximation that will be referred to as “Analytic,” in which (14) is approximated as

(16) $\sqrt{2\pi d_k}\; e^{-(d_k + 1)} \prod_{i \in I_k} \left[1 - e^{-(d_k+1) a_{i,k}}\right],$

which is a function of $\beta$ due to the term $a_{i,k}$ that was defined in (13). The details of the derivation are provided in Appendix B. The unknown coefficient vector $\beta$ is estimated by the value of $\beta$ that maximizes the function

(17) $\prod_{k=1}^{m} \sqrt{2\pi d_k}\; e^{-(d_k+1)} \prod_{i \in I_k} \left[1 - e^{-(d_k+1) a_{i,k}}\right],$

which is the product of (16) over $k = 1, \ldots, m$. This is equivalent to maximizing the function

(18) $g(\beta) = \sum_{k=1}^{m} \sum_{i \in I_k} \log\left[1 - e^{-(d_k+1) a_{i,k}}\right],$

which results from taking the logarithm of (17) and omitting the terms that do not depend on $\beta$. We denote the Analytic estimator of $\beta$ as $\hat{\beta}_{\text{Analytic}}$. Let $H_g(\beta)$ denote the Hessian of $g(\beta)$. Analogous to the Laplace method, we estimate the standard error of the i-th element in $\hat{\beta}_{\text{Analytic}}$ by $\sqrt{\left[\left(-H_g(\hat{\beta}_{\text{Analytic}})\right)^{-1}\right]_{ii}}$.
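
A companion sketch for the Analytic approximation is even shorter, since $g(\beta)$ in (18) needs only the $a_{i,k}$ terms; again this is our illustration for a single covariate, with the same assumed inputs Z, delta, and V as above.

analytic_loglik <- function(beta, Z, delta, V) {
  # g(beta) from (18)
  R <- exp(Z * beta)
  sum(sapply(sort(unique(V[delta == 1])), function(k) {
    tied <- delta == 1 & V == k
    d    <- sum(tied)
    a    <- R[tied] / (sum(R[V >= k]) - sum(R[tied]))   # a_{i,k} from (13)
    sum(log(1 - exp(-(d + 1) * a)))
  }))
}

fit_analytic <- function(Z, delta, V) {
  negg <- function(b) -analytic_loglik(b, Z, delta, V)
  opt  <- optim(0, negg, method = "BFGS", hessian = TRUE)
  list(coef = opt$par, se = sqrt(diag(solve(opt$hessian))))
}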

5 Application to the Cardiovascular Health Study

The Cardiovascular Health Study (CHS) was a study of 5,201 older adults enrolled between 1989 and 1990, and followed up at annual clinic visits until 1999 (Fried et al. 1991). One of the outcomes collected in CHS was clinical frailty, which is a syndrome defined as the presence of three or more of the following: unintentional weight loss, exhaustion, low grip strength, slow walking speed, and low physical activity (Fried et al. 2001). Frailty status was assessed at the baseline visit and at two follow-up visits, occurring three and seven years after the baseline visit, respectively. We fitted the Cox and PG models to time-to-frailty data collected in CHS. It should be noted that, in addition to the cohort of 5,201 participants, a second cohort of 687 participants was recruited in 1992. We focus on the first cohort in our analysis. This is because the follow-up schedule for frailty differs between the two cohorts. The second cohort had only a single follow-up assessment of frailty.

In our analysis of time-to-frailty, five covariates were used: age, gender (0 = female, 1 = male), whether the person had ever smoked (0 = no, 1 = yes), whether the person had received education beyond high school (0 = no, 1 = yes), and C-reactive protein (CRP) level. These covariates were collected at the baseline visit. We standardized both age and CRP level. Age, in units of years, was standardized by subtracting 65 and dividing by 10. CRP level, in units of mg/L, was standardized by dividing by the interquartile range (2.15 mg/L). The fitted models included main terms for the covariates and no interaction terms. We let δ be an indicator of whether the participant was assessed as frail during her/his follow-up (0 if no, 1 if yes). If δ = 0, we let V be the last follow-up visit at which the participant’s frailty status was assessed. If δ = 1, we let V be the first follow-up visit at which the participant was assessed as frail.

We included only participants who satisfied the following conditions: (i) were non-frail at the baseline visit, (ii) frailty status was assessed at the first follow-up only, or at both follow-ups, (iii) covariates were non-missing. The resulting sample size was n = 3,605. The purpose of (i) was to restrict attention to participants who were not yet frail at the baseline visit, since we sought to study the development of frailty after the baseline visit. Criterion (ii) ensured that there was some follow-up on the frailty status after the baseline visit, because those without any such follow-up would not affect the model fits. Also, criterion (ii) excluded those whose frailty status was observed at the second follow-up but not the first follow-up. If the status at the second follow-up was frail, then δ = 1, but it could not be deduced whether V should be 1 or 2 (i.e. whether the person was already frail by the first follow-up). If the status at the second follow-up was non-frail, there was also some ambiguity, since the person might have been non-frail through the second follow-up, or have been frail at the first follow-up and transitioned to being non-frail by the second follow-up. A potential impact of excluding participants who did not attend the first follow-up visit is that the sample used in our analysis may under-represent a specific subtype of participants. For example, the participants who missed the first follow-up visit may be more likely to have severe health problems, e.g. some of them might have missed their study visit due to their health problems. Criterion (iii) was required since participants without the full set of covariates would not impact the model fits.

Web Figure 1 in the Supplementary Materials presents the empirical marginal distributions of age, gender, smoking status, education level, and CRP level, after excluding participants as described above. The mean age is 72 (range 64–95) and the proportion of female participants is 0.58. Table 1 presents the empirical joint distribution on (δ,V), after excluding participants as described above. Each cell gives the number of participants, followed by the corresponding sample proportion in parentheses. We evaluated the validity of the proportional hazards assumption by testing the null hypothesis that the correlation between the scaled Schoenfeld residuals and time is zero, separately for each covariate. The results supported the proportional hazards assumption; the p-value was 0.10 for age, 0.77 for education level, 0.48 for gender, 0.33 for smoking status, and 0.80 for CRP level.
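
A check of this kind can be run with the survival package, for example as below; the data frame chs and its column names are placeholders for the analysis data set described above, not the authors' actual code.

library(survival)
fit <- coxph(Surv(V, delta) ~ age + education + gender + smoke + crp,
             data = chs, ties = "efron")
cox.zph(fit)   # per-covariate checks of the PH assumption via scaled Schoenfeld residuals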

Table 1:

Distribution of (δ,V) in Cardiovascular Health Study Application.

           V = 1           V = 2
δ = 0      884 (0.25)      2227 (0.62)
δ = 1      307 (0.09)      187 (0.05)

Table 2 presents the regression coefficient estimates in the top panel, with the corresponding standard error estimates in the middle panel. The values are shown up to four decimal places. The results in this table and those throughout the paper were computed using R. When fitting the Cox model, we applied our proposed tie-handling methods Laplace and Analytic, as well as the existing methods by Efron and Breslow. When assessing each of these methods, we use PG as a reference since it accounts for the discreteness of the data directly. All of the methods were computationally efficient; in this data application, the running time was 0.01 seconds for Efron’s method, 0.02 seconds for Breslow’s method, 0.3 seconds for PG, 1.2 seconds for Analytic, and 1.4 seconds for Laplace.

Table 2:

Regression Coefficient Estimates, Standard Error Estimates, and 95 % Confidence Intervals in the Cardiovascular Health Study Application. Panel (a) presents the regression coefficient estimates and Panel (b) presents the standard error estimates. Panel (c) presents both the estimates of and the 95 % confidence intervals for the exponentiated regression coefficients. Each confidence interval is obtained by taking the corresponding regression coefficient estimate (in Panel A), adding and subtracting the quantity of 1.96 multiplied by the standard error estimate (in Panel B), and then exponentiating the results. The covariate age, in units of years, has been standardized by subtracting 65 and dividing by 10. The covariate CRP level, in units of mg/L, has been standardized by dividing by the interquartile range (2.15 mg/L).

(a) Regression coefficient estimates.

             PG        Cox PH
                       Laplace   Analytic   Efron     Breslow
age          1.1127    1.1127    1.1069     1.1063    1.0263
education   -0.2874   -0.2874   -0.2860    -0.2859   -0.2669
gender      -0.5384   -0.5384   -0.5361    -0.5361   -0.5045
smoke        0.1856    0.1856    0.1845     0.1849    0.1708
CRP          0.0418    0.0418    0.0417     0.0418    0.0405

(b) Standard error estimates.

             PG        Cox PH
                       Laplace   Analytic   Efron     Breslow
age          0.0793    0.0793    0.0792     0.0786    0.0775
education    0.0924    0.0924    0.0924     0.0922    0.0922
gender       0.1003    0.1003    0.1003     0.1002    0.1000
smoke        0.0949    0.0949    0.0949     0.0948    0.0947
CRP          0.0123    0.0123    0.0123     0.0123    0.0125
(c) Point estimates and 95 % confidence intervals for exponentiated regression coefficients.

             PG                  Cox PH
                                 Laplace             Analytic            Efron               Breslow
age          3.04 [2.60, 3.55]   3.04 [2.60, 3.55]   3.02 [2.59, 3.53]   3.02 [2.59, 3.53]   2.79 [2.40, 3.25]
education    0.75 [0.63, 0.90]   0.75 [0.63, 0.90]   0.75 [0.63, 0.90]   0.75 [0.63, 0.90]   0.77 [0.64, 0.92]
gender       0.58 [0.48, 0.71]   0.58 [0.48, 0.71]   0.59 [0.48, 0.71]   0.59 [0.48, 0.71]   0.60 [0.50, 0.73]
smoke        1.20 [1.00, 1.45]   1.20 [1.00, 1.45]   1.20 [1.00, 1.45]   1.20 [1.00, 1.45]   1.19 [0.99, 1.43]
CRP          1.04 [1.02, 1.07]   1.04 [1.02, 1.07]   1.04 [1.02, 1.07]   1.04 [1.02, 1.07]   1.04 [1.02, 1.07]

In the upper two panels of Table 2, the results for Laplace perfectly match those for PG. For all covariates, the absolute difference between the Laplace estimate and the corresponding PG estimate was on the order of $10^{-6}$ or less. This applied to both the regression coefficient estimates and the standard error estimates. For each covariate, the magnitude of the percent difference of Laplace compared to PG, i.e. |(Laplace − PG)/PG × 100|, was at most 0.01 %. The results for Analytic and Efron’s method were also close to those of PG. For both Efron’s method and Analytic, the absolute difference from PG was at most $6 \times 10^{-3}$ in magnitude. For each covariate, the magnitude of the percent difference from PG was at most 0.8 % for Efron’s method and 0.6 % for Analytic. Compared to PG and Laplace, the regression coefficient estimates for Efron’s method and Analytic were slightly closer to the null, and those for Breslow’s method were attenuated even further toward the null. Breslow’s method yielded the most different results from PG. For the regression coefficient estimates, the magnitude of the percent difference of Breslow’s method compared to PG was as much as 8 %.

The bottom panel of Table 2 shows the 95 % confidence intervals for the exponentiated regression coefficients computed using each approach. For interpretability, we present the confidence intervals for the exponentiated regression coefficients rather than for the regression coefficients themselves. Each confidence interval is computed by taking the corresponding regression coefficient estimate, adding and subtracting the quantity 1.96 multiplied by the standard error estimate, and exponentiating both values. Beside each 95 % confidence interval, we also present the corresponding regression coefficient estimate (after exponentiation). Although there are some differences between the confidence intervals computed by the different approaches, the qualitative conclusions are the same. The various approaches all find that a person’s hazard for frailty increases with higher age, holding the other factors constant. Also, the hazard for frailty is lower for those who have education beyond high school than for those who have a high school education or less. Females have a higher hazard than males, holding other factors constant. All of the approaches concluded that the hazard for frailty is not statistically different between those who have ever smoked versus those who have never smoked, but the point estimate suggests that smoking raises the hazard. The hazard for frailty increases with higher CRP level, holding other factors constant.

6 Simulation studies

Through simulation, we evaluate the Cox and PG models in the grouped data setting. For the Cox model, the Laplace, Analytic, Efron’s, and Breslow’s tie-handling methods are separately applied. We compare the regression coefficient estimators with respect to bias, standard error, and root-mean-square error. Also, we evaluate the actual coverage probability of the nominal 95 % confidence interval for each method.

Each simulation study includes a large number of simulations, $n_{\text{sim}} = 10,000$. For any given simulation s, a study with sample size n is simulated. The contents of the study data set are described in the following paragraph. Using the data set, the parameter $\beta$ is estimated using the Cox model and the PG model, respectively. For a given estimator $\hat{\beta} = (\hat{\beta}_1, \ldots, \hat{\beta}_p)$, we approximate the bias, standard error, and root-mean-square error of the i-th element $\hat{\beta}_i$ as:

(19) $\text{bias} \approx \frac{1}{n_{\text{sim}}} \sum_{s=1}^{n_{\text{sim}}} \left(\hat{\beta}_{s,i} - \beta_i\right),$
(20) $\text{standard error} \approx \sqrt{\frac{1}{n_{\text{sim}}} \sum_{s=1}^{n_{\text{sim}}} \left(\hat{\beta}_{s,i} - \frac{1}{n_{\text{sim}}} \sum_{s'=1}^{n_{\text{sim}}} \hat{\beta}_{s',i}\right)^2},$
(21) $\text{root-mean-square error} \approx \sqrt{\frac{1}{n_{\text{sim}}} \sum_{s=1}^{n_{\text{sim}}} \left(\hat{\beta}_{s,i} - \beta_i\right)^2},$

where $\hat{\beta}_{s,i}$ is the estimate of $\beta_i$ in the s-th simulation. Also, we estimate the coverage probability of the 95 % confidence interval for $\beta_i$ by

$\frac{1}{n_{\text{sim}}} \sum_{s=1}^{n_{\text{sim}}} 1\left(\hat{\beta}_{s,i} - 1.96\, \widehat{SE}_{s,i} \le \beta_i \le \hat{\beta}_{s,i} + 1.96\, \widehat{SE}_{s,i}\right),$

where $\widehat{SE}_{s,i}$ is the estimate of the standard error of $\hat{\beta}_i$ from the s-th simulation.
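
Given vectors of per-simulation estimates and estimated standard errors for one coefficient, the summaries (19)–(21) and the coverage estimate can be computed as in the R sketch below (the function and argument names are ours).

sim_summary <- function(est, se, beta_true) {
  # est, se: per-simulation estimates and standard error estimates for one coefficient
  c(bias     = mean(est - beta_true),                     # (19)
    se       = sqrt(mean((est - mean(est))^2)),           # (20)
    rmse     = sqrt(mean((est - beta_true)^2)),           # (21)
    coverage = mean(est - 1.96 * se <= beta_true &
                    beta_true <= est + 1.96 * se))        # 95 % CI coverage
}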

Each simulation follows n participants according to a schedule which includes a baseline visit (Visit 0) and m follow-up visits (Visits $1, \ldots, m$), occurring at fixed times after the baseline visit. The collected data set includes covariate, censoring, and time information of the n participants, designated as $(Z_i, \delta_i, V_i)$, $i = 1, \ldots, n$. These random variables were previously defined in Section 2. For any given i, participant i does not miss any visits preceding Visit $V_i$. The simulation studies were organized into two sets, based on the structure of the data generating distribution. We refer to the two sets as the Single Covariate Simulations and the Data-Based Simulations.

6.1 Single covariate simulations

In these simulation studies, each simulation is a randomized controlled trial with m = 3 follow-up visits. The single covariate Z represents the treatment assignment (1 if control, 0 if treatment). Treatment assignment is random, with equal probability of being assigned to treatment or control, i. e. P(Z = z) = 0.5 for z ∈ { 0,1}.

The simulations also generate the censoring time C and the grouped event time T for each participant, from which δ and V are derived. The censoring time C represents a participant’s last visit before dropping out or being administratively censored. Each participant’s censoring time is an independent draw from the discrete distribution having point masses at 0, 1, 2, and 3, with probabilities of 0.08, 0.1, 0.1, and 0.72, respectively. A participant with C = 0 attends only the baseline visit, while one with C = 3 continues till the end of the trial. Censoring time is independent of event time and treatment assignment.

The grouped event time $T^\ast$ is the first visit following the true continuous event time, which is unobserved and denoted as $T$. The value $T^\ast$ can be 1, 2, 3, or $\infty$, where $T^\ast = \infty$ indicates that the event occurs after the trial has ended and thus would not be observed at any of the visits. We generate the true event time $T$ as a random draw from the continuous failure time distribution induced by the hazard function:

(22) $\lambda(t \mid Z) = \frac{t}{50} \exp\{Z\beta\}.$

We formulated (22) so that there would be approximately 25 % with observed event times, 25 % dropout, and 50 % administrative censoring in each randomized trial, if β = 0. The baseline hazard function, i.e. (22) under Z = 0, is $\lambda_0(t) = t/50$. We let $T$ have units of years, and define Visits 1, 2, and 3 to occur 2, 4, and 6 years after the baseline visit, respectively. Based on $T$, the grouped event time $T^\ast$ is derived as follows: 1 if $0 < T \le 2$, 2 if $2 < T \le 4$, 3 if $4 < T \le 6$, and $\infty$ if $T > 6$.
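
One convenient way to draw from (22) is inverse-transform sampling: the cumulative hazard is $\Lambda(t \mid Z) = (t^2/100)\exp\{Z\beta\}$, so setting the survival function equal to a Uniform(0, 1) draw U gives $T = 10\sqrt{-\log(U)\exp(-Z\beta)}$. The R sketch below (ours) generates Z, T, and the grouped time $T^\ast$; censoring is then applied as in (23) and (24) below.

simulate_single_covariate <- function(n, beta) {
  Z      <- rbinom(n, 1, 0.5)                              # treatment assignment
  Tcont  <- 10 * sqrt(-log(runif(n)) * exp(-Z * beta))     # continuous event time from (22)
  Tstar  <- ifelse(Tcont <= 2, 1,
            ifelse(Tcont <= 4, 2,
            ifelse(Tcont <= 6, 3, Inf)))                   # grouped event time
  data.frame(Z = Z, T_cont = Tcont, T_star = Tstar)        # censoring applied via (23)-(24)
}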

By definition, the censoring indicator δ and visit number V are as follows:

(23) $\delta = 1(T^\ast \le C),$
(24) $V = T^\ast \delta + C(1 - \delta).$

Note that, although $T^\ast$ can take on the value $\infty$, the visit number V is at most 3, since δ = 0 when $T^\ast = \infty$. Administrative censoring occurs if δ = 0 and V = 3, i.e. C = 3 and $T^\ast = \infty$. We vary β between 0, 0.2, 0.4, 0.6, 0.8, 1, and 2. Also, we vary the sample size n between 200, 500, and 2,000. For each β and sample size n, Table 3 shows the average number of ties at each visit. The average is taken across the 10,000 simulations for the given pair of β and n. For any given sample size and visit, the average number of ties grows with higher β (there are a few exceptions, namely for each sample size at Visit 3 when changing from β = 1 to β = 2). For instance, when n = 200, the average number of ties for Visit 1 is 7 if β = 0, 13 if β = 1, and 27 if β = 2. For a given β, the average number of ties grows with higher sample size n. For instance, when β = 0.2, the average number of ties at Visit 3 is 24 for n = 200, 60 for n = 500, and 240 for n = 2,000.

Table 3:

Single covariate simulation studies: average number of ties at each visit, for each setting of β and sample size n.

        n = 200                       n = 500                       n = 2,000
β       Visit 1  Visit 2  Visit 3     Visit 1  Visit 2  Visit 3     Visit 1  Visit 2  Visit 3
0          7       18       22           18       45       56          72      178      222
0.2        8       20       24           20       49       60          80      196      240
0.4        9       22       26           22       54       64          89      216      257
0.6       10       24       28           25       60       69         101      239      275
0.8       11       27       29           29       66       73         114      265      293
1         13       29       31           33       73       77         131      294      307
2         27       45       28           68      112       70         271      448      282

Table 4 presents the absolute bias, standard error, root-mean-square error, and coverage probability at n = 500. Web Tables 1 and 2 of the Supplementary Materials present the results for sample sizes n = 200 and 2,000. For a visual representation, the tabulated results are also plotted in Web Figures 2–4 in the Supplementary Materials. These figures exclude the results for Breslow’s method since it has larger bias than the other methods when β ≠ 0.

Table 4:

Bias, Standard Error (SE), Root-Mean-Square Error (RMSE), and Coverage Probability (CP) in Single Covariate Simulation Study with n = 500. For each value of β, the number of simulations is 10,000.

        Cox PH                                                                                                                   PG
        Laplace                        Analytic                       Efron's method                 Breslow's method
β       Bias    SE     RMSE   CP      Bias    SE     RMSE   CP      Bias    SE     RMSE   CP      Bias    SE     RMSE   CP      Bias    SE     RMSE   CP
0        0.002  0.189  0.189  0.951    0.002  0.187  0.187  0.953    0.002  0.188  0.188  0.951    0.002  0.176  0.176  0.965    0.002  0.189  0.189  0.951
0.2     -0.002  0.179  0.179  0.949   -0.004  0.178  0.178  0.952   -0.003  0.179  0.179  0.949   -0.017  0.166  0.167  0.965   -0.002  0.180  0.180  0.948
0.4      0.003  0.173  0.173  0.947   -0.001  0.172  0.172  0.949    0.002  0.173  0.173  0.947   -0.031  0.159  0.162  0.961    0.004  0.174  0.174  0.947
0.6      0.003  0.168  0.168  0.950   -0.005  0.166  0.166  0.953    0.001  0.168  0.168  0.950   -0.054  0.153  0.162  0.958    0.004  0.169  0.169  0.950
0.8      0.004  0.164  0.164  0.949   -0.011  0.161  0.161  0.952    0.000  0.164  0.164  0.949   -0.080  0.148  0.168  0.939    0.005  0.164  0.164  0.949
1        0.000  0.161  0.161  0.950   -0.025  0.156  0.158  0.951   -0.006  0.161  0.161  0.951   -0.115  0.144  0.185  0.903    0.001  0.162  0.162  0.950
2        0.006  0.157  0.158  0.952   -0.177  0.135  0.222  0.803   -0.033  0.153  0.157  0.945   -0.378  0.130  0.400  0.277    0.010  0.158  0.158  0.951

Laplace and PG have absolute bias close to zero, which also improves with higher sample size. At n = 200, the magnitude of the bias is at most 0.02 for Laplace and 0.03 for PG, across the choices of β. At n = 500, it is at most 0.006 for Laplace and 0.01 for PG. Also, their coverage probabilities are approximately the nominal coverage probability of 0.95. For every pair of β and n, their coverage probability is within 0.006 of 0.95. Laplace and PG have very comparable standard error and root-mean-square error. Efron’s method also performs well, which is consistent with the simulation results of Hertz-Picciotto and Rockhill (1997) and Scheike and Sun (2007) on tied data in the Cox model. Generally, Efron’s method has bias close to zero and coverage probability close to 0.95. In addition, its standard error can be less than that of Laplace and PG. However, it does not perform as well as Laplace and PG in the case of β = 2. At β = 2, the bias of Efron’s method is approximately −0.03 at n = 200, 500, and 2,000, while the bias of Laplace and PG becomes negligible with larger sample size. At n = 2,000, Efron’s method has a coverage probability of 0.92, while Laplace and PG have the correct coverage probability of 0.95. Analytic has the lowest standard error among Laplace, Analytic, Efron’s method, and PG. Its bias is close to zero for β ≤ 1. For β in this range, the magnitude of its bias is about 0.01 at n = 200, 0.02 at n = 500, and 0.02 at n = 2,000. However, for β = 2, its bias is approximately −0.17 at n = 200, 500, and 2,000.

Although it has the smallest standard error and negligible bias when β = 0, Breslow’s method has the worst bias of all the methods when β ≠ 0. In addition, its coverage probability can be low. For instance, at n = 2,000, the coverage probability of Breslow’s method is 0.85 when β = 0.8, 0.71 when β = 1, and almost 0 when β = 2. The results show that Breslow’s method can be unsuitable when the number of ties is large. Intuitively, this is because the approximation used in Breslow’s method, shown in (11), puts the risk score of each tied subject in every denominator, while in truth risk scores are omitted one at a time with each subsequent denominator. When the number of ties is large, this method may not yield a good approximation of the true partial likelihood.

This first set of simulation studies mimics a randomized controlled trial setting. To mimic a typical epidemiological setting without randomization to the exposures, we perform simulations based on the Cardiovascular Health Study.

6.2 Data-based simulations

In the data-based simulation studies, we generate each data set by sampling n participants with replacement from the Cardiovascular Health Study data set, discussed in Section 5. We focus on the outcome frailty and the covariates age, gender, education, CRP level, and smoking. The true β is taken to be the regression coefficient estimates when PG is applied to the CHS data set (i.e. the coefficients in the first column of Table 2). We vary n between 500, 1,000, and 2,000. For each sample size, Table 5 shows the average number of ties, across the 10,000 simulations, for each visit. The average number of ties grows with higher sample size, as expected. For each sample size, on average 9 % of subjects are tied at the first visit and 5 % of subjects are tied at the second visit. For even the smallest number in this table (26), the exact method of tie-handling is computationally challenging, since we would have to compute 26! ≈ $4 \times 10^{26}$ terms.
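
Each data-based replication can be generated and analyzed along the lines of the R sketch below; the data frame chs and its column names are placeholders, and only two of the methods are shown.

library(survival)
one_replication <- function(chs, n) {
  boot <- chs[sample(nrow(chs), n, replace = TRUE), ]   # resample with replacement
  # Fit two of the methods; the PG, Laplace, and Analytic fits would be added analogously.
  list(efron   = coef(coxph(Surv(V, delta) ~ age + education + gender + smoke + crp,
                            data = boot, ties = "efron")),
       breslow = coef(coxph(Surv(V, delta) ~ age + education + gender + smoke + crp,
                            data = boot, ties = "breslow")))
}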

Table 5:

Data-Based Simulation Studies: Average Number of Ties at Each Visit, for Each Sample Size n.

n         Visit 1    Visit 2
500          42         26
1,000        85         52
2,000       170        104

Table 6 presents the absolute bias, standard error, root-mean-square error, and coverage probability for n = 500. Web Tables 3 and 4 of the Supplementary Materials present the results for sample sizes n = 1,000 and 2,000. The tabulated results are plotted in Web Figures 5–7 in the Supplementary Materials. For all three sample sizes, Laplace, Analytic, Efron’s method, and PG have comparable results for bias, standard error, root-mean-square error, and coverage probability. Their bias is small. For all four methods, at n = 500, the magnitude of the bias is at most about 0.01 for each covariate. At n = 1,000, the magnitude of the bias is at most 0.006 for Laplace and PG, and at most 0.003 for Analytic and Efron’s method. At n = 2,000, it is at most 0.002 for Laplace and PG, and at most 0.007 for Analytic and Efron’s method. Their coverage probabilities are about 0.95, except for the covariate CRP at n = 500. In this case, Laplace, Analytic, Efron’s method, and PG each have coverage probability between 0.93 and 0.94, which is slightly lower than the nominal coverage probability of 0.95. At the larger sample sizes, n = 1,000 and n = 2,000, these methods have good coverage probability of 0.945 or higher for all covariates.

Table 6:

Bias, Standard Error (SE), Root-Mean-Square Error (RMSE), and Coverage Probability (CP) in Data-Based Simulation Study with n = 500. The number of simulations was 10,000.

            Cox PH                                                                                                                   PG
            Laplace                        Analytic                       Efron's method                 Breslow's method
            Bias    SE     RMSE   CP      Bias    SE     RMSE   CP      Bias    SE     RMSE   CP      Bias    SE     RMSE   CP      Bias    SE     RMSE   CP
age          0.004  0.224  1.181  0.949   -0.004  0.221  1.174  0.952   -0.005  0.220  1.173  0.951   -0.087  0.197  1.099  0.956    0.006  0.225  1.183  0.948
education   -0.008  0.258  0.739  0.951   -0.006  0.256  0.737  0.952   -0.006  0.256  0.737  0.951    0.013  0.241  0.721  0.962   -0.008  0.258  0.739  0.951
gender      -0.012  0.281  0.910  0.949   -0.009  0.279  0.907  0.950   -0.010  0.280  0.908  0.949    0.022  0.263  0.880  0.960   -0.013  0.282  0.911  0.949
smoke       -0.004  0.264  0.626  0.949   -0.005  0.262  0.625  0.950   -0.004  0.262  0.626  0.950   -0.018  0.246  0.618  0.964   -0.003  0.264  0.626  0.948
CRP          0.006  0.043  0.568  0.936    0.006  0.043  0.568  0.938    0.005  0.042  0.568  0.936    0.003  0.039  0.568  0.960    0.006  0.043  0.568  0.936

7 Discussion

The exact method of tie-handling is computationally intractable when there is a large number of ties in the observed event times. Hence, we propose two new, computationally efficient approximations, Laplace and Analytic. Both approximations performed well in the simulation studies and in a real data application to the Cardiovascular Health Study. The Laplace approximation yielded results that were almost identical to those of the PG model in all scenarios. Laplace and PG had low bias and generated confidence intervals with approximately the nominal coverage probability. Our results provide strong evidence that the Cox model, using the Laplace approximation for tie-handling, can serve as a substitute for the PG model in the grouped data setting. The Analytic approximation, given its remarkable simplicity, performed reasonably well in most scenarios, although it tended to have substantial bias for large β, as shown in the Single Covariate Simulations when β = 2. Hence, Analytic may be an attractive option for grouped event time data where the relative risks are not expected to be large. In fact, in the CHS application, the Analytic approximation provided results very similar to PG and Laplace. Our new approximations use a different approach than Breslow’s and Efron’s approximations because the former approximate the exact likelihood with ties, while the latter approximate the actual likelihood. A potential advantage of our approach compared to using the PG model is that fitting the PG model for high-dimensional data might be computationally intensive, relative to our simple approximations.

Both PG and Cox models make the proportional hazards assumption. Although the Cox model estimates the baseline hazard nonparametrically, and the PG model estimates it as the intercept terms, these two are essentially equivalent (with a complementary log-log link in the PG model). Thus, the essential comparison is in terms of different ways of handling the regression part, i. e. the covariate part of the model.

A potential area of future research is to examine the accuracy of our methods under cases with more covariates. Another potential area of future research is to determine the impact of a large number of visits on the results. In this paper, we considered the case of a small number of follow-up visits, e. g. up to 3 follow-up visits in the simulation studies and 2 in the CHS data example.

We would not recommend the use of Breslow’s tie-handling method in the grouped data setting. In simulation, it was observed that Breslow’s method had larger bias than the other methods and can have poor coverage probability, as low as 0.5. In the CHS application, the results of Breslow’s method were also quite different from the other methods.

For those who use the R programming language, we note that the exact method implemented in the “survival” package of R is not the exact method for tie-handling discussed in this paper. The exact method in the “survival” package corresponds to the discrete logistic model proposed by Cox (1972). The discrete logistic model has a different parameter of interest than the models considered in this paper. This is because, rather than making the proportional hazards assumption given in (1), the discrete logistic model assumes that the odds ratios of the hazards are proportional, i.e. $\operatorname{logit} \lambda_\beta(t \mid Z) = \operatorname{logit} \lambda_0(t) \exp(Z \cdot \beta)$. Thus, the term $\exp(Z \cdot \beta)$ represents the increase in the odds ratio, rather than the increase in the hazard. This model simplifies to the proportional hazards model in (1), if the event times are continuously distributed (Cox 1972).
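
In other words, a call like the one below fits the discrete logistic model rather than the exact tie-handling method of Section 3.2; dat and Z are placeholder names.

library(survival)
# "exact" here means the exact partial likelihood of the discrete logistic model,
# not the average over orderings approximated by Laplace and Analytic.
coxph(Surv(V, delta) ~ Z, data = dat, ties = "exact")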

The strong performance of the Cox PH model, using Laplace, Analytic, and Efron’s tie-handling methods, was an impressive and highly surprising finding, given the extremely discrete nature of the event times that we considered. There were 307 and 187 tied event times, respectively, at the first and second follow-up visits of frailty in the CHS study. One would not think that the Cox PH model would be appropriate for estimating relative risks in such a situation. To our knowledge, this type of result has not been reported in the literature before.

Award Identifier / Grant number: U01 FD004977-01

Funding statement: This work was supported by the U.S. Food and Drug Administration (Funder Id: http://doi.org/10.13039/100000038, Grant Number: U01 FD004977-01)

Award Identifier / Grant number: R01AG023629

Funding statement: National Institute on Aging (Funder Id: http://doi.org/10.13039/100000049, Grant Number: R01AG023629)

Award Identifier / Grant number: T32AG000247

Funding statement: National Institute on Aging (Funder Id: http://doi.org/10.13039/100000049, Grant Number: T32AG000247)

Appendix

A Derivation of the Laplace Approximation

Denote the integrand of (14) as P(t), i.e.

$P(t) = \prod_{i \in I_k} \left(1 - e^{-a_{i,k} t}\right) e^{-t}.$

We have

$\log P(t) = \sum_{i \in I_k} \log\left(1 - e^{-a_{i,k} t}\right) - t; \qquad \frac{\partial \log P(t)}{\partial t} = \sum_{i \in I_k} \frac{a_{i,k}}{e^{a_{i,k} t} - 1} - 1; \qquad \frac{\partial^2 \log P(t)}{\partial t^2} = -\sum_{i \in I_k} \frac{a_{i,k}^2\, e^{a_{i,k} t}}{\left(e^{a_{i,k} t} - 1\right)^2}.$

Let $t_k^\ast$ be the solution to $\partial \log P(t) / \partial t = 0$, i.e.

$\sum_{i \in I_k} \frac{a_{i,k}}{e^{a_{i,k} t_k^\ast} - 1} = 1.$

Define

$\sigma_k = -\frac{\partial^2 \log P(t_k^\ast)}{\partial t^2} = \sum_{i \in I_k} \frac{a_{i,k}^2\, e^{a_{i,k} t_k^\ast}}{\left(e^{a_{i,k} t_k^\ast} - 1\right)^2}.$

Using a Taylor expansion of $\log P(t)$ about the point $t = t_k^\ast$, we have

$\log P(t) \approx \log P(t_k^\ast) - \frac{\sigma_k}{2} \left(t - t_k^\ast\right)^2.$

Thus, (14) is approximately equal to

(25) $\int_{t=0}^{\infty} P(t_k^\ast)\, e^{-\sigma_k (t - t_k^\ast)^2 / 2}\, dt = P(t_k^\ast)\, \sqrt{2\pi/\sigma_k}\; \Pr(X_k > 0),$

where $X_k$ has a Normal distribution with mean $t_k^\ast$ and variance $1/\sigma_k$. Note that (25) is a function of $\beta$ since $t_k^\ast$ and $\sigma_k$ depend on $\beta$. We estimate the unknown coefficient vector $\beta$ as the value of $\beta$ that maximizes the function

(26) $\prod_{k=1}^{m} P(t_k^\ast)\, \sqrt{2\pi/\sigma_k}\; \Pr(X_k > 0),$

which is the product of (25) over $k = 1, \ldots, m$. This is equivalent to finding the maximizer of the function

$f(\beta) = \sum_{k=1}^{m} \left[\log P(t_k^\ast) - \tfrac{1}{2} \log \sigma_k + \log \Pr(X_k > 0)\right],$

which is obtained by taking the logarithm of (26) and omitting the terms that do not depend on β.

B Derivation of the “Analytic” Approximation

Without loss of generality, we assume that $I_k = \{1, 2, \ldots, d_k\}$ to simplify the notation in this proof. We have that

$\int_{t=0}^{\infty} \prod_{i \in I_k} \left(1 - e^{-a_{i,k} t}\right) e^{-t}\, dt = \int_{t=0}^{\infty} \left[\prod_{i=1}^{d_k} a_{i,k}\, t \int_{x_i=0}^{1} e^{-a_{i,k} t x_i}\, dx_i\right] e^{-t}\, dt$
$= \left[\prod_{i=1}^{d_k} a_{i,k}\right] \int_{t=0}^{\infty} t^{d_k} e^{-t} \prod_{i=1}^{d_k} \left[\int_{x_i=0}^{1} e^{-a_{i,k} t x_i}\, dx_i\right] dt$
$= \left[\prod_{i=1}^{d_k} a_{i,k}\right] \int_{t=0}^{\infty} t^{d_k} e^{-t} \int_{x_{d_k}=0}^{1} \cdots \int_{x_1=0}^{1} e^{-t \sum_{i=1}^{d_k} a_{i,k} x_i} \prod_{i=1}^{d_k} dx_i\, dt$
$= \left[\prod_{i=1}^{d_k} a_{i,k}\right] \int_{x_{d_k}=0}^{1} \cdots \int_{x_1=0}^{1} \left[\int_{t=0}^{\infty} t^{d_k} e^{-t\left(1 + \sum_{i=1}^{d_k} a_{i,k} x_i\right)}\, dt\right] \prod_{i=1}^{d_k} dx_i$
$= \left[\prod_{i=1}^{d_k} a_{i,k}\right] \int_{x_{d_k}=0}^{1} \cdots \int_{x_1=0}^{1} \frac{1}{\left(\sum_{i=1}^{d_k} a_{i,k} x_i + 1\right)^{d_k+1}} \left[\int_{u=0}^{\infty} u^{d_k} e^{-u}\, du\right] \prod_{i=1}^{d_k} dx_i$
$= \left[\prod_{i=1}^{d_k} a_{i,k}\right] \Gamma(d_k + 1) \int_{x_{d_k}=0}^{1} \cdots \int_{x_1=0}^{1} \frac{1}{\left(\sum_{i=1}^{d_k} a_{i,k} x_i + 1\right)^{d_k+1}} \prod_{i=1}^{d_k} dx_i.$

We approximate the term in the square brackets by

$\int_{x_{d_k}=0}^{1} \cdots \int_{x_1=0}^{1} e^{-(d_k+1) \sum_{i=1}^{d_k} a_{i,k} x_i} \prod_{i=1}^{d_k} dx_i = \frac{1}{(d_k+1)^{d_k}} \prod_{i=1}^{d_k} \frac{1 - e^{-(d_k+1) a_{i,k}}}{a_{i,k}}.$

Using this approximation, we have

$\int_{t=0}^{\infty} \prod_{i \in I_k} \left(1 - e^{-a_{i,k} t}\right) e^{-t}\, dt \approx \left[\prod_{i=1}^{d_k} a_{i,k}\right] \Gamma(d_k + 1)\, \frac{1}{(d_k+1)^{d_k}} \prod_{i=1}^{d_k} \frac{1 - e^{-(d_k+1) a_{i,k}}}{a_{i,k}}$
$\approx \sqrt{2\pi}\, d_k^{d_k + 1/2} e^{-d_k} \left[\prod_{i=1}^{d_k} a_{i,k}\right] \frac{1}{(d_k+1)^{d_k}} \prod_{i=1}^{d_k} \frac{1 - e^{-(d_k+1) a_{i,k}}}{a_{i,k}} \quad \text{(by Stirling's formula)}$
$= \sqrt{2\pi d_k}\, e^{-d_k} \left[\left(1 + 1/d_k\right)^{d_k}\right]^{-1} \prod_{i=1}^{d_k} \left[1 - e^{-(d_k+1) a_{i,k}}\right]$
$\approx \sqrt{2\pi d_k}\, e^{-d_k} e^{-1} \prod_{i=1}^{d_k} \left[1 - e^{-(d_k+1) a_{i,k}}\right] \quad \text{(using the approximation } (1 + 1/d_k)^{d_k} \approx e\text{)}$
$= \sqrt{2\pi d_k}\, e^{-(d_k+1)} \prod_{i=1}^{d_k} \left[1 - e^{-(d_k+1) a_{i,k}}\right] = \sqrt{2\pi d_k}\, e^{-(d_k+1)} \prod_{i \in I_k} \left[1 - e^{-(d_k+1) a_{i,k}}\right].$

References

Allison, P. D. (1982). Discrete-time methods for the analysis of event histories. Sociological Methodology, 13:61–98. doi:10.2307/270718.

Bild, D. E., Bluemke, D. A., Burke, G. L., Detrano, R., Diez Roux, A. V., Folsom, A. R., et al. (2002). Multi-ethnic study of atherosclerosis: objectives and design. American Journal of Epidemiology, 156:871–881. doi:10.1093/aje/kwf113.

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34:187–202. doi:10.1111/j.2517-6161.1972.tb00899.x.

Davis, P., and Rabinowitz, P. (1984). Methods of Numerical Integration. Mineola, New York: Dover Publications. doi:10.1016/B978-0-12-206360-2.50012-1.

DeLong, D., Guirguis, G., and So, Y. (1994). Efficient computation of subset selection probabilities with application to Cox regression. Biometrika, 81:607–611. doi:10.1093/biomet/81.3.607.

Erdelyi, A. (1956). Asymptotic Expansions. New York: Dover Publications. doi:10.21236/AD0055660.

Fried, L., Borhani, N., Enright, P., Furberg, C., Gardin, J., Kronmal, R., et al. (1991). The Cardiovascular Health Study: Design and rationale. Annals of Epidemiology, 1:263–276. doi:10.1016/1047-2797(91)90005-W.

Fried, L., Tangen, C., Walston, J., Newman, A., Hirsch, C., Gottdiener, J., et al. (2001). Frailty in older adults: Evidence for a phenotype. The Journals of Gerontology Series A: Biological Sciences and Medical Sciences, 56:M146–M157. doi:10.1093/gerona/56.3.M146.

Hertz-Picciotto, I., and Rockhill, B. (1997). Validity and efficiency of approximation methods for tied survival times in Cox regression. Biometrics, 53:1151–1156. doi:10.2307/2533573.

Kalbfleisch, J. D., and Prentice, R. L. (2002). The Statistical Analysis of Failure Time Data. Hoboken, New Jersey: John Wiley & Sons. doi:10.1002/9781118032985.

Onder, G., Penninx, B. W., Ferrucci, L., Fried, L. P., Guralnik, J. M., and Pahor, M. (2005). Measures of physical performance and risk for progressive and catastrophic disability: results from the Women’s Health and Aging Study. The Journals of Gerontology Series A: Biological Sciences and Medical Sciences, 60:74–79. doi:10.1093/gerona/60.1.74.

Prentice, R., and Gloeckler, L. (1978). Regression analysis of grouped survival data with application to breast cancer data. Biometrics, 34:57–67. doi:10.2307/2529588.

Scheike, T., and Sun, Y. (2007). Maximum likelihood estimation for tied survival data under Cox regression model via EM-algorithm. Lifetime Data Analysis, 13:399–420. doi:10.1007/s10985-007-9043-3.

Therneau, T., and Grambsch, P. (2010). Modeling Survival Data: Extending the Cox Model. New York: Springer-Verlag.


Supplementary Material

The online version of this article offers supplementary material (DOI:https://doi.org/10.1515/em-2018-0011).


Received: 2018-06-10
Revised: 2019-05-28
Accepted: 2019-06-18
Published Online: 2019-08-02

© 2019 Walter de Gruyter GmbH, Berlin/Boston
