Abstract
Interval estimation of the proportion parameter in the analysis of binary outcome data arising in cluster studies is often an important problem in many biomedical applications. In this paper, we propose two approaches based on the profile likelihood and Wilson score. We compare them with two existing methods recommended for complex survey data and some other methods that are simple extensions of well-known methods such as the likelihood, the generalized estimating equation of Zeger and Liang and the ratio estimator approach of Rao and Scott. An extensive simulation study is conducted for a variety of parameter combinations for the purposes of evaluating and comparing the performance of these methods in terms of coverage and expected lengths. Applications to biomedical data are used to illustrate the proposed methods.
1 Introduction
Binary outcome data sampled from clusters arise frequently in biomedical, toxicological, clinical, and epidemiological applications. Such data often exhibit greater or lesser variability than that predicted by a simple binomial model, a phenomenon referred to as over- or under-dispersion [1]. Several mechanisms can produce it. For instance, in a case-control study of familial aggregation of chronic obstructive pulmonary disease (COPD) [2], siblings within each family are correlated and the number of impaired pulmonary function (IPF) cases per family may be over-dispersed relative to a binomial model. The 95% profile likelihood-based confidence interval for the intraclass correlation proposed by Saha [3], computed from the data of that study (see Table 5 of Liang et al. [2]), is (0.0593, 0.4006), which supports a significant within-family correlation. Correspondingly, the observed variance 0.1418 in the estimated proportion of IPF cases per family is 1.38 times the predicted variance 0.1026 obtained under a binomial model. Standard approaches to analyzing such data that ignore the cluster structure (see, for example, Hogg and Tanis [4], pp. 308–310) may therefore underestimate the true standard error of the estimated IPF rate when the correlation between siblings within a family is positive. Furthermore, inference about the parameters of interest based on the binomial model may significantly inflate the Type I error rate in such data [5]. Although a number of confidence intervals for a single proportion based on clustered data have been studied for complex survey data, little attention has been paid to model-based approaches to inference about the proportion in the analysis of clustered binary data.
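To make the variance inflation concrete, the following minimal sketch simulates clustered binary counts from a beta-binomial model and compares their empirical variance with the binomial prediction. The names `pi` and `phi` (proportion and intraclass correlation), the parameter values, and the helper functions are illustrative assumptions, not quantities from the COPD study. The sketch relies on the standard beta-binomial parameterization a = π(1 − φ)/φ, b = (1 − π)(1 − φ)/φ, under which the variance of a cluster total of size n is inflated by the factor 1 + (n − 1)φ.

```python
import random
import statistics


def simulate_cluster_counts(num_clusters, cluster_size, pi, phi, rng):
    """Draw beta-binomial counts: p_i ~ Beta(a, b), y_i | p_i ~ Binomial(n, p_i)."""
    a = pi * (1.0 - phi) / phi          # Beta shape parameters implied by (pi, phi)
    b = (1.0 - pi) * (1.0 - phi) / phi
    counts = []
    for _ in range(num_clusters):
        p_i = rng.betavariate(a, b)     # cluster-specific success probability
        counts.append(sum(rng.random() < p_i for _ in range(cluster_size)))
    return counts


def variance_inflation(cluster_size, phi):
    """Theoretical inflation of the binomial variance under common correlation phi."""
    return 1.0 + (cluster_size - 1) * phi


# Hypothetical settings chosen only for illustration.
rng = random.Random(2024)
n, pi, phi = 5, 0.3, 0.2
counts = simulate_cluster_counts(20000, n, pi, phi, rng)

empirical_var = statistics.pvariance(counts)
binomial_var = n * pi * (1.0 - pi)
ratio = empirical_var / binomial_var

print("empirical / binomial variance:", round(ratio, 2))
print("theoretical inflation factor :", variance_inflation(n, phi))
```

The same kind of ratio underlies the figure quoted above for the COPD data, where 0.1418/0.1026 ≈ 1.38.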
Kleinman [6] studied the properties of the maximum likelihood (ML) and the method of moments (MM) estimators for the proportion parameter based on parametric and semiparametric models for clustered binary data. Based on several model structures, Paul and Islam [7] investigated the joint estimation of the proportion and dispersion parameters in terms of bias and efficiency. Surprisingly, these approaches were not extended to investigate the coverage probabilities of confidence interval estimation of the proportion.
Developing a confidence interval for the proportion parameter using clustered binary data is an important problem. For example, in assessing the diagnostic accuracy of contrast-enhanced multi-detector row spiral computed tomography coronary angiography [8], interval estimates of sensitivity (true positive rate) and specificity (true negative rate) are often used at the patient level, at the coronary artery level, and at the coronary artery segment level. Another example involves the estimation of sensitivity and specificity to assess the accuracy of radiologists’ readings in a mammogram screening study [9], where the proportion of positive readings in cancer cases and the proportion of negative readings in non-cancer cases within each radiologist may be over-dispersed. To make inferences on sensitivity and specificity, one usually uses the inference methodology developed for a single proportion. There is an abundance of literature on inference for a single proportion based on non-clustered binary data (for example, Agresti and Coull [10] and Newcombe [11]). However, little attention has been paid to such inference for clustered binary data with small to moderate sample sizes using parametric and semiparametric models. For complex survey data, some authors have developed alternative methods extending those derived for non-clustered binary data. In particular, the modified Clopper-Pearson (MCP) method and the modified Wilson score (MWS) method are recommended for analyzing complex survey data (see Korn and Graubard [12], page 65). However, in some situations the MCP method is somewhat conservative, while the MWS method under-covers when the variability of the weights is small or large. Moreover, the sampling weights are not readily available in biomedical applications, although they are required to calculate the effective sample size on which these intervals depend.
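For orientation, the Wilson score interval for non-clustered data and a simplified effective-sample-size adjustment can be sketched as follows. The design effect is taken here as a user-supplied number (an assumption; in survey practice it is estimated from the design and the weights), and the degrees-of-freedom refinements used in the survey literature are omitted, so this is an illustration of the idea rather than the exact MWS procedure of Korn and Graubard [12]. The function names and input values are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal 97.5% quantile


def wilson_interval(p_hat, n, z=Z_95):
    """Wilson score interval for a proportion from n independent binary outcomes."""
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n))
    return center - half, center + half


def effective_sample_wilson(p_hat, n, design_effect, z=Z_95):
    """Wilson interval evaluated at the effective sample size n / design_effect.

    Simplified sketch: no degrees-of-freedom correction is applied.
    """
    n_eff = n / design_effect
    return wilson_interval(p_hat, n_eff, z)


# Hypothetical inputs: 100 observations, estimated proportion 0.3, design effect 1.5.
plain = wilson_interval(0.3, 100)
adjusted = effective_sample_wilson(0.3, 100, design_effect=1.5)

print("independent data  :", tuple(round(x, 4) for x in plain))
print("design effect 1.5 :", tuple(round(x, 4) for x in adjusted))
```

A design effect above 1 shrinks the effective sample size and therefore widens the interval, which is the mechanism by which such methods account for positive within-cluster correlation.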
Lui [13] derived three methods based on model assumptions using the estimation of the intraclass correlation by analysis of variance, but the performances of these methods were not examined. Rutter [14] introduced bootstrap interval methods for sensitivity and specificity to measure the diagnostic accuracy with patient-clustered data using bootstrapping to estimate the variance, but these methods are computationally demanding. For the balanced data set-up, Kim and Lee [9] proposed an asymptotic confidence interval for a single proportion based on the beta-binomial distribution which works well for larger proportions when the cluster sizes are over 25, but suffers from serious under-coverage for small numbers of clusters. In many practical problems, the cluster sizes are unequal and small (< 15) or the number of clusters is small to moderate (see, for example, Zhou et al. [8], page 112). In order to assess the accuracy of computer-aided detection enhanced computed tomography colonography for the detection of polyps, Zhou et al. [8] obtained the asymptotic confidence interval for sensitivity of clustered binary data using a ratio estimator for the variance given by Rao and Scott [5]. However, this method shows serious under-coverage (see Figure 1 and Paul and Zaihra [15], p. 4219).

Figure 1. The observed coverage probability and the expected interval length of 95% nominal confidence intervals for the proportion
The main focus of this paper is to develop asymptotic confidence intervals for a single proportion arising in cluster studies. In particular, in Section 3 we propose two new approaches based on the Wilson score and profile likelihood that will properly incorporate the intraclass correlation structure. In addition, we consider a number of extensions of existing methods in order for them to be feasible for clustered binary outcome data. Section 4 conducts a simulation study to assess the performances of these intervals in comparison with two existing methods recommended for analyzing survey data in terms of coverage and interval length. The methods developed in this paper are applied to analyze medical data sets in Section 5. Some concluding remarks are given in Section 6.
2 Models
2.1 Beta-binomial model
Let
for
2.2 Semi-parametric models
In some situations, the full parametric assumption may be too restrictive in which case a more flexible model can be used that only specifies the mean and variance of the data distribution (see Paul and Islam [17]). Let
3 Confidence intervals for the single proportion
Since we will deal with and compare a large number of methods for constructing confidence intervals (CI), we now list all of them and their acronyms for easy reference:
Acronym | Method |
---|---|
PL | profile likelihood |
WI1, WI2 | two versions of the Wilson score modified for clustered binary data |
WA1, WA2 | two versions of the Wald CI |
R1, R2 | two versions of the ratio estimator |
G1, G2 | two versions of the generalized estimating equation |
ML | maximum likelihood |
EQL | extended quasi-likelihood |
DEQL | double extended quasi-likelihood |
QEE | quadratic estimating equations |
MCP | modified Clopper-Pearson method for complex survey data |
MWS | modified Wilson score for complex survey data |
3.1 The CI based on PL
Altaye et al. [19] noted that the obvious approach to constructing a confidence interval for the parameter of interest may perform poorly when the true value is extreme or the sample size is small. Here we use a profile likelihood based confidence interval, an approach which has been shown to provide accurate results when computing confidence limits for a single proportion [11] or the difference between two proportions [20] in the case of non-clustered binary data. Let
where
3.2 The Wilson score interval
From the above semi-parametric model, an estimator of
After some straightforward algebra, it can be obtained as
where
and
3.3 Extensions of other methods
We now consider some additional approaches that extend existing methods:
The Wald CIs: From the above, we see that the sample proportion
RE based CIs: Following the result provided by Rao and Scott [5], one can obtain the corrected estimated variance of
Note that
GEE based CIs: Paul and Zaihra [15] applied the generalized estimating equation (GEE) approach of Zeger and Liang [21] to basic binary data and obtained an estimate of a proportion from clustered correlated binary data and a sandwich estimate of its variance, which are given by
respectively. As
ML based CI: The log-likelihood of the beta-binomial model, apart from a constant, can be written as
The ML estimators
EQL based CI: Paul and Saha [22] obtained the EQL using the mean and variance of
Now, using
DEQL based CI: Based on the semi-parametric model, we obtain the profile double extended quasi-likelihood from Paul and Saha [22], which, apart from a constant, is
with
QEE based CI: The QEE estimates
4 Simulation studies
The primary goal of our simulations was to provide guidance in the selection of an appropriate confidence interval for the proportion, based on clustered binary data, by assessing the performance of the 15 intervals considered in this paper in terms of the observed coverage probability and the average interval length at the pre-assigned confidence levels of 90% and 95%. In some cases, there are only a few clusters with unequal and variable cluster sizes. For example, in multicenter clinical trials or studies that validate assay sensitivity and specificity across several labs there may only be a few clusters (few clinical centers, few labs) with large cluster sizes. In some other applications, e.g., in ecology and parasitology, the clusters could be single animals with very few repeated observations available, i.e., cluster sizes are very small [16]. Based on these considerations, we considered six different configurations of cluster sizes with different numbers of clusters: (i) fixed cluster sizes (
where
The coverage properties and expected lengths for all the methods are almost identical for both the 90% and 95% pre-assigned confidence levels. We present simulation results only for the pre-assigned confidence level of 95%; the results for the 90% nominal confidence level are provided in the Supplementary Materials (see Figures W2–W6 and Tables W5–W8). The box plots and the medians of the observed CPs and the ELs obtained from the 15 interval procedures for the 180 (6 values of k, 10 values of
Table 1. Median coverage probability (CP) and median expected length (EL) of the 95% confidence intervals for
Method | Median CP | Median EL | Length ratio (method/ML) |
---|---|---|---|
PL | 0.943 | 0.156 | 0.971 |
WI1 | 0.950 | 0.167 | 1.043 |
WI2 | 0.951 | 0.164 | 1.025 |
WA1 | 0.943 | 0.168 | 1.051 |
WA2 | 0.942 | 0.165 | 1.031 |
R1 | 0.941 | 0.166 | 1.039 |
R2 | 0.944 | 0.166 | 1.039 |
G1 | 0.937 | 0.165 | 1.030 |
G2 | 0.942 | 0.165 | 1.030 |
ML | 0.941 | 0.160 | 1.000 |
EQL | 0.974 | 0.197 | 1.227 |
DEQL | 0.968 | 0.191 | 1.195 |
QEE | 0.944 | 0.188 | 1.175 |
MCP | 0.965 | 0.179 | 1.119 |
MWS | 0.955 | 0.170 | 1.063 |
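Coverage probabilities and expected lengths of the kind summarized above are estimated by Monte Carlo. The following minimal sketch shows the mechanics for one simple interval: a Wald interval built on the Rao and Scott [5] ratio-estimator variance, with data generated from a beta-binomial model with equal cluster sizes. The function names, the parameter names `pi` and `phi`, the settings, and the number of replications are illustrative assumptions; the interval is not claimed to coincide with the R1 or R2 versions studied here, and the simulation design of this paper is not reproduced.

```python
import math
import random

Z_95 = 1.959964  # standard normal 97.5% quantile


def draw_clusters(num_clusters, cluster_size, pi, phi, rng):
    """Generate (y_i, n_i) pairs from a beta-binomial model with mean pi, correlation phi."""
    a = pi * (1.0 - phi) / phi
    b = (1.0 - pi) * (1.0 - phi) / phi
    data = []
    for _ in range(num_clusters):
        p_i = rng.betavariate(a, b)
        y_i = sum(rng.random() < p_i for _ in range(cluster_size))
        data.append((y_i, cluster_size))
    return data


def ratio_wald_interval(data, z=Z_95):
    """Wald interval using the ratio-estimator variance m/(m-1) * sum(r_i^2) / n^2."""
    m = len(data)
    total_n = sum(n for _, n in data)
    p_hat = sum(y for y, _ in data) / total_n
    resid_ss = sum((y - n * p_hat) ** 2 for y, n in data)   # r_i = y_i - n_i * p_hat
    var_hat = m / (m - 1) * resid_ss / total_n ** 2
    half = z * math.sqrt(var_hat)
    return p_hat - half, p_hat + half


def coverage_and_length(pi, phi, num_clusters, cluster_size, reps, seed):
    """Monte Carlo estimates of coverage probability and expected interval length."""
    rng = random.Random(seed)
    hits, total_length = 0, 0.0
    for _ in range(reps):
        lo, hi = ratio_wald_interval(
            draw_clusters(num_clusters, cluster_size, pi, phi, rng)
        )
        hits += lo <= pi <= hi
        total_length += hi - lo
    return hits / reps, total_length / reps


# Hypothetical configuration: 20 clusters of size 5.
cp, el = coverage_and_length(pi=0.25, phi=0.3, num_clusters=20,
                             cluster_size=5, reps=2000, seed=1)
print(f"coverage = {cp:.3f}, expected length = {el:.3f}")
```

Repeating such a run over a grid of cluster counts, proportions, and correlations, and taking medians across the grid, gives summaries of the same form as the table above.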
We now evaluate the methods stratified by points of the parameter space. The results above show that EQL, DEQL, and MCP are somewhat conservative, and that the CPs for G2 and R2 are closer to the nominal level than those for G1 and R1. MCP, however, is less conservative than EQL and DEQL. In the interest of brevity, we present the observed CP and EL results only for the 11 generally more competitive methods, namely, PL, WI1, WI2, WA1, WA2, R2, G2, ML, QEE, MCP, and MWS. Since the results for fixed cluster sizes between

Figure 2. The observed coverage probability of 95% nominal confidence intervals for the proportion

Figure 3. The observed coverage probability of 95% nominal confidence intervals for the proportion

Figure 4. The expected interval length of 95% nominal confidence intervals for the proportion

Figure 5. The expected interval length of 95% nominal confidence intervals for the proportion
Table 2. The coverage probability estimates based on confidence intervals by the methods with nominal level,
k | | | PL | WI1 | WI2 | WA1 | WA2 | R2 | G2 | ML | QEE | MCP | MWS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Fixed litter sizes | |||||||||||||
19 | 0.1 | 0.05 | 0.957 | 0.959 | 0.960 | 0.978 | 0.979 | 0.978 | 0.974 | 0.984 | 0.986 | 0.989 | 0.969 |
| | | 0.15 | 0.943 | 0.965 | 0.963 | 0.955 | 0.953 | 0.966 | 0.960 | 0.948 | 0.962 | 0.982 | 0.970 |
| | | 0.25 | 0.932 | 0.952 | 0.947 | 0.945 | 0.936 | 0.951 | 0.944 | 0.943 | 0.944 | 0.970 | 0.959 |
| | | 0.35 | 0.950 | 0.949 | 0.939 | 0.943 | 0.934 | 0.945 | 0.939 | 0.955 | 0.956 | 0.968 | 0.957 |
| | | 0.45 | 0.940 | 0.953 | 0.943 | 0.946 | 0.938 | 0.946 | 0.940 | 0.947 | 0.945 | 0.968 | 0.958 |
| | 0.3 | 0.05 | 0.951 | 0.961 | 0.967 | 0.961 | 0.969 | 0.989 | 0.986 | 0.990 | 0.992 | 0.990 | 0.967 |
| | | 0.15 | 0.946 | 0.962 | 0.964 | 0.943 | 0.939 | 0.957 | 0.950 | 0.922 | 0.940 | 0.981 | 0.966 |
| | | 0.25 | 0.941 | 0.951 | 0.944 | 0.934 | 0.929 | 0.938 | 0.931 | 0.914 | 0.928 | 0.970 | 0.956 |
| | | 0.35 | 0.945 | 0.944 | 0.940 | 0.934 | 0.927 | 0.932 | 0.926 | 0.933 | 0.940 | 0.971 | 0.958 |
| | | 0.45 | 0.932 | 0.948 | 0.942 | 0.938 | 0.932 | 0.937 | 0.931 | 0.941 | 0.944 | 0.967 | 0.956 |
| | 0.5 | 0.05 | 0.947 | 0.954 | 0.964 | 0.944 | 0.961 | 0.990 | 0.988 | 0.994 | 0.996 | 0.992 | 0.961 |
| | | 0.15 | 0.949 | 0.958 | 0.962 | 0.934 | 0.933 | 0.946 | 0.941 | 0.916 | 0.940 | 0.979 | 0.961 |
| | | 0.25 | 0.944 | 0.949 | 0.951 | 0.925 | 0.926 | 0.929 | 0.923 | 0.899 | 0.917 | 0.975 | 0.961 |
| | | 0.35 | 0.947 | 0.950 | 0.948 | 0.935 | 0.932 | 0.936 | 0.928 | 0.935 | 0.948 | 0.972 | 0.960 |
| | | 0.45 | 0.949 | 0.949 | 0.948 | 0.933 | 0.933 | 0.931 | 0.927 | 0.932 | 0.937 | 0.972 | 0.958 |
73 | 0.1 | 0.05 | 0.937 | 0.954 | 0.954 | 0.944 | 0.945 | 0.951 | 0.950 | 0.953 | 0.971 | 0.964 | 0.954 |
| | | 0.15 | 0.935 | 0.947 | 0.942 | 0.945 | 0.942 | 0.943 | 0.941 | 0.940 | 0.951 | 0.953 | 0.946 |
| | | 0.25 | 0.939 | 0.949 | 0.945 | 0.949 | 0.943 | 0.945 | 0.944 | 0.941 | 0.943 | 0.957 | 0.951 |
| | | 0.35 | 0.941 | 0.950 | 0.944 | 0.948 | 0.943 | 0.945 | 0.943 | 0.943 | 0.943 | 0.955 | 0.949 |
| | | 0.45 | 0.948 | 0.958 | 0.952 | 0.955 | 0.950 | 0.952 | 0.951 | 0.966 | 0.968 | 0.961 | 0.956 |
| | 0.3 | 0.05 | 0.934 | 0.951 | 0.953 | 0.935 | 0.933 | 0.942 | 0.941 | 0.970 | 0.983 | 0.962 | 0.948 |
| | | 0.15 | 0.949 | 0.950 | 0.951 | 0.945 | 0.945 | 0.945 | 0.944 | 0.922 | 0.941 | 0.959 | 0.949 |
| | | 0.25 | 0.931 | 0.953 | 0.952 | 0.948 | 0.946 | 0.945 | 0.943 | 0.934 | 0.946 | 0.958 | 0.950 |
| | | 0.35 | 0.944 | 0.951 | 0.952 | 0.949 | 0.951 | 0.945 | 0.943 | 0.942 | 0.944 | 0.959 | 0.954 |
| | | 0.45 | 0.946 | 0.956 | 0.953 | 0.952 | 0.951 | 0.948 | 0.946 | 0.967 | 0.967 | 0.959 | 0.953 |
| | 0.5 | 0.05 | 0.951 | 0.954 | 0.961 | 0.945 | 0.943 | 0.949 | 0.947 | 0.977 | 0.989 | 0.971 | 0.954 |
| | | 0.15 | 0.941 | 0.950 | 0.952 | 0.940 | 0.944 | 0.939 | 0.938 | 0.915 | 0.936 | 0.962 | 0.950 |
| | | 0.25 | 0.933 | 0.952 | 0.958 | 0.946 | 0.950 | 0.944 | 0.943 | 0.933 | 0.950 | 0.963 | 0.954 |
| | | 0.35 | 0.939 | 0.951 | 0.956 | 0.946 | 0.952 | 0.943 | 0.942 | 0.936 | 0.939 | 0.960 | 0.950 |
| | | 0.45 | 0.943 | 0.953 | 0.957 | 0.950 | 0.953 | 0.946 | 0.944 | 0.962 | 0.965 | 0.961 | 0.952 |
ED litter sizes | |||||||||||||
20 | 0.1 | 0.05 | 0.962 | 0.965 | 0.967 | 0.972 | 0.975 | 0.984 | 0.980 | 0.981 | 0.986 | 0.991 | 0.974 |
| | | 0.15 | 0.939 | 0.955 | 0.951 | 0.943 | 0.941 | 0.954 | 0.946 | 0.940 | 0.950 | 0.974 | 0.962 |
| | | 0.25 | 0.936 | 0.947 | 0.941 | 0.940 | 0.935 | 0.943 | 0.938 | 0.940 | 0.941 | 0.967 | 0.960 |
| | | 0.35 | 0.930 | 0.944 | 0.938 | 0.937 | 0.931 | 0.941 | 0.936 | 0.943 | 0.944 | 0.966 | 0.956 |
| | | 0.45 | 0.932 | 0.946 | 0.938 | 0.942 | 0.934 | 0.943 | 0.938 | 0.951 | 0.951 | 0.963 | 0.955 |
| | 0.3 | 0.05 | 0.947 | 0.958 | 0.967 | 0.954 | 0.961 | 0.986 | 0.982 | 0.990 | 0.995 | 0.990 | 0.966 |
| | | 0.15 | 0.943 | 0.959 | 0.962 | 0.935 | 0.934 | 0.947 | 0.941 | 0.905 | 0.924 | 0.980 | 0.966 |
| | | 0.25 | 0.949 | 0.945 | 0.944 | 0.934 | 0.931 | 0.938 | 0.932 | 0.913 | 0.926 | 0.968 | 0.956 |
| | | 0.35 | 0.953 | 0.938 | 0.938 | 0.927 | 0.926 | 0.929 | 0.923 | 0.927 | 0.933 | 0.962 | 0.951 |
| | | 0.45 | 0.942 | 0.944 | 0.941 | 0.935 | 0.934 | 0.937 | 0.930 | 0.944 | 0.949 | 0.966 | 0.954 |
| | 0.5 | 0.05 | 0.943 | 0.953 | 0.963 | 0.956 | 0.961 | 0.982 | 0.978 | 0.993 | 0.997 | 0.989 | 0.959 |
| | | 0.15 | 0.950 | 0.957 | 0.965 | 0.935 | 0.936 | 0.942 | 0.938 | 0.897 | 0.925 | 0.981 | 0.966 |
| | | 0.25 | 0.951 | 0.943 | 0.947 | 0.921 | 0.923 | 0.926 | 0.921 | 0.901 | 0.918 | 0.970 | 0.957 |
| | | 0.35 | 0.952 | 0.943 | 0.944 | 0.928 | 0.932 | 0.931 | 0.925 | 0.934 | 0.948 | 0.969 | 0.953 |
| | | 0.45 | 0.953 | 0.941 | 0.947 | 0.930 | 0.934 | 0.932 | 0.926 | 0.938 | 0.944 | 0.971 | 0.957 |
50 | 0.1 | 0.05 | 0.952 | 0.952 | 0.956 | 0.940 | 0.942 | 0.953 | 0.951 | 0.960 | 0.978 | 0.967 | 0.956 |
| | | 0.15 | 0.938 | 0.949 | 0.947 | 0.945 | 0.944 | 0.947 | 0.945 | 0.943 | 0.952 | 0.959 | 0.951 |
| | | 0.25 | 0.939 | 0.948 | 0.945 | 0.945 | 0.941 | 0.944 | 0.942 | 0.941 | 0.943 | 0.959 | 0.951 |
| | | 0.35 | 0.939 | 0.946 | 0.943 | 0.943 | 0.939 | 0.943 | 0.941 | 0.941 | 0.942 | 0.955 | 0.948 |
| | | 0.45 | 0.943 | 0.946 | 0.944 | 0.945 | 0.942 | 0.944 | 0.941 | 0.960 | 0.961 | 0.957 | 0.950 |
| | 0.3 | 0.05 | 0.954 | 0.958 | 0.967 | 0.933 | 0.936 | 0.950 | 0.949 | 0.981 | 0.989 | 0.973 | 0.959 |
| | | 0.15 | 0.951 | 0.944 | 0.948 | 0.935 | 0.936 | 0.939 | 0.936 | 0.915 | 0.930 | 0.958 | 0.948 |
| | | 0.25 | 0.934 | 0.952 | 0.954 | 0.946 | 0.946 | 0.946 | 0.944 | 0.933 | 0.944 | 0.964 | 0.956 |
| | | 0.35 | 0.941 | 0.951 | 0.952 | 0.947 | 0.948 | 0.946 | 0.944 | 0.941 | 0.943 | 0.961 | 0.953 |
| | | 0.45 | 0.945 | 0.952 | 0.954 | 0.948 | 0.951 | 0.946 | 0.944 | 0.964 | 0.966 | 0.962 | 0.955 |
| | 0.5 | 0.05 | 0.950 | 0.957 | 0.968 | 0.937 | 0.938 | 0.946 | 0.945 | 0.983 | 0.991 | 0.977 | 0.958 |
| | | 0.15 | 0.947 | 0.949 | 0.955 | 0.933 | 0.939 | 0.937 | 0.934 | 0.906 | 0.930 | 0.966 | 0.954 |
| | | 0.25 | 0.948 | 0.949 | 0.955 | 0.938 | 0.945 | 0.940 | 0.937 | 0.927 | 0.943 | 0.965 | 0.954 |
| | | 0.35 | 0.938 | 0.949 | 0.956 | 0.940 | 0.950 | 0.940 | 0.937 | 0.937 | 0.940 | 0.962 | 0.954 |
| | | 0.45 | 0.937 | 0.946 | 0.952 | 0.942 | 0.947 | 0.942 | 0.939 | 0.955 | 0.959 | 0.961 | 0.952 |
The results in Figures 2 and 3 and Table 2 show that, irrespective of fixed or variable cluster sizes, the coverage properties of all the methods are very similar. As expected, the variations of CPs for different values of
As expected, we see from the results in Figures 4 and 5 and Table 3 that the ELs for all the methods decrease as k increases. In addition, the ELs for all the methods increase when the risk rate, as well as the deviation from independence among observations within the same cluster, increase. For instance, ranges for the ELs in Figure 4(v) for
Table 3. The expected interval lengths based on confidence intervals by the methods with nominal level,
k | | | PL | WI1 | WI2 | WA1 | WA2 | R2 | G2 | ML | QEE | MCP | MWS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Fixed litter sizes | |||||||||||||
19 | 0.1 | 0.05 | 0.105 | 0.113 | 0.111 | 0.105 | 0.104 | 0.104 | 0.101 | 0.122 | 0.211 | 0.128 | 0.119 |
| | | 0.15 | 0.150 | 0.164 | 0.158 | 0.165 | 0.158 | 0.161 | 0.157 | 0.159 | 0.253 | 0.193 | 0.178 |
| | | 0.25 | 0.181 | 0.195 | 0.186 | 0.199 | 0.190 | 0.194 | 0.188 | 0.191 | 0.213 | 0.229 | 0.212 |
| | | 0.35 | 0.199 | 0.213 | 0.203 | 0.218 | 0.208 | 0.213 | 0.207 | 0.209 | 0.210 | 0.251 | 0.233 |
| | | 0.45 | 0.209 | 0.222 | 0.211 | 0.228 | 0.216 | 0.222 | 0.216 | 0.224 | 0.225 | 0.261 | 0.243 |
| | 0.3 | 0.05 | 0.133 | 0.147 | 0.144 | 0.134 | 0.131 | 0.131 | 0.127 | 0.148 | 0.183 | 0.163 | 0.151 |
| | | 0.15 | 0.183 | 0.213 | 0.204 | 0.214 | 0.205 | 0.208 | 0.203 | 0.191 | 0.327 | 0.242 | 0.220 |
| | | 0.25 | 0.227 | 0.251 | 0.240 | 0.259 | 0.247 | 0.252 | 0.245 | 0.235 | 0.255 | 0.285 | 0.259 |
| | | 0.35 | 0.256 | 0.275 | 0.262 | 0.286 | 0.272 | 0.278 | 0.271 | 0.261 | 0.269 | 0.312 | 0.284 |
| | | 0.45 | 0.270 | 0.287 | 0.273 | 0.300 | 0.285 | 0.291 | 0.283 | 0.275 | 0.278 | 0.325 | 0.296 |
| | 0.5 | 0.05 | 0.166 | 0.179 | 0.175 | 0.163 | 0.157 | 0.160 | 0.156 | 0.172 | 0.205 | 0.199 | 0.182 |
| | | 0.15 | 0.214 | 0.253 | 0.244 | 0.255 | 0.246 | 0.250 | 0.243 | 0.222 | 0.298 | 0.286 | 0.257 |
| | | 0.25 | 0.264 | 0.296 | 0.286 | 0.309 | 0.297 | 0.303 | 0.295 | 0.273 | 0.317 | 0.336 | 0.301 |
| | | 0.35 | 0.299 | 0.324 | 0.312 | 0.342 | 0.329 | 0.336 | 0.327 | 0.306 | 0.318 | 0.367 | 0.329 |
| | | 0.45 | 0.316 | 0.336 | 0.325 | 0.358 | 0.344 | 0.351 | 0.342 | 0.321 | 0.329 | 0.381 | 0.342 |
73 | 0.1 | 0.05 | 0.040 | 0.045 | 0.044 | 0.044 | 0.043 | 0.043 | 0.043 | 0.049 | 0.143 | 0.048 | 0.045 |
| | | 0.15 | 0.067 | 0.072 | 0.070 | 0.072 | 0.070 | 0.071 | 0.070 | 0.069 | 0.076 | 0.076 | 0.074 |
| | | 0.25 | 0.083 | 0.087 | 0.085 | 0.088 | 0.085 | 0.086 | 0.085 | 0.084 | 0.085 | 0.092 | 0.089 |
| | | 0.35 | 0.093 | 0.096 | 0.094 | 0.097 | 0.094 | 0.095 | 0.094 | 0.093 | 0.093 | 0.101 | 0.098 |
| | | 0.45 | 0.097 | 0.100 | 0.098 | 0.101 | 0.098 | 0.099 | 0.098 | 0.099 | 0.100 | 0.106 | 0.103 |
| | 0.3 | 0.05 | 0.051 | 0.064 | 0.063 | 0.062 | 0.061 | 0.061 | 0.060 | 0.061 | 0.070 | 0.067 | 0.063 |
| | | 0.15 | 0.088 | 0.103 | 0.100 | 0.103 | 0.100 | 0.100 | 0.100 | 0.091 | 0.362 | 0.106 | 0.101 |
| | | 0.25 | 0.112 | 0.124 | 0.120 | 0.125 | 0.121 | 0.122 | 0.121 | 0.115 | 0.121 | 0.128 | 0.122 |
| | | 0.35 | 0.128 | 0.137 | 0.133 | 0.138 | 0.134 | 0.135 | 0.134 | 0.129 | 0.132 | 0.140 | 0.135 |
| | | 0.45 | 0.135 | 0.142 | 0.138 | 0.144 | 0.140 | 0.141 | 0.140 | 0.136 | 0.137 | 0.146 | 0.141 |
| | 0.5 | 0.05 | 0.063 | 0.080 | 0.078 | 0.077 | 0.075 | 0.076 | 0.075 | 0.072 | 0.085 | 0.084 | 0.079 |
| | | 0.15 | 0.105 | 0.125 | 0.122 | 0.125 | 0.122 | 0.123 | 0.122 | 0.108 | 0.144 | 0.130 | 0.123 |
| | | 0.25 | 0.134 | 0.151 | 0.148 | 0.152 | 0.149 | 0.150 | 0.149 | 0.137 | 0.154 | 0.156 | 0.148 |
| | | 0.35 | 0.152 | 0.166 | 0.162 | 0.168 | 0.164 | 0.165 | 0.164 | 0.155 | 0.161 | 0.171 | 0.163 |
| | | 0.45 | 0.162 | 0.173 | 0.169 | 0.176 | 0.172 | 0.173 | 0.172 | 0.163 | 0.168 | 0.178 | 0.170 |
ED litter sizes | |||||||||||||
20 | 0.1 | 0.05 | 0.084 | 0.090 | 0.090 | 0.086 | 0.086 | 0.085 | 0.083 | 0.101 | 0.190 | 0.103 | 0.097 |
| | | 0.15 | 0.106 | 0.137 | 0.133 | 0.137 | 0.134 | 0.136 | 0.132 | 0.133 | 0.157 | 0.157 | 0.147 |
| | | 0.25 | 0.143 | 0.164 | 0.159 | 0.166 | 0.161 | 0.164 | 0.160 | 0.162 | 0.164 | 0.188 | 0.176 |
| | | 0.35 | 0.165 | 0.180 | 0.174 | 0.183 | 0.177 | 0.181 | 0.177 | 0.180 | 0.181 | 0.206 | 0.194 |
| | | 0.45 | 0.178 | 0.187 | 0.181 | 0.191 | 0.184 | 0.189 | 0.184 | 0.194 | 0.195 | 0.214 | 0.201 |
| | 0.3 | 0.05 | 0.114 | 0.128 | 0.128 | 0.118 | 0.118 | 0.118 | 0.115 | 0.127 | 0.147 | 0.145 | 0.134 |
| | | 0.15 | 0.164 | 0.192 | 0.187 | 0.193 | 0.188 | 0.192 | 0.187 | 0.172 | 0.247 | 0.218 | 0.200 |
| | | 0.25 | 0.208 | 0.228 | 0.222 | 0.234 | 0.227 | 0.232 | 0.226 | 0.214 | 0.232 | 0.259 | 0.237 |
| | | 0.35 | 0.235 | 0.249 | 0.242 | 0.257 | 0.249 | 0.255 | 0.249 | 0.240 | 0.245 | 0.282 | 0.259 |
| | | 0.45 | 0.249 | 0.260 | 0.252 | 0.269 | 0.261 | 0.268 | 0.261 | 0.252 | 0.256 | 0.294 | 0.271 |
| | 0.5 | 0.05 | 0.147 | 0.164 | 0.162 | 0.151 | 0.147 | 0.151 | 0.147 | 0.153 | 0.182 | 0.188 | 0.172 |
| | | 0.15 | 0.195 | 0.234 | 0.229 | 0.236 | 0.230 | 0.236 | 0.230 | 0.204 | 0.278 | 0.270 | 0.244 |
| | | 0.25 | 0.246 | 0.274 | 0.269 | 0.284 | 0.278 | 0.284 | 0.277 | 0.255 | 0.306 | 0.315 | 0.285 |
| | | 0.35 | 0.280 | 0.300 | 0.294 | 0.314 | 0.308 | 0.315 | 0.307 | 0.287 | 0.303 | 0.344 | 0.311 |
| | | 0.45 | 0.297 | 0.312 | 0.306 | 0.329 | 0.322 | 0.329 | 0.321 | 0.302 | 0.313 | 0.358 | 0.323 |
50 | 0.1 | 0.05 | 0.049 | 0.054 | 0.054 | 0.053 | 0.052 | 0.052 | 0.052 | 0.060 | 0.079 | 0.059 | 0.056 |
| | | 0.15 | 0.081 | 0.086 | 0.085 | 0.086 | 0.085 | 0.085 | 0.085 | 0.084 | 0.110 | 0.093 | 0.090 |
| | | 0.25 | 0.100 | 0.104 | 0.103 | 0.105 | 0.103 | 0.104 | 0.103 | 0.102 | 0.103 | 0.113 | 0.109 |
| | | 0.35 | 0.111 | 0.115 | 0.112 | 0.115 | 0.113 | 0.114 | 0.113 | 0.113 | 0.113 | 0.124 | 0.119 |
| | | 0.45 | 0.117 | 0.120 | 0.117 | 0.121 | 0.118 | 0.119 | 0.118 | 0.121 | 0.121 | 0.129 | 0.124 |
| | 0.3 | 0.05 | 0.063 | 0.077 | 0.077 | 0.074 | 0.073 | 0.073 | 0.073 | 0.075 | 0.087 | 0.083 | 0.078 |
| | | 0.15 | 0.106 | 0.122 | 0.120 | 0.122 | 0.120 | 0.121 | 0.120 | 0.110 | 0.207 | 0.130 | 0.123 |
| | | 0.25 | 0.135 | 0.148 | 0.145 | 0.149 | 0.147 | 0.148 | 0.146 | 0.139 | 0.145 | 0.156 | 0.148 |
| | | 0.35 | 0.154 | 0.163 | 0.160 | 0.165 | 0.162 | 0.163 | 0.162 | 0.156 | 0.159 | 0.172 | 0.163 |
| | | 0.45 | 0.162 | 0.170 | 0.167 | 0.172 | 0.169 | 0.170 | 0.169 | 0.163 | 0.166 | 0.179 | 0.170 |
| | 0.5 | 0.05 | 0.079 | 0.098 | 0.096 | 0.093 | 0.091 | 0.092 | 0.091 | 0.089 | 0.105 | 0.105 | 0.098 |
| | | 0.15 | 0.126 | 0.150 | 0.147 | 0.150 | 0.148 | 0.149 | 0.148 | 0.130 | 0.177 | 0.159 | 0.149 |
| | | 0.25 | 0.161 | 0.180 | 0.177 | 0.183 | 0.180 | 0.182 | 0.180 | 0.165 | 0.188 | 0.191 | 0.179 |
| | | 0.35 | 0.183 | 0.198 | 0.195 | 0.202 | 0.199 | 0.201 | 0.199 | 0.186 | 0.194 | 0.210 | 0.197 |
| | | 0.45 | 0.194 | 0.206 | 0.203 | 0.211 | 0.208 | 0.209 | 0.207 | 0.196 | 0.202 | 0.218 | 0.205 |
5 Applications to real data analysis
We apply all 15 CI methods considered in Section 3 to three data sets: (i) CTC images data [8], (ii) chronic obstructive pulmonary disease (COPD) data [2], and (iii) screening mammogram data [9]. The number of clusters and cluster sizes varied among these studies.
5.1 Example 1: CTC images data
This study concerned computed tomography colonography (CTC) [8], an imaging test that can detect polyps before they develop into cancer. Investigators developed a computer algorithm, called computer-aided detection (CAD), to help radiologists detect polyps on CTC. The main purpose of the study was to assess the radiologists’ diagnostic accuracy with CAD-enhanced CTC for detecting polyps. For the trial, data on 270 patients from six institutions were compiled retrospectively. These patients had undergone CTC for various medical reasons. In order to assess the radiologists’ performance, 25 patients were randomly selected from the 119 test cases. Some of these patients had multiple polyps, ranging from 1 to 3 per patient with sample mean 1.56 and standard deviation 0.58, so the detection outcomes within a patient may be correlated. To assess the radiologists’ diagnostic accuracy in this study, we applied the confidence interval procedures discussed in Section 3 to estimate the sensitivity of CAD-enhanced CTC for detecting polyps.
The values (standard error) of
Table 4. Ninety-five percent confidence intervals of the single proportion
Method | Lower limit | Upper limit | Length | Length ratio (method/ML) |
---|---|---|---|---|
PL | 0.6982 | 0.9419 | 0.2437 | 0.982 |
WI1 | 0.6735 | 0.9362 | 0.2626 | 1.058 |
WI2 | 0.6821 | 0.9340 | 0.2519 | 1.015 |
WA1 | 0.7133 | 0.9790 | 0.2656 | 1.070 |
WA2 | 0.7192 | 0.9736 | 0.2544 | 1.025 |
R1 | 0.7093 | 0.9657 | 0.2564 | 1.033 |
R2 | 0.6853 | 0.9426 | 0.2574 | 1.037 |
G1 | 0.7119 | 0.9631 | 0.2512 | 1.012 |
G2 | 0.6879 | 0.9400 | 0.2522 | 1.016 |
ML | 0.7223 | 0.9705 | 0.2482 | 1.000 |
EQL | 0.7269 | 0.9676 | 0.2407 | 0.970 |
DEQL | 0.6933 | 0.9978 | 0.3046 | 1.227 |
QEE | 0.7090 | 0.9776 | 0.2686 | 1.082 |
MCP | 0.6571 | 0.9551 | 0.2980 | 1.201 |
MWS | 0.6697 | 0.9376 | 0.2679 | 1.079 |
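The last two columns of the table above follow directly from the limits: the length is the upper limit minus the lower limit, and the ratio divides each length by the ML length. A minimal sketch of that arithmetic for a few rows is given below; the dictionary and function names are illustrative, while the limits are copied from the table.

```python
# Lower and upper 95% limits for selected methods, copied from the CTC table above.
ctc_limits = {
    "PL": (0.6982, 0.9419),
    "WI1": (0.6735, 0.9362),
    "ML": (0.7223, 0.9705),
    "DEQL": (0.6933, 0.9978),
}


def length_ratios(limits, reference="ML"):
    """Return each method's interval length divided by the reference method's length."""
    lengths = {name: upper - lower for name, (lower, upper) in limits.items()}
    ref_length = lengths[reference]
    return {name: length / ref_length for name, length in lengths.items()}


ratios = length_ratios(ctc_limits)
for name, value in ratios.items():
    print(f"{name:5s} {value:.3f}")
```

A ratio below 1 indicates an interval shorter than the ML interval for the same data. Small discrepancies in the last digit relative to the tabulated lengths can arise because the tabulated limits are themselves rounded.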
5.2 Example 2: Chronic obstructive pulmonary disease data
This was a case-control study of familial aggregation of chronic obstructive pulmonary disease (COPD). The main objective of the study was to estimate the probability that a given sibling of a COPD patient has impaired pulmonary function (IPF). The data given in Table 5 of Liang et al. [2] refer to the frequency distribution of the number of IPF cases per family. There are 203 siblings from 100 families, with family sizes ranging from 1 to 6 (sample mean 2.03, standard deviation 1.32). The binary response of interest is whether a given sibling of a COPD patient has IPF. Siblings within each family are correlated and the number of IPF cases per family may be over-dispersed relative to a binomial model. To estimate the IPF rate among siblings in this study, we applied the confidence interval procedures discussed in Section 3.
We compute the values (standard error) of
Table 5. Ninety-five percent confidence intervals of the single proportion
Method | Lower limit | Upper limit | Length | Length ratio (method/ML) |
---|---|---|---|---|
PL | 0.2150 | 0.3563 | 0.1413 | 0.995 |
WI1 | 0.2284 | 0.3729 | 0.1445 | 1.018 |
WI2 | 0.2153 | 0.3603 | 0.1451 | 1.021 |
WA1 | 0.2226 | 0.3686 | 0.1460 | 1.028 |
WA2 | 0.2089 | 0.3555 | 0.1465 | 1.032 |
R1 | 0.2174 | 0.3758 | 0.1584 | 1.115 |
R2 | 0.2204 | 0.3786 | 0.1582 | 1.114 |
G1 | 0.2178 | 0.3754 | 0.1576 | 1.110 |
G2 | 0.2208 | 0.3782 | 0.1574 | 1.109 |
ML | 0.2112 | 0.3532 | 0.1420 | 1.000 |
EQL | 0.1987 | 0.3514 | 0.1527 | 1.075 |
DEQL | 0.1979 | 0.3678 | 0.1699 | 1.196 |
QEE | 0.2139 | 0.3570 | 0.1431 | 1.007 |
MCP | 0.1938 | 0.3484 | 0.1546 | 1.089 |
MWS | 0.1988 | 0.3458 | 0.1470 | 1.035 |
A third application of the methods to the analysis of a screening mammogram dataset is given in the Supplementary Materials.
6 Discussion and concluding remarks
This paper considers 13 asymptotic CIs for binary outcome data sampled from clusters, assuming either a beta-binomial distribution or semi-parametric models in which only the first two moments of the responses need be specified. The results of our simulation studies in Section 4 suggest that the two versions of the Wilson score, the modified Wilson score, and the profile likelihood methods are preferable, as their observed CPs are very close to the nominal coverage level. Among these, the Wilson score and modified Wilson score methods keep coverage more tightly controlled around the nominal level, whereas the profile likelihood method generally yields shorter ELs in almost all data situations.
Our results in this paper depend on the assumption of the beta-binomial distribution and semi-parametric models specified by only the first two moments of the response variable (which we considered similar to the mean and variance of the beta-binomial distribution). Moreover, we considered a common correlation structure assumption among the observations within the same cluster which is often applicable to family studies; however, one can extend this research by using a more general correlation structure arising in many genetic epidemiology studies.
In our simulation studies it appeared that in some situations the asymptotic confidence intervals based on ML, EQL, DEQL, and QEE showed serious lack of coverage, particularly EQL and DEQL. These CI procedures rely largely on the assumption of asymptotic normality for the estimate of the parameter, which may not hold in some situations, especially for small sample sizes or small parameter values. Moreover, the lack of coverage could arise from the estimation of the asymptotic standard error. Further research that addresses these issues by incorporating alternative distributional approximations, such as the parametric, nonparametric, and double bootstrap, to yield greater coverage accuracy would clearly be worthwhile.
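As one illustration of the nonparametric bootstrap direction mentioned above, a cluster-level percentile interval can be sketched as follows. This is not a method evaluated in this paper: the data, the function names, and the number of resamples are hypothetical, and whole clusters are resampled with replacement so that the within-cluster correlation is preserved in each resample.

```python
import random


def pooled_proportion(clusters):
    """Pooled proportion: total successes over total observations."""
    return sum(y for y, _ in clusters) / sum(n for _, n in clusters)


def cluster_bootstrap_interval(clusters, level=0.95, reps=2000, seed=0):
    """Percentile bootstrap interval obtained by resampling whole clusters."""
    rng = random.Random(seed)
    m = len(clusters)
    estimates = sorted(
        pooled_proportion(rng.choices(clusters, k=m)) for _ in range(reps)
    )
    alpha = 1.0 - level
    lower_index = round(alpha / 2.0 * reps)
    upper_index = round((1.0 - alpha / 2.0) * reps) - 1
    return estimates[lower_index], estimates[upper_index]


# Hypothetical data: (number of successes, cluster size) for 12 clusters of size 1 to 3.
example = [(1, 1), (2, 2), (1, 2), (0, 1), (3, 3), (1, 1),
           (2, 3), (1, 2), (2, 2), (0, 1), (1, 1), (2, 2)]

p_hat = pooled_proportion(example)
lower, upper = cluster_bootstrap_interval(example)
print(f"estimate = {p_hat:.3f}, 95% bootstrap interval = ({lower:.3f}, {upper:.3f})")
```

With so few clusters the percentile interval is itself only a rough approximation, which is part of the motivation for the double bootstrap refinement mentioned above.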
Acknowledgments
The authors are grateful to the Associate Editor and two referees for their constructive comments and suggestions that have led to much improvement in this manuscript. This research was supported in part by a CSU-AAUP University research grant.
References
1. Paul SR. Analysis of proportions of affected foetuses in teratological experiments. Biometrics 1982;38:361–70. doi:10.2307/2530450
2. Liang KY, Qaqish B, Zeger SL. Multivariate regression analysis for categorical data. J Roy Stat Soc B 1992;54:3–40. doi:10.1111/j.2517-6161.1992.tb01862.x
3. Saha KK. Profile likelihood-based confidence intervals of the intraclass correlation for binary outcome data sampled from clusters. Stat Med 2012;31:3982–4002. doi:10.1002/sim.5489
4. Hogg DR, Tanis DV. Introduction to probability statistics. London: Chapman and Hall, 2009.
5. Rao JNK, Scott AJ. A simple method for the analysis of clustered binary data. Biometrics 1992;48:577–85. doi:10.2307/2532311
6. Kleinman J. Proportions with extraneous variance: single and independent sample. J Am Stat Assoc 1973;68:46–54. doi:10.1080/01621459.1973.10481332
7. Paul SR, Islam AS. Joint estimation of the mean and dispersion parameters in the analysis of proportions: a comparison of efficiency and bias. Can J Stat 1998;26:83–94. doi:10.2307/3315675
8. Zhou X-H, Obuchowski NA, McClish DK. Statistical methods in diagnostic medicine. New York: Wiley, 2002. doi:10.1002/9780470317082
9. Kim J, Lee JH. Simultaneous confidence intervals for a success probability and intraclass correlation, with an application to screening mammography. Biometric J 2013;55:944–54. doi:10.1002/bimj.201200252
10. Agresti A, Coull BA. Approximate is better than ‘exact’ for interval estimation of binomial proportions. Am Statistician 1998;52:119–26. doi:10.1080/00031305.1998.10480550
11. Newcombe RG. Two sided confidence intervals for the single proportion: comparison of seven methods. Stat Med 1998;17:857–72. doi:10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E
12. Korn EL, Graubard BI. Analysis of health surveys. New York: John Wiley & Sons, 1999. doi:10.1002/9781118032619
13. Lui KJ. Statistical estimation of epidemiological risk. New York: Wiley, 2004. doi:10.1002/0470094087
14. Rutter CM. Bootstrap estimation of diagnostic accuracy with patient-clustered data. Acad Radiol 2000;2:413–19. doi:10.1016/S1076-6332(00)80381-5
15. Paul SR, Zaihra T. Interval estimation of risk difference for data sampled from clusters. Stat Med 2008;27:4207–20. doi:10.1002/sim.3289
16. Williams DA. Analysis of binary responses from toxicological experiments involving reproduction and teratogenicity. Biometrics 1975;31:949–52. doi:10.2307/2529820
17. Paul SR, Islam AS. Analysis of proportions in the presence of over-/under-dispersion. Biometrics 1995;51:1400–11. doi:10.2307/2533270
18. Liang KY, McCullagh P. Case studies in binary dispersion. Biometrics 1993;49:623–30. doi:10.2307/2532575
19. Altaye M, Donner A, Klar N. Inference procedures for assessing interobserver agreement among multiple raters. Biometrics 2001;57:584–8. doi:10.1111/j.0006-341X.2001.00584.x
20. Pradhan V, Saha KK, Banerjee T, Evans J. Weighted profile likelihood-based confidence interval for the difference between two proportions with paired binomial data. Stat Med 2014;33:2984–97. doi:10.1002/sim.6130
21. Zeger SL, Liang KY. Longitudinal data analysis for discrete and continuous outcomes. Biometrics 1986;42:121–30. doi:10.2307/2531248
22. Paul SR, Saha KK. The generalized linear model and extensions: a review and some biological and environmental applications. Environmetrics 2007;18:421–43. doi:10.1002/env.849
23. Inagaki N. Asymptotic relations between the likelihood estimating function and the maximum likelihood estimator. Ann Inst Stat Math 1973;25:1–26. doi:10.1007/BF02479355
24. Bowman D, George EO. A saturated model for analyzing exchangeable binary data: Applications to clinical and developmental toxicity studies. J Am Stat Assoc 1995;90:871–9. doi:10.1080/01621459.1995.10476586
25. Kupper LL, Portier C, Hogan MD, Yamamoto E. The impact of litter effects on dose-response modeling in teratology. Biometrics 1986;42:85–98. doi:10.2307/2531245
© 2016 Walter de Gruyter GmbH, Berlin/Boston