Abstract
This paper considers a partially linear regression model that relates a right-censored response variable to a set of predictors and an additional covariate measured with error. Estimating the model correctly requires handling both the censoring and the measurement error. To this end, we propose three modified semiparametric estimators, obtained from local polynomial regression, kernel smoothing, and B-spline smoothing, each combined with a kernel deconvolution approach and a synthetic data transformation. The kernel deconvolution technique addresses the measurement error in the model, while the synthetic data transformation, a very common method in the literature, incorporates the effect of censoring into the estimation procedure. The performance of the proposed estimators is compared in a detailed Monte Carlo simulation study. In addition, carotid endarterectomy data are used as a real-world example, and the results are presented. The results show that the deconvoluted local polynomial method gives better estimates than the other two methods.
- Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
- Research funding: None declared.
- Conflict of interest statement: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Appendix A1: Proof of Lemma 2.1
Proof
From Assumption 2.1 (ii), we obtain
This result shows that Theorem 4.1 has been proven.
Appendix A2: Derivations of Equations (3.10a–d)
Let us consider the deconvoluted kernel smoother matrix (i.e.,
We also notice that the matrix
Supposing that
Simplifying,
If we take the derivative with respect to β and set it equal to zero:
Setting (A2.3) to zero and replacing β with
The solution to the normal equation (A2.4) is
as defined in (3.10a). If we replace the β in (A2.1) with the
and the vector of fitted values based on the deconvoluted kernel estimator is
where
as stated in Section 3.1.
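The steps of this derivation (smooth out the nonparametric part, solve the normal equation for β, then smooth the partial residuals) can be sketched numerically. The block below is only an illustration under stated assumptions: an ordinary Gaussian Nadaraya-Watson smoother stands in for the deconvoluted kernel smoother, the response `y` plays the role of the synthetic response, there is a single parametric predictor, and all names (`smoother_matrix`, `partial_residual_fit`, `S`, `h`) are hypothetical rather than taken from the paper.

```python
import math
import random


def gaussian_kernel(u):
    """Standard Gaussian kernel (assumption: stands in for the deconvoluting kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def smoother_matrix(w, h):
    """Row-normalised kernel weights: S[i][j] is the weight of observation j at w[i]."""
    S = []
    for wi in w:
        row = [gaussian_kernel((wi - wj) / h) for wj in w]
        total = sum(row)
        S.append([r / total for r in row])
    return S


def smooth(S, v):
    """Matrix-vector product S v."""
    return [sum(sij * vj for sij, vj in zip(row, v)) for row in S]


def partial_residual_fit(x, w, y, h):
    """Estimate beta and f in y = x * beta + f(w) + error for a scalar predictor x."""
    S = smoother_matrix(w, h)
    x_t = [xi - si for xi, si in zip(x, smooth(S, x))]  # (I - S) x
    y_t = [yi - si for yi, si in zip(y, smooth(S, y))]  # (I - S) y
    # Solution of the normal equation for beta (scalar case).
    beta = sum(a * b for a, b in zip(x_t, y_t)) / sum(a * a for a in x_t)
    # Smooth the partial residuals to estimate the nonparametric part.
    f_hat = smooth(S, [yi - beta * xi for xi, yi in zip(x, y)])
    fitted = [beta * xi + fi for xi, fi in zip(x, f_hat)]
    return beta, f_hat, fitted


# Hypothetical toy data: true beta = 2 and f(w) = sin(2 * pi * w).
random.seed(1)
n = 200
w = [random.random() for _ in range(n)]
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 * xi + math.sin(2.0 * math.pi * wi) + random.gauss(0.0, 0.2)
     for xi, wi in zip(x, w)]
beta_hat, f_hat, fitted = partial_residual_fit(x, w, y, h=0.1)
print(round(beta_hat, 3))
```

In this sketch the fitted values are a linear function of `y`, which is why the derivation can collect them into a single hat matrix.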
Appendix A3: Derivations of Equations (3.18a–d)
Let
After algebraic operations, this expression can be written as
Taking the derivative with respect to b and setting it equal to zero:
Hence, we obtain the weighted least squares normal equations
The solution to (A3.2) is
Note that (A3.3) gives the coefficients of the Taylor expansion (3.11), but we need to select only the first element of the vector
as claimed. Then, as in (A2.1), if
Derivative of
From (A3.5)
Using (A3.5) and (A3.6), the derivation of the hat matrix
where
as stated in Section 3.2.
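The weighted least squares step of this appendix can be illustrated for the local linear case, where the normal equations form a 2 × 2 system and the first element of the solution is the estimate of the function at the evaluation point. This is a minimal sketch, assuming a Gaussian kernel weight in place of the deconvoluting kernel; `local_linear_fit`, `t0`, and `h` are hypothetical names.

```python
import math


def local_linear_fit(t, y, t0, h):
    """Solve the 2 x 2 weighted least squares normal equations centred at t0.

    Returns (b0, b1); b0 is the first coefficient of the local expansion
    and therefore the estimate of the function value at t0.
    """
    s0 = s1 = s2 = r0 = r1 = 0.0
    for ti, yi in zip(t, y):
        d = ti - t0
        k = math.exp(-0.5 * (d / h) ** 2)  # Gaussian weight (assumption)
        s0 += k
        s1 += k * d
        s2 += k * d * d
        r0 += k * yi
        r1 += k * d * yi
    # Normal equations: [[s0, s1], [s1, s2]] [b0, b1]' = [r0, r1]'
    det = s0 * s2 - s1 * s1
    b0 = (s2 * r0 - s1 * r1) / det
    b1 = (s0 * r1 - s1 * r0) / det
    return b0, b1


# Hypothetical check: a local linear fit reproduces an exactly linear trend.
t = [i / 20 for i in range(21)]
y = [1.0 + 3.0 * ti for ti in t]
b0, b1 = local_linear_fit(t, y, t0=0.5, h=0.2)
print(b0, b1)
```

Repeating the fit at every design point and keeping only `b0` each time gives one row of a smoother matrix per point, which is the object the remainder of the derivation works with.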
Appendix A4: Derivations of Equations (3.27a–d)
The minimization criterion for the B-spline estimator is given in Equation (3.23). Accordingly, proofs of estimates
Expansion of (A4.1) is given by:
Taking the derivative of the expression above with respect to c gives the normal equation:
By using (A4.2),
Similar to (A3.3), the deconvoluted B-spline estimator of the unknown smooth function
as claimed. The partial residuals for DB are computed as
Then,
Finally
Also, using (A4.3) and (A4.6), the derivation of the hat matrix
where
as explained in Section 3.3.
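The normal equation for the spline coefficients c can be illustrated with a small least squares B-spline fit. The sketch below assumes an unpenalized criterion and an ordinary (not deconvoluted) B-spline basis built with the Cox-de Boor recursion; whether criterion (3.23) carries additional terms is not assumed here, and the names `bspline_basis`, `solve`, and `bspline_fit` are hypothetical.

```python
def bspline_basis(t, knots, degree):
    """Evaluate all B-spline basis functions at t (Cox-de Boor recursion)."""
    m = len(knots) - 1
    # The last non-empty knot interval is treated as closed on the right.
    last = max(i for i in range(m) if knots[i] < knots[i + 1])
    b = []
    for i in range(m):
        inside = knots[i] <= t < knots[i + 1]
        at_right_end = (i == last and t == knots[i + 1])
        b.append(1.0 if inside or at_right_end else 0.0)
    for d in range(1, degree + 1):
        nb = []
        for i in range(m - d):
            left = right = 0.0
            if knots[i + d] > knots[i]:
                left = (t - knots[i]) / (knots[i + d] - knots[i]) * b[i]
            if knots[i + d + 1] > knots[i + 1]:
                right = ((knots[i + d + 1] - t)
                         / (knots[i + d + 1] - knots[i + 1]) * b[i + 1])
            nb.append(left + right)
        b = nb
    return b


def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def bspline_fit(t, y, knots, degree):
    """Solve the normal equation (B'B) c = B'y and return c and the fitted values."""
    B = [bspline_basis(ti, knots, degree) for ti in t]
    q = len(B[0])
    BtB = [[sum(row[a] * row[b] for row in B) for b in range(q)] for a in range(q)]
    Bty = [sum(row[a] * yi for row, yi in zip(B, y)) for a in range(q)]
    c = solve(BtB, Bty)
    fitted = [sum(ca * ba for ca, ba in zip(c, row)) for row in B]
    return c, fitted


# Hypothetical check: a cubic spline basis reproduces a cubic polynomial.
knots = [0.0] * 4 + [0.5] + [1.0] * 4
t = [i / 30 for i in range(31)]
y = [ti ** 3 for ti in t]
coef, fitted = bspline_fit(t, y, knots, degree=3)
print(max(abs(f - yi) for f, yi in zip(fitted, y)))
```

Because the fitted values equal B(B'B)⁻¹B'y in this unpenalized setting, the fit is again linear in the response, mirroring the hat matrix form used in the appendix.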
Appendix A5: Censoring mechanism in simulation design
Algorithm 1: Generation of censoring variable C_i.
Input: Completely observed Z_i
Output: Right-censored dependent variable Y_i
1: For given censoring level (C.L.), generate
2: for
3:   If (δ_i = 0)
4:     while (Z_i ≤ C_i)
5:       generate
6:   Else
7:     C_i = Z_i
8: end (for loop in Step 2)
9: for
10:   If (Z_i ≤ C_i)
11:     Y_i = Z_i
12:   Else
13:     Y_i = C_i
14: end (for loop in Step 9)
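The control flow of Algorithm 1 can be sketched in a few lines. The distributions used in Steps 1 and 5 are not shown here, so the sketch makes two loudly labeled assumptions: the censoring indicator δ_i is drawn as Bernoulli with P(δ_i = 0) equal to the censoring level, and candidate censoring times C_i are drawn from a normal distribution with the sample mean and standard deviation of Z. The function name `generate_censored` and its arguments are hypothetical.

```python
import random
import statistics


def generate_censored(z, censoring_level, rng):
    """Return (y, delta) following the control flow of Algorithm 1.

    Assumptions (not taken from the source): delta_i is Bernoulli with
    P(delta_i = 0) = censoring_level, and C_i ~ Normal(mean(z), sd(z)).
    """
    mu = statistics.mean(z)
    sd = statistics.stdev(z)
    y, delta = [], []
    for zi in z:
        # Step 1: censoring indicator (delta_i = 0 means censored).
        d = 0 if rng.random() < censoring_level else 1
        if d == 0:
            # Steps 3-5: redraw C_i until it falls below Z_i.
            ci = rng.gauss(mu, sd)
            while zi <= ci:
                ci = rng.gauss(mu, sd)
        else:
            # Steps 6-7: uncensored observations keep C_i = Z_i.
            ci = zi
        # Steps 9-13: observed response is the smaller of Z_i and C_i.
        y.append(zi if zi <= ci else ci)
        delta.append(d)
    return y, delta


# Hypothetical usage with a 30% censoring level.
rng = random.Random(0)
z = [rng.gauss(5.0, 1.0) for _ in range(500)]
y, delta = generate_censored(z, 0.3, rng)
print(1.0 - sum(delta) / len(delta))  # realised censoring proportion
```

Under these assumptions every censored observation satisfies Y_i < Z_i and every uncensored one satisfies Y_i = Z_i, so the realised censoring proportion stays close to the requested level.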
© 2023 Walter de Gruyter GmbH, Berlin/Boston