Stability selection for lasso, ridge and elastic net implemented with AFT models

Md Hasinur Rahaman Khan, Anamika Bhadra and Tamanna Howlader
Published/Copyright: October 7, 2019

Abstract

Instability in model selection is a major concern for data sets containing a large number of covariates. We focus on stability selection, a technique that improves variable selection performance for a range of selection methods by aggregating the results of applying a selection procedure to subsamples of the data, here in the setting where the observations are subject to right censoring. Accelerated failure time (AFT) models have proved useful in many contexts, including heavy censoring (for example, in cancer survival) and high dimensionality (for example, in microarray data). We implement the stability selection approach using three variable selection techniques, namely lasso, ridge regression and elastic net, applied to censored data using AFT models. We compare the performance of these regularized techniques with and without stability selection in simulation studies and in two real data examples: a breast cancer data set and a diffuse large B-cell lymphoma data set. The results suggest that stability selection always yields a stable set of selected variables and that, as the dimension of the data increases, the performance of methods with stability selection improves relative to methods without it, irrespective of the collinearity between the covariates.
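
To make the subsampling idea concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes that the AFT model is fitted by least squares on log survival times with Kaplan-Meier weights, which is one common way of handling right censoring, and that the base selector is a lasso solved by coordinate descent. The function names (`km_weights`, `weighted_lasso`, `stability_selection`, `simulate`), the penalty `lam`, the number of subsamples and the selection threshold of 0.6 are all illustrative assumptions, and the data are simulated.

```python
import math
import random


def km_weights(times, events):
    """Kaplan-Meier jump weights; censored observations get weight 0."""
    n = len(times)
    # Sort by time; at tied times, events come before censored observations.
    order = sorted(range(n), key=lambda i: (times[i], -events[i]))
    weights = [0.0] * n
    surv = 1.0  # Kaplan-Meier survival estimate just before the current time
    for rank, i in enumerate(order, start=1):
        if events[i]:
            weights[i] = surv / (n - rank + 1)
            surv *= (n - rank) / (n - rank + 1)
    return weights


def weighted_lasso(X, y, w, lam, n_iter=50):
    """Coordinate descent for 0.5*sum_i w_i*(y_i - b0 - x_i'b)^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    total = sum(w)
    if total == 0.0:  # every observation censored: nothing to fit
        return [0.0] * p
    w = [wi / total for wi in w]
    # Weighted centering absorbs the unpenalized intercept b0.
    y_bar = sum(wi * yi for wi, yi in zip(w, y))
    resid = [yi - y_bar for yi in y]
    cols = []
    for j in range(p):
        x_bar = sum(w[i] * X[i][j] for i in range(n))
        cols.append([X[i][j] - x_bar for i in range(n)])
    denom = [sum(wi * c * c for wi, c in zip(w, col)) for col in cols]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            if denom[j] == 0.0:
                continue
            rho = sum(wi * c * (r + beta[j] * c)
                      for wi, c, r in zip(w, col, resid))
            new = math.copysign(max(abs(rho) - lam, 0.0), rho) / denom[j]
            step = new - beta[j]
            if step != 0.0:
                resid = [r - step * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta


def stability_selection(X, log_time, events, lam,
                        n_sub=50, threshold=0.6, seed=0):
    """Selection frequency of each covariate over random half-samples."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_sub):
        idx = rng.sample(range(n), n // 2)  # subsample without replacement
        Xs = [X[i] for i in idx]
        ys = [log_time[i] for i in idx]
        ds = [events[i] for i in idx]
        beta = weighted_lasso(Xs, ys, km_weights(ys, ds), lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    freq = [c / n_sub for c in counts]
    selected = [j for j, f in enumerate(freq) if f >= threshold]
    return freq, selected


def simulate(n=80, p=6, seed=1):
    """Toy AFT data: only covariates 0 and 1 affect the log survival time."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    log_t = [2 * x[0] - 2 * x[1] + 0.5 * rng.gauss(0, 1) for x in X]
    log_c = [rng.gauss(3, 1) for _ in range(n)]  # independent censoring times
    y = [min(t, c) for t, c in zip(log_t, log_c)]
    events = [int(t <= c) for t, c in zip(log_t, log_c)]
    return X, y, events


if __name__ == "__main__":
    X, y, d = simulate()
    freq, selected = stability_selection(X, y, d, lam=0.5)
    print("selection frequencies:", freq)
    print("selected covariates:", selected)
```

In this sketch a covariate is kept only if the lasso selects it in at least 60% of the half-samples, so the final set depends on how often a variable is chosen rather than on a single fit. Ridge regression does not set coefficients exactly to zero, so a ridge or elastic net base learner would need its own selection rule (for example, a cut-off on coefficient size), which is not shown here.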

Acknowledgement

We thank the Institute of Statistical Research and Training (ISRT), University of Dhaka, Bangladesh, for giving us the platform to conduct this research.

Conflict of interest statement: The authors have declared no conflict of interest.

Published Online: 2019-10-07

© 2019 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 16 November 2025 from https://www.degruyterbrill.com/document/doi/10.1515/sagmb-2017-0001/pdf