
Potential application of elastic nets for shared polygenicity detection with adapted threshold selection

  • Majnu John and Todd Lencz
Published/Copyright: November 3, 2022

Abstract

Current research suggests that hundreds to thousands of single nucleotide polymorphisms (SNPs) with small to modest effect sizes contribute to the genetic basis of many disorders, a phenomenon known as polygenicity. Additionally, many such disorders demonstrate polygenic overlap, in which risk alleles are shared at associated genetic loci. A simple strategy to detect polygenic overlap between two phenotypes is based on rank-ordering the univariate p-values from two genome-wide association studies (GWASs). Although high-dimensional variable selection strategies such as the Lasso and elastic nets have been used in other GWAS analysis settings, they have yet to be used for detecting shared polygenicity. In this paper, we illustrate how elastic nets, with polygenic scores as the dependent variable and with appropriate adaptation in selecting the penalty parameter, may be used to detect a subset of SNPs involved in shared polygenicity. We provide theory to better understand our approaches and illustrate their utility using synthetic datasets. Results from extensive simulations compare the elastic net approaches with the rank-ordering approach in various scenarios. These simulations show one of the elastic net approaches to be superior when the correlations among the SNPs are high. Finally, we apply the methods to two real datasets to further illustrate the capabilities, limitations and differences among the methods.
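To make the elastic net step concrete, the following is a minimal sketch of a generic elastic net fitted by cyclic coordinate descent, in plain Python. It is not the authors' procedure. The abstract does not say how the polygenic score is constructed or how the penalty parameter is adapted, so both are left out. The function names (`soft_threshold`, `elastic_net`), the penalty parameterization with `lam` and `alpha`, and the toy data are assumptions made for illustration. Here `y` stands in for a centered polygenic score and the columns of `X` for centered SNP genotypes.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam, alpha=0.5, n_iter=200):
    """Cyclic coordinate descent for the (assumed) objective
        (1/2n) * ||y - X b||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||^2).
    X is a list of rows. No intercept is fitted, so X and y should be centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    # Mean squared value of each column (equals 1 for standardized columns).
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # constant-zero column carries no information
            # Covariance of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1.0 - alpha))
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical toy data: two orthogonal "SNP" columns; y depends only on the first.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]

beta = elastic_net(X, y, lam=0.5, alpha=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]  # indices of retained columns
print(beta, selected)
```

In this toy example only the first column is retained, and a sufficiently large `lam` shrinks every coefficient to zero. The choice of penalty parameter therefore determines which SNPs are selected, which is the step the abstract says must be adapted.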


Corresponding author: Majnu John, Institute of Behavioral Science, Feinstein Institutes of Medical Research, 350 Community Drive, Manhasset, NY 11030, USA; Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell Health System, Glen Oaks, NY, USA; and Departments of Psychiatry and of Mathematics, Hofstra University, Hempstead, NY, USA, E-mail:

Funding source: National Institute of Mental Health

Award Identifier / Grant number: R01MH120313

Award Identifier / Grant number: R01MH120594

Award Identifier / Grant number: P50 MH080173

Award Identifier / Grant number: R01 MH095458

Award Identifier / Grant number: R01 MH117646

Acknowledgments

We appreciate very much the comments by two anonymous reviewers and the Associate Editor Dr. Vivian Viallon, which led to substantial improvement of the manuscript. We also sincerely thank the Editor-in-Chief Dr. Antoine Chambaz for his patience, and for providing us extended time to complete the revisions. The authors thank John Cholewa and Eugene Kats for help with computational resources and computer systems, and Dr. Max Lam for assistance with obtaining data used in an earlier version of the paper.

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: M.J.’s work was supported in part by the following grants from the National Institute of Mental Health (NIMH): R01MH120313 (PI: Deligiannides, K), R01MH120594 (PIs: Kane, J and Robinson, D), and P50 MH080173 (PI: Malhotra, AK). Both authors’ work was supported by two project grants to T.L. (R01 MH095458 and R01 MH117646).

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.


Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2020-0108).


Received: 2020-01-28
Revised: 2022-09-28
Accepted: 2022-10-05
Published Online: 2022-11-03

© 2022 Walter de Gruyter GmbH, Berlin/Boston
