Abstract
Many human disease conditions need to be measured by ordinal phenotypes, so the analysis of ordinal phenotypes is valuable in genome-wide association studies (GWAS). However, existing association methods designed for dichotomous or quantitative phenotypes are not appropriate for ordinal phenotypes. We therefore propose a fast and efficient method, based on the aggregated Cauchy association test, for testing the association between genetic variants and an ordinal phenotype. To enrich the association signals of rare variants, we first aggregate them using the burden method. We then test the significance of the aggregated rare variants and of the common variants separately. Finally, the combination of the transformed variant-level P values is taken as the test statistic, which approximately follows a Cauchy distribution under the null hypothesis. Extensive simulation studies and an analysis of the GAW19 data show that the proposed method is powerful and computationally fast as a gene-based method. In particular, it performs better when the proportion of causal variants in a gene is extremely low.
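The three steps described in the abstract (burden aggregation, variant-level testing, Cauchy combination) can be illustrated with a minimal Python sketch. It is not the authors' OR-ACAT implementation. The function names, the rare-variant threshold `rare_maf=0.01`, and the equal default weights are assumptions made for illustration only. The variant-level P values are taken as given inputs, since the abstract does not specify which ordinal regression test produces them. The combination step uses the standard Cauchy transform, in which each P value p is mapped to tan((0.5 - p)π) and the weighted sum is converted back to a P value through the Cauchy tail.

```python
import math


def burden_score(genotypes, mafs, rare_maf=0.01):
    """Collapse rare variants into one burden score per individual.

    genotypes: one list per individual of minor-allele counts (0, 1 or 2).
    mafs: minor allele frequency of each variant (one per column).
    rare_maf: hypothetical threshold below which a variant counts as rare.
    Returns (burden, common_idx): the per-individual sum of rare-allele
    counts and the column indices of the remaining common variants.
    """
    rare_idx = [j for j, f in enumerate(mafs) if f < rare_maf]
    common_idx = [j for j, f in enumerate(mafs) if f >= rare_maf]
    burden = [sum(row[j] for j in rare_idx) for row in genotypes]
    return burden, common_idx


def cauchy_combine(pvalues, weights=None):
    """Combine P values with the Cauchy transform.

    Each p is mapped to tan((0.5 - p) * pi). Under the null hypothesis the
    normalized weighted sum is approximately standard Cauchy, so the
    combined P value is 0.5 - atan(T) / pi.
    """
    if not pvalues:
        raise ValueError("at least one P value is required")
    if any(not 0.0 < p < 1.0 for p in pvalues):
        raise ValueError("P values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)  # equal weights: an assumption
    total = sum(weights)
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, pvalues)) / total
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    # Toy data: 2 individuals, 3 variants (columns 0 and 2 are rare).
    geno = [[0, 1, 2], [1, 0, 0]]
    burden, common = burden_score(geno, mafs=[0.005, 0.2, 0.003])
    print(burden, common)
    # P values for the burden score and one common variant would come
    # from an ordinal regression test; here they are made-up numbers.
    print(cauchy_combine([0.04, 0.30]))
```

Because the combined P value has a closed form, no permutation or resampling is needed, which is consistent with the computational speed claimed above.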
Funding source: National Natural Science Foundation of China
Award Identifier / Grant number: 61873087
Award Identifier / Grant number: 12071114
Funding source: Natural Science Foundation of Heilongjiang Province
Award Identifier / Grant number: LH2019A020
Acknowledgements
The GAW19 unrelated data were provided by Type 2 Diabetes Genetic Exploration by Next-generation sequencing in Ethnic Samples (T2D-GENES) Project 1.
- Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
- Research funding: This research was supported by the National Natural Science Foundation of China (Grant No. 12071114, Grant No. 61873087) and the Natural Science Foundation of Heilongjiang Province of China (Grant No. LH2019A020). The Genetic Analysis Workshops are supported by GAW grant R01 GM031575 from the National Institute of General Medical Sciences. Preparation of the Genetic Analysis Workshop 17 Simulated Exome Dataset was supported in part by NIH R01 MH059490 and used sequencing data from the 1000 Genomes Project (http://www.1000genomes.org).
- Declaration of interest: The authors declare that there is no conflict of interest regarding the publication of this paper.
- Web resources: OR-ACAT, https://github.com/cappuccino19/OR-ACAT-CR.git.
© 2023 Walter de Gruyter GmbH, Berlin/Boston
Articles in this issue
- Review
- Mediation analysis method review of high throughput data
- Research Articles
- When is the allele-sharing dissimilarity between two populations exceeded by the allele-sharing dissimilarity of a population with itself?
- Patterns of differential expression by association in omic data using a new measure based on ensemble learning
- Integrated regulatory and metabolic networks of the tumor microenvironment for therapeutic target prioritization
- Randomized singular value decomposition for integrative subtype analysis of ‘omics data’ using non-negative matrix factorization
- A novel hybrid CNN and BiGRU-Attention based deep learning model for protein function prediction
- Accurate and fast small p-value estimation for permutation tests in high-throughput genomic data analysis with the cross-entropy method
- Improving the accuracy and internal consistency of regression-based clustering of high-dimensional datasets
- A Bayesian model to identify multiple expression patterns with simultaneous FDR control for a multi-factor RNA-seq experiment
- A fast and efficient approach for gene-based association studies of ordinal phenotypes
- Software and Application Note
- CAT PETR: a graphical user interface for differential analysis of phosphorylation and expression data