Testing for association between ordinal traits and genetic variants in pedigree-structured samples by collapsing and kernel methods

  • Li-Chu Chien
Published/Copyright: September 26, 2023

Abstract

In genome-wide association studies (GWAS), logistic regression is one of the most popular analytic methods for binary traits, and multinomial regression extends binary logistic regression to traits with more than two categories. However, many GWAS methods are limited to binary traits. These methods have often been applied improperly to ordinal traits, which leads to inappropriate type I error rates and poor statistical power. Because suitable analysis methods are lacking, GWAS of ordinal traits is known to be problematic and is gaining attention. In this paper, we develop a general framework for identifying associations between ordinal traits and genetic variants in pedigree-structured samples by collapsing and kernel methods. We use the local odds ratios generalized estimating equations (GEE) technique to account for complicated correlation structures between family members and ordered categorical traits. We adopt a retrospective approach that treats the genetic markers as random variables in order to calculate genetic correlations among markers. The proposed genetic association method accommodates ordinal traits and allows for covariate adjustment. We conduct simulation studies to compare the proposed tests with existing models for analyzing ordered categorical data under various configurations. We illustrate the application of the proposed tests by simultaneously analyzing a family study and a cross-sectional study from the Genetic Analysis Workshop 19 (GAW19) data.
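The contrast between collapsing and kernel statistics mentioned in the abstract can be illustrated with a minimal generic sketch. This is not the test proposed in the paper: it ignores pedigree correlation, the GEE step, and the retrospective variance calculation, and all names (`score_vector`, `burden_statistic`, `kernel_statistic`) and the toy data are hypothetical. It assumes trait residuals have already been obtained from some null model and only shows how the two kinds of statistic combine per-variant scores.

```python
from typing import List, Sequence


def score_vector(genotypes: Sequence[Sequence[float]],
                 residuals: Sequence[float]) -> List[float]:
    """Per-variant scores U_j = sum_i G[i][j] * r[i] (hypothetical helper)."""
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * r for row, r in zip(genotypes, residuals))
        for j in range(n_variants)
    ]


def burden_statistic(genotypes, residuals, weights) -> float:
    """Collapsing-type statistic: square of the weighted sum of scores."""
    u = score_vector(genotypes, residuals)
    return sum(w * uj for w, uj in zip(weights, u)) ** 2


def kernel_statistic(genotypes, residuals, weights) -> float:
    """Linear-kernel-type statistic: sum of squared weighted scores."""
    u = score_vector(genotypes, residuals)
    return sum((w * uj) ** 2 for w, uj in zip(weights, u))


# Toy data (assumed): 4 subjects, 2 variants with opposite effect directions.
G = [[1, 0], [0, 1], [1, 0], [0, 1]]   # minor-allele counts
r = [1, -1, 1, -1]                     # residuals from a hypothetical null model
w = [1, 1]                             # equal variant weights

print(burden_statistic(G, r, w))  # scores +2 and -2 cancel before squaring
print(kernel_statistic(G, r, w))  # scores are squared first, so no cancellation
```

In this toy example the two variants act in opposite directions, so the collapsed score cancels while the kernel-type statistic does not; when effects share a direction, collapsing tends to be the more efficient summary. In practice, significance for either statistic would require a null distribution that accounts for relatedness among pedigree members, which this sketch omits.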


Corresponding author: Li-Chu Chien, Center for Fundamental Science, Kaohsiung Medical University, Kaohsiung, Taiwan, ROC, E-mail:

Award Identifier / Grant number: MOST 110-2118-M-037-001-MY2

Acknowledgments

We thank the GAW19 data provider for their generosity in sharing their data with us. The Genetic Analysis Workshops are supported by NIH grant R01 GM031575. The GAW19 whole genome sequence data were provided by the T2D-GENES Consortium, which is supported by NIH grants U01 DK085524, U01 DK085584, U01 DK085501, U01 DK085526, and U01 DK085545. The other genetic and phenotypic data for GAW19 were provided by the San Antonio Family Heart Study and San Antonio Family Diabetes/Gallbladder study, which are supported by NIH grants P01 HL045222, R01 DK047482, and R01 DK053889. Andrew R. Wood is supported by European Research Council grant SZ-245 50371-GLUCOSEGENES-FP7-IDEAS-ERC. We are grateful to the editor and the referees for their helpful comments and suggestions in improving the paper.

  1. Research ethics: Not applicable.

  2. Author contributions: The author has accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Competing interests: The author states no conflict of interest.

  4. Research funding: This work is supported by grant MOST 110-2118-M-037-001-MY2 of the Ministry of Science and Technology, Taiwan, R.O.C.

  5. Data availability: Not applicable.

Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/ijb-2022-0123).


Received: 2022-09-30
Accepted: 2023-07-28
Published Online: 2023-09-26

© 2023 Walter de Gruyter GmbH, Berlin/Boston
