A statistical test for detecting parent-of-origin effects when parental information is missing

  • Chiara Sacco, Cinzia Viroli and Mario Falchi
Published/Copyright: September 1, 2017

Abstract

Genomic imprinting is an epigenetic mechanism that leads to differential contributions of maternal and paternal alleles to offspring gene expression in a parent-of-origin manner. We propose a novel test for detecting parent-of-origin effects (POEs) in genome-wide genotype data from related individuals (twins) when the parental origin cannot be inferred. The proposed method exploits a finite mixture of linear mixed models: the key idea is that, in the presence of POEs, the population can be clustered into two groups that differ in the parent from whom the reference allele is inherited. A further advantage of this approach is that it yields an estimate of the parental effect even when the parental information is missing. We also show that the approach is flexible enough to apply to the general scenario of independent data. The performance of the proposed test is evaluated through an extensive simulation study. Finally, the method is applied to known imprinted genes in the MuTHER twin study data.

Appendix: EM algorithm

For brevity, we denote the posterior probability of component $k$ by $\tau_k^{MZ}(y_i)$ for MZ twin pairs and by $\tau_k^{DZ}(y_{ij})$ for DZ twin pairs.

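In the E-step, computing $\mathcal{Q}(\boldsymbol{\theta}, \boldsymbol{\theta}^{(r)})$ requires the conditional variance and the conditional mean of the random effect $u_i$. For an MZ twin pair, the conditional variance is

$$\Sigma_{u_i}^{MZ} = \left(\frac{1}{\tau^2} + \frac{2}{\sigma^2}\right)^{-1} \tag{19}$$

and the conditional mean is given by

$$\mu_{u_i,k}^{MZ} = \Sigma_{u_i}^{MZ}\,\frac{1}{\sigma^2}\,\mathbf{1}_2^{\top}(y_i - \mu_{ik}) \tag{20}$$

where $y_i$ is the $2 \times 1$ vector of observed data for the $i$-th twin pair, $\mu_{ik}$ is the $2 \times 1$ mean vector of the $i$-th twin pair under the $k$-th component, and $\mathbf{1}_2$ is the $2 \times 1$ vector of ones.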

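If $\mathbb{1}_{MZ}(i) = 0$, that is, for a DZ twin pair, the conditional variance is

$$\Sigma_{u_i,k}^{DZ} = \left(\tau^2 + \sigma^2\right)^{-1}\sigma^2\tau^2 \tag{21}$$

and the conditional mean is given by

$$\mu_{u_i,kj}^{DZ} = \left(1 + \frac{\sigma^2}{\tau^2}\right)^{-1}(y_{ij} - \mu_{kij}). \tag{22}$$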
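As an illustration, the conditional moments in (19)-(22) can be computed with one small helper. This is a minimal sketch in plain Python, not the authors' code. It assumes the standard Gaussian random-intercept posterior, in which n residuals y − μ share a single random effect, so that n = 2 corresponds to the MZ case and n = 1 to the DZ case. The function name `random_effect_posterior` and its arguments are hypothetical.

```python
def random_effect_posterior(residuals, sigma2, tau2):
    """Conditional mean and variance of a shared Gaussian random intercept.

    residuals: list of y - mu values that share the same random effect
               (two entries for an MZ pair, one entry for a DZ twin).
    sigma2:    residual variance sigma^2.
    tau2:      random-effect variance tau^2.
    """
    n = len(residuals)
    # Posterior precision is the prior precision plus n times the data precision.
    variance = 1.0 / (1.0 / tau2 + n / sigma2)
    # Posterior mean shrinks the summed residuals towards zero.
    mean = variance * sum(residuals) / sigma2
    return mean, variance


# MZ pair: two residuals share u_i, matching the form of (19) and (20).
mz_mean, mz_var = random_effect_posterior([1.0, 1.0], sigma2=1.0, tau2=1.0)

# DZ twin: a single residual, matching the form of (21) and (22).
dz_mean, dz_var = random_effect_posterior([1.0], sigma2=2.0, tau2=2.0)
```

With n = 1 the variance reduces to σ²τ²/(σ² + τ²) and the mean to (1 + σ²/τ²)⁻¹(y − μ), which is the algebraic form used for the DZ case.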

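The M-step consists of determining the values that maximize equation (12), where

$$\mathbb{E}\left[\ln f(y_i, u_i \mid z_{ik}^{MZ}; \boldsymbol{\theta}_k)\right] \propto -\ln\sigma^2 - \frac{1}{2\sigma^2}\left[(y_i - \mu_{ik} - \mu_{u_i,k}^{MZ})^2 + \Sigma_{u_i,k}^{MZ}\right] - \frac{1}{2}\ln\tau^2 - \frac{1}{2\tau^2}\left((\mu_{u_i,k}^{MZ})^2 + \Sigma_{u_i,k}^{MZ}\right) \tag{23}$$

and

$$\mathbb{E}\left[\ln f(y_{ij}, u_i \mid z_{ijk}^{DZ}; \boldsymbol{\theta}_k)\right] \propto -\frac{1}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\left[(y_{ij} - \mu_{kij} - \mu_{u_i,kj}^{DZ})^2 + \Sigma_{u_i,k}^{DZ}\right] - \frac{1}{2}\ln\tau^2 - \frac{1}{2\tau^2}\left((\mu_{u_i,kj}^{DZ})^2 + \Sigma_{u_i,k}^{DZ}\right). \tag{24}$$

It follows that

$$\mathcal{Q}(\boldsymbol{\theta}, \boldsymbol{\theta}^{(r)}) \propto \sum_{k=1}^{2}\sum_{i=1}^{m}\sum_{j=1}^{2}\left\{-\frac{1}{2}\ln\sigma^2 - \frac{1}{2}\ln\tau^2 - \frac{1}{2}\mathbb{1}_{MZ}(i)\left[\frac{(y_{ij} - \mu_{ijk} - \mu_{u_i,k}^{MZ})^2 + \Sigma_{u_i,k}^{MZ}}{\sigma^2} + \frac{1}{\tau^2}\left((\mu_{u_i,k}^{MZ})^2 + \Sigma_{u_i,k}^{MZ}\right)\right] - \frac{1}{2}\left(1 - \mathbb{1}_{MZ}(i)\right)\left[\frac{(y_{ij} - \mu_{ijk} - \mu_{u_i,kj}^{DZ})^2 + \Sigma_{u_i,k}^{DZ}}{\sigma^2} + \frac{1}{\tau^2}\left((\mu_{u_i,kj}^{DZ})^2 + \Sigma_{u_i,k}^{DZ}\right)\right]\right\}. \tag{25}$$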
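The bracketed terms in (23)-(25) rest on the first two conditional moments of the random effect. As a brief reminder, and writing $\mu_u$ and $\Sigma_u$ as shorthand (introduced here only for illustration) for the conditional mean and variance of $u_i$ given the data and the component, the identities involved are the standard ones for a Gaussian conditional distribution:

```latex
\begin{align*}
\mathbb{E}[u_i \mid y] &= \mu_u, \qquad \operatorname{Var}(u_i \mid y) = \Sigma_u, \\
\mathbb{E}[u_i^2 \mid y] &= \mu_u^2 + \Sigma_u, \\
\mathbb{E}\big[(y_{ij} - \mu_{ijk} - u_i)^2 \mid y\big] &= (y_{ij} - \mu_{ijk} - \mu_u)^2 + \Sigma_u .
\end{align*}
```

The last line follows by expanding the square around $\mu_u$: the cross term vanishes in expectation and the remaining term is the conditional variance.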

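For this model, the parameter updates are available in closed form: they solve the equations obtained by differentiating the expected complete-data log-likelihood (25) with respect to the parameters $\alpha$, $\beta_M$, $\beta_P$, $\boldsymbol{\gamma}$, $\tau^2$ and $\sigma^2$ and setting the derivatives to zero. Thus we obtain:

$$\hat{\alpha} = \frac{1}{N}\sum_{k=1}^{2}\sum_{i=1}^{m}\sum_{j=1}^{2}\tilde{y}_{kij}\left(\mathbb{1}_{MZ}(i)\,\tau_k^{MZ} + \left(1 - \mathbb{1}_{MZ}(i)\right)\tau_k^{DZ}\right) \tag{26}$$

where $N = 2m$ and

$$\tilde{y}_{kij} = \begin{cases} y_{ij} - (\beta_M + \beta_P)\,w_{ij}^{BB} - \beta_P\,w_{ij}^{AB} - X_{ij}\boldsymbol{\gamma} - \mu_{u_i,1} & k = 1 \\ y_{ij} - (\beta_M + \beta_P)\,w_{ij}^{BB} - \beta_M\,w_{ij}^{AB} - X_{ij}\boldsymbol{\gamma} - \mu_{u_i,2} & k = 2, \end{cases} \tag{27}$$

where $\mu_{u_i,k} = \mathbb{1}_{MZ}(i)\,\mu_{u_i,k}^{MZ} + \left(1 - \mathbb{1}_{MZ}(i)\right)\mu_{u_i,kj}^{DZ}$.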
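The intercept update in (26)-(27) amounts to a posterior-weighted average of adjusted responses, which can be sketched as follows. This is an illustrative plain-Python version under stated assumptions, not the authors' implementation: each individual is stored as a dictionary whose keys (`y`, `w_bb`, `w_ab`, `x`, `post`, `mu_u`) are hypothetical names for the response, the two genotype indicators, the covariate row, the two posterior probabilities and the two conditional means of the random effect.

```python
def adjusted_response(y, w_bb, w_ab, x, gamma, beta_m, beta_p, mu_u, k):
    """Adjusted response for one individual under component k (1 or 2), as in (27)."""
    # The heterozygote carries beta_p in component 1 and beta_m in component 2.
    het_effect = beta_p if k == 1 else beta_m
    covariates = sum(x_v * g_v for x_v, g_v in zip(x, gamma))
    return y - (beta_m + beta_p) * w_bb - het_effect * w_ab - covariates - mu_u


def update_alpha(individuals, gamma, beta_m, beta_p):
    """Posterior-weighted mean of the adjusted responses, as in (26)."""
    total = 0.0
    for ind in individuals:
        for k in (1, 2):
            y_tilde = adjusted_response(
                ind["y"], ind["w_bb"], ind["w_ab"], ind["x"],
                gamma, beta_m, beta_p, ind["mu_u"][k - 1], k,
            )
            total += ind["post"][k - 1] * y_tilde
    # N = 2m is the number of individuals (two twins per pair).
    return total / len(individuals)


# Toy data (made up for illustration): one BB homozygote and one heterozygote.
toy = [
    {"y": 5.0, "w_bb": 1, "w_ab": 0, "x": [1.0], "post": (0.3, 0.7), "mu_u": (0.5, 0.5)},
    {"y": 4.0, "w_bb": 0, "w_ab": 1, "x": [0.0], "post": (1.0, 0.0), "mu_u": (0.0, 0.0)},
]
alpha_hat = update_alpha(toy, gamma=[0.5], beta_m=1.0, beta_p=2.0)
```

Only heterozygotes distinguish the two components: for homozygotes the adjusted response is the same under k = 1 and k = 2 whenever the conditional means coincide.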

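The maternal and paternal effects of the “B” allele are, respectively,

$$\hat{\beta}_M = \frac{1}{n_{BB} + \sum_{i=1}^{m}\sum_{j=1}^{2}\tau_2\,w_{ij}^{AB}}\left\{\sum_{k=1}^{2}\sum_{i=1}^{m}\sum_{j=1}^{2} w_{ij}^{BB}\left(y_{ij} - \alpha - X_{ij}\boldsymbol{\gamma} - \beta_P - T_k\right) + \sum_{i=1}^{m}\sum_{j=1}^{2}\tau_2\,w_{ij}^{AB}\left(y_{ij} - \alpha - X_{ij}\boldsymbol{\gamma} - \mu_{u_i,2}\right)\right\} \tag{28}$$

where $n_{BB} = \sum_{i=1}^{m}\sum_{j=1}^{2} w_{ij}^{BB}$, $\tau_k = \mathbb{1}_{MZ}(i)\,\tau_k^{MZ}(y_i) + \left(1 - \mathbb{1}_{MZ}(i)\right)\tau_k^{DZ}(y_{ij})$ and $T_k = \tau_k\,\mu_{u_i,k}$, and

$$\hat{\beta}_P = \frac{1}{n_{BB} + \sum_{i=1}^{m}\sum_{j=1}^{2}\tau_1\,w_{ij}^{AB}}\left\{\sum_{k=1}^{2}\sum_{i=1}^{m}\sum_{j=1}^{2} w_{ij}^{BB}\left(y_{ij} - \alpha - X_{ij}\boldsymbol{\gamma} - \beta_M - T_k\right) + \sum_{i=1}^{m}\sum_{j=1}^{2}\tau_1\,w_{ij}^{AB}\left(y_{ij} - \alpha - X_{ij}\boldsymbol{\gamma} - \mu_{u_i,1}\right)\right\}. \tag{29}$$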

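The covariate coefficients $\boldsymbol{\gamma}$ are

$$\hat{\boldsymbol{\gamma}} = \left(\sum_{i=1}^{m}\sum_{j=1}^{2} X_{ij}^{\top}X_{ij}\right)^{-1}\sum_{k=1}^{2}\sum_{i=1}^{m}\sum_{j=1}^{2}\tau_k\,X_{ij}^{\top}\,\ddot{y}_{kij} \tag{30}$$

where

$$\ddot{y}_{kij} = \begin{cases} y_{ij} - \alpha - (\beta_M + \beta_P)\,w_{ij}^{BB} - \beta_P\,w_{ij}^{AB} - \mu_{u_i,1} & k = 1 \\ y_{ij} - \alpha - (\beta_M + \beta_P)\,w_{ij}^{BB} - \beta_M\,w_{ij}^{AB} - \mu_{u_i,2} & k = 2. \end{cases} \tag{31}$$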

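Finally, the variance parameters of the model are given by

$$\tau^2 = \frac{1}{N}\sum_{k=1}^{2}\sum_{i=1}^{m}\sum_{j=1}^{2}\tau_k\left(\mu_{u_i,k}^2 + \Sigma_{u_i,k}\right) \tag{32}$$

where $\Sigma_{u_i,k} = \mathbb{1}_{MZ}(i)\,\Sigma_{u_i,k}^{MZ} + \left(1 - \mathbb{1}_{MZ}(i)\right)\Sigma_{u_i,k}^{DZ}$, and

$$\sigma^2 = \frac{1}{N}\sum_{k=1}^{2}\sum_{i=1}^{m}\sum_{j=1}^{2}\tau_k\,e_{ijk}^2 \tag{33}$$

with

$$e_{ijk}^2 = \mathbb{1}_{MZ}(i)\left(y_{ij} - \mu_{ijk} - \mu_{u_i,k}^{MZ}\right)^2 + \left(1 - \mathbb{1}_{MZ}(i)\right)\left(y_{ij} - \mu_{ijk} - \mu_{u_i,kj}^{DZ}\right)^2.$$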
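The two variance updates are posterior-weighted averages and can be sketched in a few lines. The version below is a hedged illustration in plain Python, not the authors' code. It assumes one record per individual and component, stored as a tuple `(weight, mu_u, var_u, resid)` with hypothetical field names, where `resid` stands for y − μ. The update for τ² takes the conditional second moment of the random effect to be its squared conditional mean plus its conditional variance (an assumption about how the printed formula should be read), while the update for σ² uses the squared term exactly as printed in (33).

```python
def update_variances(records, n_individuals):
    """Posterior-weighted variance updates in the spirit of (32) and (33).

    records: iterable of (weight, mu_u, var_u, resid) tuples, one per
             individual and mixture component, where weight is the posterior
             probability, mu_u and var_u are the conditional mean and variance
             of the random effect, and resid is y - mu for that component.
    n_individuals: N = 2m, the number of individuals.
    """
    records = list(records)
    # tau^2: weighted average of the conditional second moment of u.
    tau2 = sum(w * (m * m + v) for w, m, v, _ in records) / n_individuals
    # sigma^2: weighted average of the squared residual after removing mu_u.
    sigma2 = sum(w * (r - m) ** 2 for w, m, _, r in records) / n_individuals
    return tau2, sigma2


# Toy example (made-up numbers): one individual with two equally likely components.
toy_records = [(0.5, 1.0, 0.5, 2.0), (0.5, 0.0, 0.5, 1.0)]
tau2_hat, sigma2_hat = update_variances(toy_records, n_individuals=1)
```

Because the posterior weights of each individual sum to one, dividing by the number of individuals normalizes both averages.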

Published Online: 2017-9-1
Published in Print: 2017-9-26

©2017 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on February 7, 2026 from https://www.degruyterbrill.com/document/doi/10.1515/sagmb-2017-0007/html