Abstract
The ENCODE project has funded the generation of a diverse collection of methylation profiles using reduced representation bisulfite sequencing (RRBS) technology, enabling the analysis of epigenetic variation on a genomic scale at single-site resolution. A standard application of RRBS experiments is locating differentially methylated regions (DMRs) between two sets of samples. Despite numerous publications reporting DMRs identified from RRBS datasets, there has been no formal analysis of how experimental and biological factors affect the performance of existing or newly developed analytical methods. These factors include variable read coverage, differing group sample sizes across genomic regions, uneven spacing between CpG dinucleotide sites, and correlation in methylation levels among sites in close proximity. To better understand the interplay among technical and biological variables in the analysis of RRBS methylation profiles, we have developed an algorithm for generating experimentally realistic RRBS datasets. Applying insights derived from our simulation studies, we present a novel procedure that can identify DMRs spanning as few as three CpG sites with both high sensitivity and specificity. Using RRBS data from muscle vs. non-muscle cell cultures as an example, we demonstrate that our method reveals many more DMRs likely to be of biological significance than previous methods do.
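The factors named in the abstract (variable read coverage, uneven CpG spacing, and correlation among nearby sites) can be illustrated with a toy simulation. The sketch below is not the authors' algorithm; the function name `simulate_rrbs_region`, every parameter, and the distributional choices (exponential gaps, a distance-decaying autoregressive process on the logit scale, binomial read counts) are assumptions made only for illustration.

```python
import math
import random


def simulate_rrbs_region(n_sites=20, n_samples=4, mean_gap=30.0,
                         decay=100.0, base_level=0.7, missing_rate=0.1,
                         mean_coverage=20.0, seed=0):
    """Toy RRBS-like data: returns (positions, samples).

    samples[s][i] is a (methylated_reads, total_reads) pair for sample s
    at CpG site i. All modelling choices here are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Uneven spacing: gaps between consecutive CpG sites are exponential.
    positions = []
    pos = 0
    for _ in range(n_sites):
        pos += 1 + int(rng.expovariate(1.0 / mean_gap))
        positions.append(pos)

    base_logit = math.log(base_level / (1.0 - base_level))
    samples = []
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)  # latent deviation at the first site
        counts = []
        for i, p in enumerate(positions):
            if i > 0:
                # Correlation between neighbouring sites decays with distance.
                rho = math.exp(-(p - positions[i - 1]) / decay)
                z = rho * z + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
            prob = 1.0 / (1.0 + math.exp(-(base_logit + z)))

            # Variable coverage: some sites are unobserved (0 reads).
            if rng.random() < missing_rate:
                coverage = 0
            else:
                coverage = 1 + int(rng.expovariate(1.0 / mean_coverage))

            # Binomial draw of methylated reads given the coverage.
            methylated = sum(rng.random() < prob for _ in range(coverage))
            counts.append((methylated, coverage))
        samples.append(counts)
    return positions, samples


if __name__ == "__main__":
    positions, samples = simulate_rrbs_region(n_sites=5, n_samples=2, seed=1)
    for site_index, position in enumerate(positions):
        print(position, [sample[site_index] for sample in samples])
```

Sites with zero reads in some samples reproduce, in a simplified way, the situation where the number of usable samples per group differs from one genomic region to the next.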
©2013 by Walter de Gruyter Berlin Boston
Articles in this issue
- Masthead
- Research Articles
- A new variance stabilizing transformation for gene expression data analysis
- Kernel approximate Bayesian computation in population genetic inferences
- Permutation tests for analyzing cospeciation in multiple phylogenies: applications in tri-trophic ecology
- Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies
- Modeling, simulation and analysis of methylation profiles from reduced representation bisulfite sequencing experiments
- Estimation of weighted log partial area under the ROC curve and its application to MicroRNA expression data
- Random forests on distance matrices for imaging genetics studies