Abstract
The increasing availability of ChIP-seq data demands advanced statistical tools to analyze the results of such experiments. The inherent features of high-throughput sequencing output call for a modelling framework that can account for the spatial dependency between neighboring regions of the genome and for the temporal dimension that arises from observing the protein binding process at successive time points. In addition, multiple biological or technical replicates of the experiment are usually produced, and methods to account for them jointly are needed. Furthermore, the antibodies used in the experiment can have different immunoprecipitation efficiencies, which affects the ability to distinguish the true signal in the data from the background noise. The proposed statistical procedure consists of a discrete mixture model with an underlying latent Markov random field. The novelty of the model is that it allows both spatial and temporal dependency to play a role in determining the latent state of the genomic regions involved in the protein binding process, while combining the information from all available replicates instead of treating them separately. It is also possible to take into account the different antibodies used, in order to gain better insight into the process and exploit all the biological information available.
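As a rough illustration of how a mixture model with a latent Markov random field can combine replicates with spatial and temporal neighbours, the sketch below computes the conditional probability that a single genomic region is enriched. It is not the authors' implementation: the Poisson emission distributions, the auto-logistic (Ising-type) prior, the function names, and all parameter values (`lam_bg`, `lam_sig`, `bias`, `coupling`) are assumptions made only for this example.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing count k under a Poisson(lam) distribution."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def enriched_probability(counts, neighbor_states,
                         lam_bg=2.0, lam_sig=10.0,
                         bias=-1.0, coupling=0.8):
    """Conditional probability that one region is enriched (state 1).

    counts          -- read counts for this region, one per replicate,
                       assumed conditionally independent given the state
    neighbor_states -- 0/1 latent states of the neighbouring regions,
                       both spatial (adjacent bins) and temporal
                       (same bin at adjacent time points)
    All default parameter values are hypothetical.
    """
    # Mixture part: log-likelihood of all replicates under each component.
    log_lik_bg = sum(poisson_logpmf(k, lam_bg) for k in counts)
    log_lik_sig = sum(poisson_logpmf(k, lam_sig) for k in counts)

    # Markov random field part: prior log-odds of enrichment grow with
    # the number of enriched neighbours (auto-logistic form).
    prior_log_odds = bias + coupling * sum(neighbor_states)

    log_odds = prior_log_odds + log_lik_sig - log_lik_bg

    # Numerically stable logistic transform.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    z = math.exp(log_odds)
    return z / (1.0 + z)


# Same ambiguous counts in two replicates, different neighbourhoods.
p_isolated = enriched_probability([5, 5], [0, 0, 0])
p_supported = enriched_probability([5, 5], [1, 1, 1])
print(p_isolated < p_supported)
```

With ambiguous counts, the enriched neighbours raise the probability of enrichment, which is the sense in which spatial and temporal dependency inform the latent state. A conditional probability of this form could serve as one update step inside a Gibbs sampler or a similar iterative scheme.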
Supplemental Material
The online version of this article (DOI: 10.1515/sagmb-2014-0074) offers supplementary material, available to authorized users.
©2015 by De Gruyter
Articles in the same Issue
- Frontmatter
- Research Articles
- Study of triplet periodicity differences inside and between genomes
- H-CLAP: hierarchical clustering within a linear array with an application in genetics
- Inferring bi-directional interactions between circadian clock genes and metabolism with model ensembles
- Bayesian inference for Markov jump processes with informative observations
- Likelihood free inference for Markov processes: a comparison
- Spatio-temporal model for multiple ChIP-seq experiments
- Software and Application Note
- GenePEN: analysis of network activity alterations in complex diseases via the pairwise elastic net
- Corrigendum
- Corrigendum to: Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions