Abstract
We present a novel method for simultaneous inference and nonparametric clustering of transcriptional dynamics from gene expression data. The method uses gene expression data to infer time-varying transcription factor (TF) profiles and clusters these temporal profiles according to the dynamics they exhibit. We use the latent structure of a factorial hidden Markov model to model the TF profiles as Markov chains, and we cluster these profiles using nonparametric mixture modeling. An efficient Gibbs sampling scheme is proposed for inferring the latent variables and grouping the transcriptional dynamics into an a priori unknown number of clusters. We test our model on simulated data and analyse its performance on two expression datasets: S. cerevisiae cell cycle data and E. coli oxygen starvation response data. Our results show the applicability of the method to genome-wide analysis of expression data.
- 1
The network connectivity matrix is a binary matrix: an entry of 1 in position (l, n) denotes a (directed) edge between node l and node n, and an entry of 0 denotes no edge.
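As a minimal illustration of this encoding, the sketch below builds a binary connectivity matrix from a list of edges. The function name `connectivity_matrix`, the 0-based node indices, the square shape, and the reading of position (l, n) as an edge running from node l to node n are assumptions made for the example, not details taken from the paper.

```python
def connectivity_matrix(edges, n_nodes):
    """Return a binary matrix with a 1 at [l][n] for every edge (l, n).

    Assumption: nodes are indexed from 0 and the edge is read as l -> n.
    """
    matrix = [[0] * n_nodes for _ in range(n_nodes)]
    for l, n in edges:
        matrix[l][n] = 1
    return matrix


# Hypothetical toy network with four nodes and three directed edges.
edges = [(0, 2), (1, 2), (1, 3)]
X = connectivity_matrix(edges, n_nodes=4)
for row in X:
    print(row)
```

Because the edges are directed, the matrix is not symmetric in general: here `X[0][2]` is 1 while `X[2][0]` is 0.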
- 2
We attempted to cluster the inferred transition rates by fitting a mixture of Beta distributions, but the resulting predictor, assessed in terms of the co-occurrence matrix, was hardly better than random.
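One common way to summarise a clustering obtained by sampling is a co-occurrence matrix whose entry (i, j) is the fraction of samples in which items i and j share a cluster. The sketch below assumes that definition; the function name `co_occurrence` and the toy label samples are hypothetical and are not taken from the paper.

```python
def co_occurrence(label_samples):
    """Fraction of samples in which each pair of items shares a cluster label.

    label_samples: list of label vectors, one per sample, all of equal length.
    """
    n_samples = len(label_samples)
    n_items = len(label_samples[0])
    counts = [[0] * n_items for _ in range(n_items)]
    for labels in label_samples:
        for i in range(n_items):
            for j in range(n_items):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_samples for c in row] for row in counts]


# Hypothetical cluster labels for three items across three samples.
samples = [[0, 0, 1], [0, 1, 1], [0, 0, 1]]
C = co_occurrence(samples)
for row in C:
    print(row)
```

A matrix of this kind depends only on which items are grouped together, so it is unchanged if the cluster labels are permuted between samples.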
- 3
©2013 by Walter de Gruyter Berlin Boston