Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions
Abstract
Gene regulatory networks (GRNs) are widely regarded as a suitable instrument for gaining clear insight into cellular systems. One of the most successful techniques for reconstructing GRNs from gene expression data is the Bayesian network (BN), which has proven well suited to integrating heterogeneous data into the learning process. So far, however, prior knowledge has been incorporated either through prior beliefs or by using networks as a starting point in the search process. This work considers the use of different kinds of structural restrictions within algorithms for learning BNs from gene expression data. These restrictions codify prior knowledge as conditions that the learned BN must satisfy. One aim of this work is therefore to give a detailed review of how prior knowledge and gene expression data have been used to infer GRNs with BNs. The main purpose, however, is to investigate whether structural learning algorithms for BNs can achieve better results from expression data when they exploit this prior knowledge through structural restrictions. The experimental study shows that this new way of incorporating prior knowledge leads to better reverse-engineered networks.
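As a rough illustration of what it means for a BN structure to satisfy structural restrictions, the following minimal sketch checks a candidate network against two kinds of restriction. The restriction types used here (required and forbidden directed edges), the function names, and the gene labels `g1`, `g2`, `g3` are assumptions made for illustration; the abstract does not specify which restrictions or algorithms the paper uses.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # directed edge (regulator, target); hypothetical encoding


def is_acyclic(nodes: Iterable[str], edges: Set[Edge]) -> bool:
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    nodes = list(nodes)
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while frontier:
        node = frontier.pop()
        visited += 1
        # Removing `node` lowers the in-degree of each of its children.
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    frontier.append(child)
    return visited == len(nodes)


def satisfies_restrictions(
    nodes: Iterable[str],
    edges: Iterable[Edge],
    required: Iterable[Edge],
    forbidden: Iterable[Edge],
) -> bool:
    """Check that a candidate structure is a DAG, contains every required
    edge, and contains no forbidden edge (assumed restriction types)."""
    edge_set = set(edges)
    return (
        is_acyclic(nodes, edge_set)
        and set(required) <= edge_set
        and not (set(forbidden) & edge_set)
    )


# Hypothetical prior knowledge: g1 regulates g2; g3 does not regulate g1.
genes = ["g1", "g2", "g3"]
required = {("g1", "g2")}
forbidden = {("g3", "g1")}

ok = satisfies_restrictions(genes, {("g1", "g2"), ("g2", "g3")}, required, forbidden)
missing = satisfies_restrictions(genes, {("g2", "g3")}, required, forbidden)
banned = satisfies_restrictions(genes, {("g1", "g2"), ("g3", "g1")}, required, forbidden)
```

A search procedure could call such a check to discard candidate structures that contradict the prior knowledge, which is one plausible way restrictions can narrow the search space.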
Funding source: Ministerio de Economía y Competitividad
Award Identifier / Grant number: TEC2015-69496-R and TIN2016-77902-C3-2-P
Funding statement: This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by the “Fondo Europeo de Desarrollo Regional” (FEDER) under Projects TEC2015-69496-R and TIN2016-77902-C3-2-P (Funder Id http://dx.doi.org/10.13039/501100006280).
©2019 Walter de Gruyter GmbH, Berlin/Boston