Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions

Luis M. de Campos; Andrés Cano; Javier G. Castellano; Serafín Moral

doi:10.1515/sagmb-2018-0042

Published/Copyright: May 1, 2019

From the journal Statistical Applications in Genetics and Molecular Biology, Volume 18, Issue 3

Abstract

Gene regulatory networks (GRNs) are regarded as the most adequate instrument for providing clear insight into, and understanding of, cellular systems. One of the most successful techniques for reconstructing GRNs from gene expression data is the Bayesian network (BN), which has proven to be an ideal approach for integrating heterogeneous data in the learning process. So far, however, prior knowledge has been incorporated either through prior beliefs or by using networks as a starting point for the search process. This work considers the use of different kinds of structural restrictions within algorithms for learning BNs from gene expression data. These restrictions codify prior knowledge in such a way that a BN should satisfy them. One aim of this work is therefore to give a detailed review of the use of prior knowledge and gene expression data for inferring GRNs with BNs; the major purpose, however, is to investigate whether structural learning algorithms for BNs can achieve better outcomes on expression data by exploiting this prior knowledge through structural restrictions. The experimental study shows that this new way of incorporating prior knowledge leads to better reverse-engineered networks.
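
As a rough illustration of what it means for a BN structure to satisfy restrictions that codify prior knowledge, the following minimal Python sketch checks a candidate graph against three kinds of restriction. The abstract does not enumerate the kinds of restriction used, so the three shown here (required arcs, forbidden arcs, and precedence pairs read as "no directed path from the later node back to the earlier one") are assumptions made for illustration, as are the function names and the gene labels. This is not the authors' implementation.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Arc = Tuple[str, str]  # (parent, child); hypothetical representation


def reachable(children: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every node reachable from `start` by a directed path of length >= 1."""
    seen: Set[str] = set()
    stack = list(children[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return seen


def satisfies_restrictions(
    nodes: Sequence[str],
    arcs: Iterable[Arc],
    required: Iterable[Arc] = (),
    forbidden: Iterable[Arc] = (),
    precedes: Iterable[Arc] = (),
) -> bool:
    """Check a candidate structure against illustrative structural restrictions.

    Assumes every arc endpoint appears in `nodes`.
    required:  arcs that must be present (assumed restriction kind).
    forbidden: arcs that must be absent (assumed restriction kind).
    precedes:  pairs (a, b) meaning a comes before b, interpreted here as
               "there is no directed path from b to a" (assumed reading).
    """
    arc_set = set(arcs)
    children: Dict[str, List[str]] = {n: [] for n in nodes}
    for parent, child in arc_set:
        children[parent].append(child)

    # A BN structure must be a DAG: no node may reach itself.
    if any(n in reachable(children, n) for n in nodes):
        return False
    if not set(required) <= arc_set:
        return False
    if arc_set & set(forbidden):
        return False
    return all(a not in reachable(children, b) for a, b in precedes)


# Hypothetical three-gene example: g1 -> g2 -> g3.
genes = ["g1", "g2", "g3"]
candidate = [("g1", "g2"), ("g2", "g3")]
ok = satisfies_restrictions(
    genes, candidate,
    required=[("g1", "g2")],
    forbidden=[("g3", "g1")],
    precedes=[("g1", "g3")],
)
```

A search procedure could call a check of this kind to discard candidate structures that conflict with the prior knowledge. How the restrictions are actually enforced during the search in the paper is not described in the abstract.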

Keywords: Bayesian networks; genetic regulatory networks; microarray data; prior knowledge; structural restrictions

Funding source: Ministerio de Economía y Competitividad

Award Identifier / Grant number: TEC2015-69496-R and TIN2016-77902-C3-2-P

Funding statement: This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Projects TEC2015-69496-R and TIN2016-77902-C3-2-P (Funder Id http://dx.doi.org/10.13039/501100006280).

Published Online: 2019-05-01
