Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions

Luis M. de Campos; Andrés Cano; Javier G. Castellano; Serafín Moral

doi:10.1515/sagmb-2018-0042

Artikel

Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions

Luis M. de Campos , Andrés Cano , Javier G. Castellano und Serafín Moral

Veröffentlicht/Copyright: 1. Mai 2019

Veröffentlicht von

Veröffentlichen auch Sie bei De Gruyter Brill

Manuskript einreichen Informationen für Autor*innen

Aus der Zeitschrift Statistical Applications in Genetics and Molecular Biology Band 18 Heft 3

Abstract

Gene Regulatory Networks (GRNs) are known as the most adequate instrument to provide a clear insight and understanding of the cellular systems. One of the most successful techniques to reconstruct GRNs using gene expression data is Bayesian networks (BN) which have proven to be an ideal approach for heterogeneous data integration in the learning process. Nevertheless, the incorporation of prior knowledge has been achieved by using prior beliefs or by using networks as a starting point in the search process. In this work, the utilization of different kinds of structural restrictions within algorithms for learning BNs from gene expression data is considered. These restrictions will codify prior knowledge, in such a way that a BN should satisfy them. Therefore, one aim of this work is to make a detailed review on the use of prior knowledge and gene expression data to inferring GRNs from BNs, but the major purpose in this paper is to research whether the structural learning algorithms for BNs from expression data can achieve better outcomes exploiting this prior knowledge with the use of structural restrictions. In the experimental study, it is shown that this new way to incorporate prior knowledge leads us to achieve better reverse-engineered networks.

Keywords: Bayesian networks; genetic regulatory networks; microarray data; prior knowledge; structural restrictions

Funding source: Ministerio de Economía y Competitividad

Award Identifier / Grant number: TEC2015-69496-R and TIN2016-77902-C3-2-P

Funding statement: This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Project Projects, Funder Id http://dx.doi.org/10.13039/501100006280, TEC2015-69496-R and TIN2016-77902-C3-2-P.

References

Acid, S. and L. M. de Campos (2003): “Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs,” J. Artif. Intell Res., 18, 445–490.10.1613/jair.1061Suche in Google Scholar

Acid, S., L. M. de Campos, J. M. Fernández-Luna, S. Rodríguez, J. M. Rodríguez and J. L. Salcedo (2004): “A comparison of learning algorithms for bayesian networks: a case study based on data from an emergency medical service,” Artif. Intell. Med., 30, 215–232.10.1016/j.artmed.2003.11.002Suche in Google Scholar PubMed

Acid, S., L. M. de Campos and M. Fernández (2013): “Score-based methods for learning markov boundaries by searching in onstrained spaces,” Data Min. Knowl. Disc., 26, 174–212.10.1007/s10618-011-0247-5Suche in Google Scholar

Aderhold, A., D. Husmeier and M. Grzegorczyk (2014): “Statistical inference of regulatory networks for circadian regulation,” Stat. Appl. Genet. Mo. B., 13, 227–273.10.1515/sagmb-2013-0051Suche in Google Scholar PubMed

Almasri, E., P. Larsen, G. Chen and Y. Dai (2008): “Incorporating literature knowledge in Bayesian network for inferring gene networks with gene expression data,” Lect. Notes Comput. Sc., 4983, 184–195.10.1007/978-3-540-79450-9_18Suche in Google Scholar

Banf, M. and S. Y. Rhee (2017): “Computational inference of gene regulatory networks: Approaches, limitations and opportunities,” Biochim. Biophys. Acta, 1860, 1, 41–52.10.1016/j.bbagrm.2016.09.003Suche in Google Scholar PubMed

Bansal, M., V. Belcastro, A. Ambesi-Impiombato and D. di Bernardo (2007): “How to infer gene networks from expression profiles,” Mol. Syst. Biol., 3, 1. doi:10.1038/msb4100120.Suche in Google Scholar PubMed PubMed Central

Bellman, R. E. (1957): Dynamic programming, Princeton University Press, Princeton, New Jersey.Suche in Google Scholar

Buntine, W. (1991): “Theory refinement on Bayesian networks.” In: D’Ambrosio, Bruce D., Smets, Philippe & Bonissone, Piero P. (eds.), Proceedings Uncertainty in Artificial Intelligence, pp. 52–60. doi:10.1016/B978-1-55860-203-8.50010-3.Suche in Google Scholar

Buntine, W. (1996): “A guide to the literature on learning probabilistic networks from data,” IEEE T. Knowl. Data En., 8, 195–210.10.1109/69.494161Suche in Google Scholar

Chai, L. E., S. K. Loh, S. T. Low, M. S. Mohamad, S. Deris and Z. Zakaria (2014): “A review on the computational approaches for gene regulatory network construction,” Comput. Biol. Med., 48, 55–65.10.1016/j.compbiomed.2014.02.011Suche in Google Scholar PubMed

Chen, G., M. J. Cairelli, H. Kilicoglu, D. Shin and T. C. Rindflesch (2014): “Augmenting microarray data with literature-based knowledge to enhance gene regulatory network inference,” PLoS Comput. Biol., 10, e1003666. doi:10.1371/journal.pcbi.1003666.Suche in Google Scholar

Cheng, J., R. Greiner, J. Kelly, D. Bell and W. Liu (2002): “Learning Bayesian networks from data: An information-theory based approach,” Artif. Intell., 137, 43–90.10.1016/S0004-3702(02)00191-1Suche in Google Scholar

Chickering, D. M. (1995): “A transformational characterization of equivalent Bayesian network structures,” In: Besnard, Philippe & Hanks, Steve (eds.), Proceedings Uncertainty in Artificial Intelligence, pp. 87–98.Suche in Google Scholar

Cho, R. J., M. J. Campbell, E. A. Winzeler, L. Steinmetz, A. Conway, L. Wodicka, T. G. Wolfsberg, A. E. Gabrielian, D. Landsman, D. J. Lockhart and R. W. Davis (1998): “A genome-wide transcriptional analysis of the mitotic cell cycle,” Mol. Cell., 2, 65–73.10.1016/S1097-2765(00)80114-8Suche in Google Scholar

Chow, C. and C. Liu (1968): “Approximating discrete probability distributions with dependence trees,” IEEE T. Inform. Theory, 14, 462–467.10.1109/TIT.1968.1054142Suche in Google Scholar

Cooper, G. F. and E. Herskovits. (1992): “A Bayesian method for the induction of probabilistic networks from data,” Mach. Learn., 9, 309–347.10.1007/BF00994110Suche in Google Scholar

de Campos, L. M. and J. G. Castellano (2007): “Bayesian network learning algorithms using structural restrictions,” Int. J. Approx. Reason., 45, 2, 233–254.10.1016/j.ijar.2006.06.009Suche in Google Scholar

de Campos, L. M. and J. F. Huete (2000): “A new approach for learning belief networks using independence criteria,” Int. J. Approx. Reason., 24, 11–37.10.1016/S0888-613X(99)00042-0Suche in Google Scholar

Djebbari, A. and J. Quackenbush (2008): “Seeded Bayesian networks: constructing genetic networks from microarray data,” BMC Syst. Biol., 2, 57.10.1186/1752-0509-2-57Suche in Google Scholar PubMed PubMed Central

Elvira Consortium (2002): “Elvira: An environment for probabilistic graphical models.” In: Gámez, J. and A. Salmerón, (eds.), Proceedings of the 1st European Workshop on Probabilistics Graphical Models, pp. 222–230.Suche in Google Scholar

Esteves, G. H. and L. F. L. Reis (2018): “A statistical method for measuring activation of gene regulatory networks,” Stat. Appl. Genet. Mo. B., 17, 3. doi:10.1515/sagmb-2016-0059.Suche in Google Scholar PubMed

Friedman, N. (2004): “Inferring cellular networks using probabilistic graphical models,” Science, 303, 5659, 799–805.10.1126/science.1094068Suche in Google Scholar PubMed

Friedman, N., M. Linial, I. Nachman and D. Pe’er (2000): “Using Bayesian networks to analyze expression data,” J. Comput. Biol., 7, 601–620.10.1145/332306.332355Suche in Google Scholar

Gámez, J. A., J. L. Mateo and J. M. Puerta (2011): “Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood,” Data Min. Knowl. Disc., 22, 106–148.10.1007/s10618-010-0178-6Suche in Google Scholar

Gifford, D. K. (2001): “Blazing pathways through genetic mountains,” Science, 293, 2049–2051.10.1126/science.1065113Suche in Google Scholar

Good, I. J. (1965): The estimation of probabilities, The MIT Press, Cambridge, MA.Suche in Google Scholar

Hartemink, A. J.,D. K. Gifford, T. S. Jaakkola and R. A. Young (2001): “Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks,” Pac. Symp. Biocomput., 422–433. DOI: 10.1142/9789812799623_0041.Suche in Google Scholar

Hartemink, A. J., D. K. Gifford, T. S. Jaakkola and R. A. Young (2002a): “Bayesian methods for elucidating genetic regulatory networks,” IEEE Intell. Syst., 17, 37–43.10.1109/MIS.2002.999218Suche in Google Scholar

Hartemink, A. J., D. K. Gifford, T. S. Jaakkola and R. A. Young (2002b): “Combining location and expression data for principled discovery of genetic regulatory network models,” Pac. Symp. Biocomput., 437–449. DOI: 10.1142/9789812799623_0041.Suche in Google Scholar

Heckerman, D., D. Geiger and D. M. Chickering (1995): “Learning Bayesian networks: The combination of knowledge and statistical data,” Mach. Learn., 20, 197–243.10.1016/B978-1-55860-332-5.50042-0Suche in Google Scholar

Hoefsloot, H. C., S. Smit and A. K. Smilde (2008): “A classification model for the leiden proteomics competition,” Stat. Appl. Genet. Mo. B. 7, 2. doi:10.2202/1544-6115.1351.Suche in Google Scholar

Hughes, T. R., M. J. Marton, A. R. Jones, C. J. Roberts, R. Stoughton, C. D. Armour, H. A. Bennett, E. Coffey, H. Dai, Y. D. He, M. J. Kidd, A. M. King, M. R. Meyer, D. Slade, P. Y. Lum, S. B. Stepaniants, D. D. Shoemaker, D. Gachotte, K. Chakraburtty, J. Simon, M. Bard and S. H. Friend (2000): “Functional discovery via a compendium of expression profiles,” Cell, 102, 109–126.10.1016/S0092-8674(00)00015-5Suche in Google Scholar

Imoto, S., T. Higuchi, T. Goto, K. Tashiro, S. Kuhara and S. Miyano (2003a): “Combining microarrays and biological knowledge for estimating gene networks via Bayesian networks,” In: 2nd IEEE Computer Society Bioinformatics Conf., pp. 104–113.Suche in Google Scholar

Imoto, S., S. Kim, T. Goto, S. Miyano, S. Aburatani, K. Tashiro and S. Kuhara (2003b): “Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network,” J. Bioinf. Comput. Biol. 1, 231–252.10.1109/CSB.2002.1039344Suche in Google Scholar

Imoto, S., T. Higuchi, T. Goto, K. Tashiro, S. Kuhara and S. Miyano (2004): “Combining microarrays and biological knowledge for estimating gene networks via Bayesian networks,” J. Bioinf. Comput. Biol., 2, 77–98.10.1109/CSB.2003.1227309Suche in Google Scholar

Isci, S., H. Dogan, C. Ozturk and H. H. Otu (2014): “Bayesian network prior: network analysis of biological data using external knowledge,” Bioinformatics, 30, 860–867.10.1093/bioinformatics/btt643Suche in Google Scholar PubMed PubMed Central

Kanehisa, M., M. Araki, S. Goto, M. Hattori, M. Hirakawa, M. Itoh, T. Katayama, S. Kawashima, S. Okuda, T. Tokimatsu and Y. Yamanishi (2008): “KEGG for linking genomes to life and the environment,” Nucleic Acids Res., 36, 480–484.10.1093/nar/gkm882Suche in Google Scholar PubMed PubMed Central

Kim, H., G. H. Golub and H. Park (2005): “Missing value estimation for DNA microarray gene expression data: local least squares imputation,” Bioinformatics, 21, 187–198.10.1093/bioinformatics/bth499Suche in Google Scholar PubMed

Lam, W. and F. Bacchus (1994): “Learning Bayesian belief networks: An approach based on the MDL principle,” Compu. Intell., 10, 269–293.10.1111/j.1467-8640.1994.tb00166.xSuche in Google Scholar

Larsen, P., E. Almasri, G. Chen and Y. Dai (2007): “A statistical method to incorporate biological knowledge for generating testable novel gene regulatory interactions from microarray experiments,” BMC Bioinformatics, 8, 317. doi:10.1186/1471-2105-8-317.Suche in Google Scholar PubMed PubMed Central

Le Phillip, P., A. Bahl and L. H. Ungar (2004): “Using prior knowledge to improve genetic network reconstruction from microarray data,” In Silico Biology, 4, 335–353.Suche in Google Scholar

Lee, W. P. and W. S. Tzou (2009): “Computational methods for discovering gene networks from expression data,” Brief. Bioinform., 10, 408–423.10.1093/bib/bbp028Suche in Google Scholar PubMed

Li, S., L. Wu and Z. Zhang (2006): “Constructing biological networks through combined literature mining and microarray analysis: a LMMA approach,” Bioinformatics, 22, 2143–2150.10.1093/bioinformatics/btl363Suche in Google Scholar PubMed

Linde, J., S. Schulze, S. G. Henkel and R. Guthke (2015): “Data- and knowledge-based modeling of gene regulatory networks: an update,” Exp. and Clin. Sci., 14, 346–378.Suche in Google Scholar

Markowetz, F. and R. Spang (2007): “Inferring cellular networks – a review,” BMC Bioinformatics, 8, 6. doi:10.1186/1471-2105-8-S6-S5.Suche in Google Scholar PubMed PubMed Central

Mewes, H. W., D. Frishman, U. Güldener, G. Mannhaupt, K. F. X. Mayer, M. Mokrejs, B. Morgenstern, M. Münsterkötter, S. Rudd and B. Weil (2002): “MIPS: a database for genomes and protein sequences,” Nucleic Acids Res., 30, 31–34.10.1093/nar/30.1.31Suche in Google Scholar PubMed PubMed Central

Mukherjee, S. and T. P. Speed (2008): “Network inference using informative priors,” PNAS, 105, 14313–14318.10.1073/pnas.0802272105Suche in Google Scholar PubMed PubMed Central

Nariai, N., S. Kim, S. Imoto and S. Miyano ( 2004): “Using protein-protein interactions for refining gene networks estimated from microarray data by Bayesian networks,” Pac. Symp. Biocomput., 336–347. DOI: 10.1142/9789812704856_0032.Suche in Google Scholar PubMed

Nariai, N., Y. Tamada, S. Imoto and S. Miyano (2005): “Estimating gene regulatory networks and protein–protein interactions of Saccharomyces cerevisiae from multiple genome-wide data,” Bioinformatics, 21, 206–212.10.1093/bioinformatics/bti1133Suche in Google Scholar PubMed

Njah, H. and S. Jamoussi (2015): “Weighted ensemble learning of Bayesian network for gene regulatory networks,” Neurocomputing, 150, 404–416.10.1016/j.neucom.2014.05.078Suche in Google Scholar

Oates, C. J., R. Amos and S. E. Spencer (2014): “Quantifying the multi-scale performance of network inference algorithms,” Stat. Appl. Genet. Mo. B., 13, 611–631.10.1515/sagmb-2014-0012Suche in Google Scholar PubMed

Pearl, J. (1988): Probabilistic reasoning in intelligent systems: Networks of plausible inference, Morgan Kaufmann Publishers Inc., San Francisco (CA).10.1016/B978-0-08-051489-5.50008-4Suche in Google Scholar

Sachs, K., O. Perez, D. Pe’er, D. A. Lauffenburger and G. P. Nolan (2005): “Causal protein-signaling networks derived from multiparameter single-cell data,” Science, 308, 523–529.10.1126/science.1105809Suche in Google Scholar PubMed

Sauta, E., A. Demartini, F. Vitali, A. Riva and & R. Bellazzi (2017): “Data Fusion Approach for Learning Transcriptional Bayesian Networks,” Conf. on Artificial Intelligence in Medicine in Europe, 76–80. DOI: 10.1007/978-3-319-59758-4_8.Suche in Google Scholar

Schlitt, T. and A. Brazma (2007): “Current approaches to gene regulatory network modelling,” BMC Bioinformatics, 8, 6. doi:10.1186/1471-2105-8-S6-S9.Suche in Google Scholar PubMed PubMed Central

Segal, E., H. Wang and D. Koller (2003): “Discovering molecular pathways from protein interaction and gene expression data,” Bioinformatics, 19, i264–i272.10.1093/bioinformatics/btg1037Suche in Google Scholar PubMed

Spellman, P. T., G. Sherlock, M. Q. Zhang, V. R. Iyer, K. Anders, M. B. Eisen, P. O. Brown, D. Botstein and B. Futcher (1998): “Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization,” Mol. Biol. Cell, 9, 3273–3297.10.1091/mbc.9.12.3273Suche in Google Scholar PubMed PubMed Central

Spirtes, P., C. Glymour and R. Scheines (1993): Causation, Prediction, and Search, MIT press, Cambridge, MA.10.1007/978-1-4612-2748-9Suche in Google Scholar

Spirtes, P., C. Glymour, R. Scheines, S. Kauffman, V. Aimale and F. Wimberly (2000): “Constructing Bayesian network models of gene expression networks from microarray data,” In: Proceedings of the Atlantic Symposium on Computational Biology, Genome Information Systems and Technology, doi:10.1184/R1/6491291.v1.Suche in Google Scholar

Spirtes, P., C. Glymour and R. Scheines (2001): Causation, Prediction, and Search, 2nd Edition, MIT Press, Cambridge, MA.10.7551/mitpress/1754.001.0001Suche in Google Scholar

Stacklies, W., H. Redestig, M. Scholz, D. Walther and J. Selbig (2007): “pcaMethods—a bioconductor package providing PCA methods for incomplete data,” Bioinformatics, 23, 1164–1167.10.1093/bioinformatics/btm069Suche in Google Scholar PubMed

Steele, E., A. Tucker, P. Hoen and M. Schuemie (2009): “Literature-based priors for gene regulatory networks,” Bioinformatics, 25, 1768–1774.10.1093/bioinformatics/btp277Suche in Google Scholar PubMed

Styczynski, M. P. and G. Stephanopoulos (2005): “Overview of computational methods for the inference of gene regulatory networks,” Comput. Chem. Eng., 29, 519–534.10.1016/j.compchemeng.2004.08.029Suche in Google Scholar

Tamada, Y., S. Kim, H. Bannai, S. Imoto, K. Tashiro, S. Kuhara and S. Miyano (2003a): “Combining gene expression data with DNA sequence information for estimating gene networks using Bayesian network model,” Genome Inform. Ser., 14, 352–353.Suche in Google Scholar

Tamada, Y., S. Kim, H. Bannai, S. Imoto, K. Tashiro, S. Kuhara and S. Miyano (2003b): “Estimating gene networks from gene expression data by combining Bayesian network model with promoter element detection,” Bioinformatics, 19, 227–236.10.1093/bioinformatics/btg1082Suche in Google Scholar PubMed

Wang, M., Z. Chen and S. Cloutier (2007): “A hybrid Bayesian network learning method for constructing gene networks,” Comput. Biol. Chem., 31, 361–372.10.1016/j.compbiolchem.2007.08.005Suche in Google Scholar PubMed

Wang, Y. R. and H. Huang (2014): “Review on statistical methods for gene network reconstruction using expression data,” J. Theor. Biol., 362, 53–61.10.1016/j.jtbi.2014.03.040Suche in Google Scholar PubMed

Werhli, A. V., M. Grzegorczyk and D. Husmeier (2006): “Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical Gaussian models and Bayesian networks,” Bioinformatics, 22, 2523–2531.10.1093/bioinformatics/btl391Suche in Google Scholar PubMed

Werhli, A. V. and D. Husmeier (2007a): “Reconstructing gene regulatory networks with Bayesian networks by combining expression data with multiple sources of prior knowledge,” Stat. Appl. Genet. Mo. B., 6, 1. doi:10.2202/1544-6115.1282.Suche in Google Scholar PubMed

Werhli, A. V. and D. Husmeier (2007b): “Reverse engineering gene regulatory networks with Bayesian networks from expression data combined with multiple sources of biological prior knowledge,” Lect. N. Bioinformat., 49, 1–2.10.2202/1544-6115.1282Suche in Google Scholar

Wit, E. and J. Mcclure (2004): Statistics for microarrays, John Wiley & Sons, Chichester, UK.10.1002/0470011084Suche in Google Scholar

Yu, J., V. A. Smith, P. P. Wang, A. J. Hartemink and E. D. Jarvis (2004): “Advances to Bayesian network inference for generating causal networks from observational biological data,” Bioinformatics, 20, 3594–3603.10.1093/bioinformatics/bth448Suche in Google Scholar PubMed

Zhou, H. and T. Zheng (2014): “Bayesian hierarchical graph-structured model for pathway analysis using gene expression data,” Stat. Appl. Genet. Mo. B., 12, 393–412.10.1515/sagmb-2013-0011Suche in Google Scholar PubMed

Zhu, J., B. Zhang, E. N. Smith, B. Drees, R. B. Brem, L. Kruglyak, R. E. Bumgarner and E. E. Schadt (2008): “Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks,” Nat. Genet., 40, 854–861.10.1038/ng.167Suche in Google Scholar PubMed PubMed Central

Published Online: 2019-05-01

Sie haben derzeit keinen Zugang zu diesem Inhalt.

Artikel in diesem Heft

https://doi.org/10.1515/sagmb-2018-0042

Schlagwörter für diesen Artikel

Bayesian networks; genetic regulatory networks; microarray data; prior knowledge; structural restrictions