Abstract
Biomolecular networks are often assumed to be scale-free hierarchical networks. Weighted gene co-expression network analysis (WGCNA) treats gene co-expression networks as undirected, scale-free, hierarchical weighted networks. The WGCNA R package stores a network as an adjacency matrix, calculates the topological overlap matrix (TOM) from it, and then identifies modules (sub-networks), each of which is assumed to be associated with a particular biological function. The most time-consuming step of WGCNA is calculating the TOM from the adjacency matrix, which is done in a single thread. In this paper, the single-threaded TOM algorithm is replaced with a multi-threaded algorithm that uses the default WGCNA parameters. R calls a C++ function through Rcpp, and the C++ code uses OpenMP to start multiple threads that calculate the TOM from the adjacency matrix. On shared-memory multiprocessor systems, the calculation time decreases as the number of CPU cores increases. The algorithm can promote the application of WGCNA to large data sets and help other research fields identify sub-networks in undirected, scale-free, hierarchical weighted networks. The source code and usage instructions are available at https://github.com/do-somethings-haha/multi-threaded_calculate_unsigned_TOM_from_unsigned_or_signed_Adjacency_Matrix_of_WGCNA.
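To illustrate the kind of computation the abstract describes, the following is a minimal C++ sketch of an unsigned TOM calculated from a symmetric adjacency matrix with an OpenMP-parallel outer loop. It is not the authors' implementation. It assumes the standard unsigned TOM with the "min" denominator, TOM_ij = (sum over u ≠ i, j of a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij) with k_i = sum over u ≠ i of a_iu, which is taken here to correspond to the WGCNA defaults mentioned above. The function name `tom_unsigned` and the flat row-major storage are hypothetical choices made for this example, and the Rcpp wrapper that would expose the function to R is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of WGCNA): unsigned TOM with the "min"
// denominator for a symmetric n x n adjacency matrix with entries in [0, 1],
// stored row-major in a flat vector. Diagonal entries of adj are ignored.
std::vector<double> tom_unsigned(const std::vector<double>& adj, std::size_t n) {
    assert(adj.size() == n * n);
    const long long m = static_cast<long long>(n);
    // Row-major index helper; signed loop counters keep the loop portable
    // across OpenMP implementations.
    auto at = [m](long long r, long long c) {
        return static_cast<std::size_t>(r * m + c);
    };

    // Connectivity k_i: sum of row i without the diagonal entry.
    std::vector<double> k(n, 0.0);
    for (long long i = 0; i < m; ++i)
        for (long long u = 0; u < m; ++u)
            if (u != i) k[static_cast<std::size_t>(i)] += adj[at(i, u)];

    std::vector<double> tom(n * n, 0.0);

    // Each iteration i writes only tom(i, i), tom(i, j) and tom(j, i) for
    // j > i, so no two threads ever write the same element.
    // Dynamic scheduling balances the shrinking inner loops.
    #pragma omp parallel for schedule(dynamic)
    for (long long i = 0; i < m; ++i) {
        tom[at(i, i)] = 1.0;
        for (long long j = i + 1; j < m; ++j) {
            double shared = 0.0;  // sum over u != i, j of a_iu * a_uj
            for (long long u = 0; u < m; ++u)
                if (u != i && u != j) shared += adj[at(i, u)] * adj[at(u, j)];
            const double a = adj[at(i, j)];
            const double k_min = std::min(k[static_cast<std::size_t>(i)],
                                          k[static_cast<std::size_t>(j)]);
            const double value = (shared + a) / (k_min + 1.0 - a);
            tom[at(i, j)] = value;
            tom[at(j, i)] = value;
        }
    }
    return tom;
}
```

With GCC or Clang the loop is parallelized when the file is compiled with `-fopenmp`; without that flag the pragma is ignored and the same code runs in a single thread, which makes it easy to compare the two modes on the same input.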
Research funding: This study was supported by grants from the National Natural Science Foundation of China (81673553), the central-government-level key project "The ability establishment of sustainable use for valuable Chinese medicine resources" (2060302), the Sichuan Science and Technology Program (2021YJ0113), and the Project of Inheritance Studio of National Famous Experts of State Administration of TCM (NO. [2019] 41).
© 2021 Walter de Gruyter GmbH, Berlin/Boston
Articles in this issue
- Frontmatter
- Research Articles
- Batch effect reduction of microarray data with dependent samples using an empirical Bayes approach (BRIDGE)
- Inference of genetic regulatory networks with regulatory hubs using vector autoregressions and automatic relevance determination with model selections
- Software and Application Note
- Optimizing weighted gene co-expression network analysis with a multi-threaded calculation of the topological overlap matrix