Chapter 8 Selection of threshold in binary graphs of biological networks
Vilda Purutçuoğlu and Başak Bahçivancı
Abstract
In recent years, accurately screening genes and their interactions has become increasingly important for personalized medicine. Understanding and detecting gene interactions are pivotal, but discovering these interactions is challenging due to the inherent structural and functional complexities of biological systems. These challenges include the high sparsity of gene interaction networks, the disparity between the number of genes and the number of samples, and the strong correlations among genes. The Gaussian graphical model (GGM) is a fundamental tool used to infer gene interaction networks. It represents relationships between biological entities through an undirected graph, where nodes denote genes and edges indicate conditional dependencies. To construct these graphs, the precision matrix, which contains conditional dependencies, is converted into a binary form, with 0 representing the absence of an edge and 1 indicating the presence of an edge. This conversion requires thresholding, a critical step that affects network construction and subsequent developments. Given the importance of network-based biological research for discovering novel biomarkers and drugs, threshold selection for the precision matrix has become a focal point. Various methods for selecting the optimal threshold are discussed in the literature, including both parametric and nonparametric approaches. For instance, Schneider et al. [4] propose a parametric threshold selection based on data distribution and a nonparametric method using Hill’s estimator for univariate extreme value analysis.
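The thresholding step described in the abstract can be illustrated with a minimal sketch. It assumes an already estimated precision matrix and an arbitrary cutoff of 0.1. The function names `binarize_precision` and `hill_estimator` are hypothetical and chosen only for illustration. The second function shows the textbook form of Hill's estimator computed from the k largest observations. It is not claimed to reproduce the exact procedure of Schneider et al. [4].

```python
import numpy as np


def binarize_precision(theta, threshold):
    """Turn a precision matrix into a 0/1 adjacency matrix.

    An edge (1) is placed wherever the absolute off-diagonal entry
    exceeds the threshold; the diagonal is set to 0 (no self-loops).
    """
    theta = np.asarray(theta, dtype=float)
    adjacency = (np.abs(theta) > threshold).astype(int)
    np.fill_diagonal(adjacency, 0)
    return adjacency


def hill_estimator(values, k):
    """Textbook Hill estimator based on the k largest observations.

    Returns the mean of log X_(i) - log X_(k+1) for i = 1..k, where
    X_(1) >= X_(2) >= ... are the order statistics. All values used
    must be strictly positive.
    """
    x = np.sort(np.asarray(values, dtype=float))[::-1]  # descending order
    if not 0 < k < x.size:
        raise ValueError("k must satisfy 0 < k < number of observations")
    if x[k] <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    return float(np.mean(np.log(x[:k]) - np.log(x[k])))


# Illustrative 3-gene precision matrix (made-up numbers, not real data).
theta = np.array([
    [2.00, -0.60, 0.05],
    [-0.60, 1.50, 0.00],
    [0.05, 0.00, 1.00],
])

# Assumed cutoff; choosing this value is the problem the chapter addresses.
adjacency = binarize_precision(theta, threshold=0.1)

# Hill estimate on a small illustrative sample with k = 2.
sample = np.exp([0.0, 1.0, 2.0, 3.0])
gamma_hat = hill_estimator(sample, k=2)
```

In this toy example only the first two genes are connected, because the entry 0.05 falls below the assumed cutoff. Raising or lowering the cutoff changes the resulting graph, which is why the choice of threshold matters for network construction.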
Chapters in this book
- Frontmatter
- Contents
- List of authors
- Chapter 1 Use of digital systems in the design system of photovoltaic solar stations
- Chapter 2 Potential wind energy in Turkmenistan
- Chapter 3 Potential of using biogas technology in Turkmenistan
- Chapter 4 Energy efficiency
- Chapter 5 Latent renewable energy in Turkmenistan
- Chapter 6 Approximate stochastic simulation algorithms
- Chapter 7 The role of supply chain management in the construction industry
- Chapter 8 Selection of threshold in binary graphs of biological networks
- Chapter 9 Model selection criteria with bootstrap algorithms: applications in biological networks
- Chapter 10 Technocracy in Governance: new directions in city functioning and urban planning
- Chapter 11 Outlier detection in biomedical data: ECG-focused approaches
- Chapter 12 Optimization of debt collection strategies for South African banks with machine learning models
- Chapter 13 Performance of six turbulence models in predicting two-phase flow on a hydraulic test bench
- Index