Chapter 9 Model selection criteria with bootstrap algorithms: applications in biological networks
Mehmet Ali Kaygusuz and Vilda Purutçuoğlu
Abstract
Model selection methods play a fundamental role in both classical and modern statistical theory. Owing to technological advances, more data are now available in fields such as engineering, medicine and finance. Consequently, estimating high-dimensional graphical models, and selecting the regularization parameter embedded in them, has become increasingly important. Many model selection criteria have been proposed in the literature, including the well-known “Akaike’s information criterion” and “Bayesian information criterion”, as well as extended versions such as the “consistent Akaike information criterion with Fisher information” and “information and complexity selection” methods. In this study, we evaluated the performance of these four criteria for sparse biological networks in combination with bootstrap strategies. Specifically, we applied the nonparametric bootstrap, known as the Efron method, and the Bayesian bootstrap, because in real data the number of observations per genomic random variable is typically very limited. Bootstrapping allowed us to overcome this limitation by augmenting the sample size used in selecting the optimal model. We tested the accuracy of our results on two real and two simulated datasets.
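As a rough illustration of the ingredients named in the abstract, the following Python sketch shows the two resampling schemes and the two classical criteria for a Gaussian graphical model. It is a minimal sketch under stated assumptions, not the chapter's implementation: the function names, the toy data, the Gaussian log-likelihood form, and the two candidate precision matrices are all hypothetical choices made for this example, and the extended criteria (CAICF and the information complexity criterion) are omitted.

```python
import numpy as np

def efron_resample(X, rng):
    """Nonparametric (Efron) bootstrap: draw n rows of X with replacement."""
    n = X.shape[0]
    idx = rng.integers(0, n, size=n)
    return X[idx]

def bayesian_bootstrap_weights(n, rng):
    """Bayesian bootstrap: Dirichlet(1, ..., 1) weights over the n observations."""
    return rng.dirichlet(np.ones(n))

def weighted_cov(X, w):
    """Covariance of the rows of X under observation weights w (summing to 1)."""
    Xc = X - w @ X
    return (Xc * w[:, None]).T @ Xc

def gaussian_loglik(S, Theta, n):
    """Gaussian log-likelihood, up to an additive constant, of precision Theta."""
    _, logdet = np.linalg.slogdet(Theta)
    return 0.5 * n * (logdet - np.trace(S @ Theta))

def aic_bic(S, Theta, n, tol=1e-8):
    """AIC = -2 logL + 2k and BIC = -2 logL + k log n.

    k counts the nonzero entries on and above the diagonal of Theta
    (an assumed parameter count for this sketch).
    """
    k = np.count_nonzero(np.abs(np.triu(Theta)) > tol)
    ll = gaussian_loglik(S, Theta, n)
    return -2.0 * ll + 2.0 * k, -2.0 * ll + k * np.log(n)

# Toy data (assumption): n = 20 observations of p = 4 variables.
rng = np.random.default_rng(0)
n, p = 20, 4
X = rng.normal(size=(n, p))

# Efron bootstrap replicate and its sample covariance.
X_star = efron_resample(X, rng)
S_star = np.cov(X_star, rowvar=False, bias=True)

# Bayesian bootstrap replicate: reweight the observations instead of resampling.
w = bayesian_bootstrap_weights(n, rng)
S_w = weighted_cov(X, w)

# Two hypothetical candidate models: a dense precision matrix and a diagonal one.
theta_dense = np.linalg.inv(S_w + 0.1 * np.eye(p))
theta_diag = np.diag(1.0 / np.diag(S_w))

aic_dense, bic_dense = aic_bic(S_w, theta_dense, n)
aic_diag, bic_diag = aic_bic(S_w, theta_diag, n)
```

In practice the candidate precision matrices would come from a sparse estimator fitted over a grid of regularization parameters, and the criteria would be averaged or compared across many bootstrap replicates; the candidate with the smallest criterion value would be selected.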
Chapters in this book
- Frontmatter I
- Contents V
- List of authors VII
- Chapter 1 Use of digital systems in the design system of photovoltaic solar stations 1
- Chapter 2 Potential wind energy in Turkmenistan 21
- Chapter 3 Potential of using biogas technology in Turkmenistan 31
- Chapter 4 Energy efficiency 45
- Chapter 5 Latent renewable energy in Turkmenistan 57
- Chapter 6 Approximate stochastic simulation algorithms 67
- Chapter 7 The role of supply chain management in the construction industry 95
- Chapter 8 Selection of threshold in binary graphs of biological networks 121
- Chapter 9 Model selection criteria with bootstrap algorithms: applications in biological networks 133
- Chapter 10 Technocracy in Governance: new directions in city functioning and urban planning 149
- Chapter 11 Outlier detection in biomedical data: ECG-focused approaches 161
- Chapter 12 Optimization of debt collection strategies for South African banks with machine learning models 183
- Chapter 13 Performance of six turbulence models in predicting two-phase flow on a hydraulic test bench 209
- Index 231