Statistical Applications in Genetics and Molecular Biology

Issue

Volume 18, Issue 5

October 2019

Contents

Research Articles

Stability selection for lasso, ridge and elastic net implemented with AFT models

October 7, 2019

Md Hasinur Rahaman Khan, Anamika Bhadra, Tamanna Howlader

Article number: 20170001

Instability in model selection is a major concern for data sets containing a large number of covariates. We focus on stability selection, a technique that improves variable selection performance for a range of selection methods by aggregating the results of applying a selection procedure to sub-samples of the data, here in the setting where the observations are subject to right censoring. Accelerated failure time (AFT) models have proved useful in many contexts, including heavy censoring (for example, in cancer survival) and high dimensionality (for example, in microarray data). We implement the stability selection approach using three variable selection techniques (Lasso, ridge regression, and elastic net) applied to censored data using AFT models. We compare the performance of these regularized techniques with and without stability selection in simulation studies and two real data examples: a breast cancer data set and a diffuse large B-cell lymphoma data set. The results suggest that stability selection always gives a stable picture of which variables are selected, and that as the dimension of the data increases, the performance of methods with stability selection improves relative to methods without it, irrespective of the collinearity between the covariates.
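
The subsample-and-aggregate idea described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the article: it uses ordinary uncensored least squares rather than an AFT model (so censoring is ignored entirely), a plain Lasso fitted by proximal gradient descent, half-size subsamples, and a selection threshold of 0.6. The function names `lasso_ista` and `stability_selection` and the simulated data are hypothetical.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Minimise (1/2n)||y - Xb||^2 + lam*||b||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - step * grad
        # Soft-thresholding produces exact zeros for unselected variables.
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return b

def stability_selection(X, y, lam, n_subsamples=50, threshold=0.6, seed=0):
    """Return per-variable selection frequencies and the indices that are
    selected in at least `threshold` of the random half-size subsamples."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        counts += lasso_ista(X[idx], y[idx], lam) != 0
    freq = counts / n_subsamples
    return freq, np.flatnonzero(freq >= threshold)

# Simulated, uncensored toy data (an assumption): only the first three
# covariates carry signal.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 20))
beta = np.zeros(20)
beta[:3] = 3.0
y = X @ beta + 0.5 * rng.standard_normal(100)
freq, selected = stability_selection(X, y, lam=0.3)
```

In this sketch, a variable is kept only if it is chosen repeatedly across subsamples, which is what makes the final set less sensitive to any single fit. Handling right censoring would require replacing `lasso_ista` with a penalised AFT estimator.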

A novel individualized drug repositioning approach for predicting personalized candidate drugs for type 1 diabetes mellitus

July 9, 2019

Hong Zheng

Article number: 20180052

The high cost and high failure rate of drug development motivate the use of drug repositioning in drug discovery. Existing drug repositioning techniques mainly focus on discovering candidate drugs for a class of disease and are not suited to predicting candidate drugs for an individual sample. Type 1 diabetes mellitus (T1DM) is a disorder of glucose homeostasis caused by autoimmune destruction of the pancreatic β-cells. Here, we present a novel single-sample drug repositioning approach for predicting personalized candidate drugs for T1DM. Our method infers drug-disease associations by measuring the similarity between the individualized pathway aberrance induced by the disease and that induced by various drugs, using a Kolmogorov-Smirnov weighted Enrichment Score algorithm. Using this method, we predicted several potential candidate drugs for T1DM. Some of them have been reported for the treatment of diabetes mellitus, and some that are currently indicated for other diseases might be repurposed to treat T1DM. This study conducts drug discovery by detecting the functional connections between disease and drug action on a personalized basis. Our framework provides a rational way to carry out systematic personalized drug discovery for complex diseases and contributes to the future application of customized therapeutic decisions.
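
As a rough illustration of a Kolmogorov-Smirnov weighted enrichment score, the sketch below computes the running-sum statistic in the style used by gene set enrichment analysis. This is an assumption: the abstract does not specify the exact variant the article uses, and the function name `weighted_enrichment_score`, the gene labels, and the scores are hypothetical.

```python
def weighted_enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """GSEA-style weighted KS running-sum statistic.

    ranked_genes: gene identifiers sorted by decreasing score.
    scores:       the matching ranking scores, in the same order.
    gene_set:     identifiers of the set being tested.
    Returns the signed running-sum deviation of largest magnitude.
    """
    members = set(gene_set)
    hits = [g in members for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    # Normaliser so that the weighted hit increments sum to 1.
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_norm == 0:
        raise ValueError("all set members have zero score")
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_norm  # step up on a set member
        else:
            running -= 1.0 / n_miss                 # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best

# Toy ranked list (hypothetical labels and scores).
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = weighted_enrichment_score(genes, scores, {"g1", "g2"})
es_bottom = weighted_enrichment_score(genes, scores, {"g5", "g6"})
```

A set concentrated at the top of the ranking gives a positive score and one concentrated at the bottom gives a negative score, so comparing scores from a disease signature and a drug signature is one plausible way to quantify how similar or opposed they are.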

Clustering methods for single-cell RNA-sequencing expression data: performance evaluation with varying sample sizes and cell compositions

August 14, 2019

Aslı Suner

Article number: 20190004

A number of specialized clustering methods have been developed for the accurate analysis of single-cell RNA-sequencing (scRNA-seq) expression data, and several reports have documented the performance of these methods under different conditions. However, to date, no study has systematically evaluated the performance of these clustering methods while taking into account the sample size and cell composition of a given scRNA-seq dataset. Herein, a comprehensive performance evaluation of 11 selected scRNA-seq clustering methods was performed using synthetic datasets with known sample sizes and numbers of subpopulations, as well as varying levels of transcriptome complexity. The results indicate that the overall performance of the clustering methods under study is highly dependent on the sample size and complexity of the scRNA-seq dataset. In most cases, clustering performance improved as the number of cells in a given expression dataset increased. The findings of this study also highlight the importance of sample size for the successful detection of rare cell subpopulations with an appropriate clustering tool.

Bi-level feature selection in high dimensional AFT models with applications to a genomic study

September 17, 2019

Hailin Huang, Jizi Shangguan, Peifeng Ruan, Hua Liang

Article number: 20190016

We propose a new bi-level feature selection method for high-dimensional accelerated failure time models by reformulating the models as a single index model. The method yields sparse solutions at both the group and individual feature levels and comes with an algorithm that is computationally efficient and easily implemented. We analyze a genomic dataset as an illustration and present a simulation study to show the finite-sample performance of the proposed method.
