Bayesian Sparsity-Path-Analysis of Genetic Association Signal using Generalized t Priors
Anthony Lee
We explore the use of generalized t priors on regression coefficients to help understand the nature of association signal within hit regions of genome-wide association studies. The particular generalized t distribution we adopt is a Student distribution on the absolute value of its argument. For low degrees of freedom, we show that the generalized t exhibits sparsity-prior properties with some attractive features over other common forms of sparse priors, and that it includes the well-known double-exponential distribution as the limiting case when the degrees of freedom tend to infinity. We pay particular attention to graphical representations of posterior statistics obtained from sparsity-path-analysis (SPA), in which we sweep over the setting of the scale (shrinkage/precision) parameter in the prior to explore the space of posterior models obtained over a range of complexities: from very sparse models, with all coefficient distributions heavily concentrated around zero, to models with diffuse priors and coefficients distributed around their maximum likelihood estimates. The SPA plots are akin to LASSO plots of maximum a posteriori (MAP) estimates, but they characterise the complete marginal posterior distributions of the coefficients, plotted as a function of the precision of the prior. Generating posterior distributions over a range of prior precisions is computationally challenging but naturally amenable to sequential Monte Carlo (SMC) algorithms indexed on the scale parameter. We show how SMC simulation on graphics processing units (GPUs) provides very efficient inference for SPA. We also present a scale-mixture representation of the generalized t prior that leads to an expectation-maximization (EM) algorithm for obtaining MAP estimates, should only these be required.
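The limiting behaviour described in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes one common parametrization of a Student-type density on the absolute value, p(β) = (1/(2c)) (1 + |β|/(ac))^(-(a+1)), where `a` (degrees of freedom) and `c` (scale) are names chosen here for illustration. Under that assumption it checks that the density integrates to one, that it approaches the double-exponential density (1/(2c)) exp(-|β|/c) as `a` grows, and that for small `a` its tails are heavier than the double-exponential tails.

```python
import numpy as np


def gen_t_logpdf(beta, a, c):
    """Log-density of the ASSUMED form (1/(2c)) * (1 + |beta|/(a*c))**(-(a+1)).

    a: degrees of freedom (> 0), c: scale (> 0). Hypothetical names.
    """
    beta = np.asarray(beta, dtype=float)
    return -np.log(2.0 * c) - (a + 1.0) * np.log1p(np.abs(beta) / (a * c))


def laplace_logpdf(beta, c):
    """Log-density of the double-exponential (Laplace) distribution with scale c."""
    beta = np.asarray(beta, dtype=float)
    return -np.log(2.0 * c) - np.abs(beta) / c


def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


# 1) Normalization check for a low degrees-of-freedom setting.
grid = np.linspace(-2000.0, 2000.0, 400001)
mass = trapezoid(np.exp(gen_t_logpdf(grid, a=3.0, c=1.0)), grid)

# 2) Large degrees of freedom: the log-density is close to the Laplace log-density.
points = np.array([-2.0, 0.0, 0.5, 3.0])
gap = float(np.max(np.abs(gen_t_logpdf(points, a=1e6, c=1.0)
                          - laplace_logpdf(points, c=1.0))))

# 3) Low degrees of freedom: heavier tail than Laplace at the same scale.
heavier_tail = bool(gen_t_logpdf(10.0, a=1.0, c=1.0) > laplace_logpdf(10.0, c=1.0))

print(f"mass ~ {mass:.4f}, max log-density gap at a=1e6: {gap:.2e}, "
      f"heavier tail at a=1: {heavier_tail}")
```

Working on the log scale with `log1p` keeps the evaluation stable when |β|/(ac) is tiny, which is exactly the regime that matters for the large-`a` limit.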
©2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Articles in the same Issue
- Editorial Introduction
- Special Issue on Computational Statistical Methods for Genomics and Systems Biology
- Article
- A Generalized Hidden Markov Model for Determining Sequence-based Predictors of Nucleosome Positioning
- Gene Filtering in the Analysis of Illumina Microarray Experiments
- Principal Components of Heritability for High Dimension Quantitative Traits and General Pedigrees
- Bayesian Sparsity-Path-Analysis of Genetic Association Signal using Generalized t Priors
- A Family-Based Probabilistic Method for Capturing De Novo Mutations from High-Throughput Short-Read Sequencing Data
- Adjusting for Spurious Gene-by-Environment Interaction Using Case-Parent Triads
- Querying Genomic Databases: Refining the Connectivity Map
- A Model-Based Analysis to Infer the Functional Content of a Gene List
- Candidate Pathway Based Analysis for Cleft Lip with or without Cleft Palate
- Improving Pedigree-based Linkage Analysis by Estimating Coancestry Among Families