Locating Multiple Interacting Quantitative Trait Loci with the Zero-Inflated Generalized Poisson Regression
Vinzenz Erhardt
We consider the problem of locating multiple interacting quantitative trait loci (QTL) influencing traits measured as counts. In many applications the distribution of the count variable has a spike at zero. Zero-inflated generalized Poisson regression (ZIGPR) allows for an additional probability mass at zero and hence improves the detection of significant loci. Classical model selection criteria often overestimate the number of QTL. For this reason, modified versions of the Bayesian Information Criterion (mBIC and EBIC) have been used successfully for QTL mapping. We apply these criteria based on ZIGPR as well as on simpler models. An extensive simulation study shows that they have good power to detect QTL while controlling the false discovery rate. We illustrate how the inability of the Poisson distribution to account for over-dispersion leads to an overestimation of the number of QTL, which strongly argues against its use for identifying factors influencing count data. The proposed method is used to analyze the mouse gallstone data of Lyons et al. (2003). Our results suggest the existence of a novel QTL on chromosome 4 interacting with another QTL previously identified on chromosome 5. We provide the corresponding code in R.
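To make the role of the extra probability mass at zero concrete, the following minimal Python sketch evaluates a zero-inflated generalized Poisson probability function. It is an illustration only and is not the R code provided with the article. The parameterization (mean parameter `mu`, dispersion parameter `phi`, zero-inflation probability `omega`) and the function names `gp_pmf` and `zigp_pmf` are assumptions chosen for this example; the abstract does not specify them.

```python
import math


def gp_pmf(y, mu, phi):
    """Generalized Poisson probability of count y.

    Assumed parameterization: mean mu > 0 and dispersion phi >= 1,
    so that the variance is phi**2 * mu. phi = 1 gives the Poisson.
    """
    if y < 0:
        return 0.0
    t = mu + (phi - 1.0) * y
    # Work on the log scale to avoid overflow for large counts.
    log_p = (math.log(mu) + (y - 1) * math.log(t)
             - y * math.log(phi) - t / phi - math.lgamma(y + 1))
    return math.exp(log_p)


def zigp_pmf(y, mu, phi, omega):
    """Zero-inflated generalized Poisson probability of count y.

    With probability omega the count is a structural zero; otherwise
    it is drawn from the generalized Poisson distribution above.
    """
    p = (1.0 - omega) * gp_pmf(y, mu, phi)
    return omega + p if y == 0 else p


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    mu, phi, omega = 3.0, 1.5, 0.2
    for y in range(5):
        print(y, round(gp_pmf(y, mu, phi), 4), round(zigp_pmf(y, mu, phi, omega), 4))
```

Under this parameterization the zero-inflated mean is (1 - omega) * mu, and setting phi = 1 and omega = 0 recovers the ordinary Poisson distribution, which has no separate parameter for either over-dispersion or excess zeros.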
©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Articles in this issue
- Article
- Epistatic Interactions
- Testing for Gene-Gene Interaction with AMMI Models
- A Bayesian Hierarchical Model for Quantitative Real-Time PCR Data
- Informative or Noninformative Calls for Gene Expression: A Latent Variable Approach
- Detecting Genotyping Error Using Measures of Degree of Hardy-Weinberg Disequilibrium
- Optimisation of HMM Topologies Enhances DNA and Protein Sequence Modelling
- The Apportionment of Total Genetic Variation by Categorical Analysis of Variance
- Dealing with Heterogeneity between Cohorts in Genomewide SNP Association Studies
- An Empirical Bayesian Method for Estimating Biological Networks from Temporal Microarray Data
- Parameter Estimation in Multiple-Hidden I.I.D. Models from Biological Multiple Alignment
- Asymptotic Distribution of the "Orthogonal" Quantitative Transmission Disequilibrium Test in a Structured Population: Exact Formula
- Comparing Spatial Maps of Human Population-Genetic Variation Using Procrustes Analysis
- An Internal Calibration Method for Protein-Array Studies
- Weighted-LASSO for Structured Network Inference from Time Course Data
- Trilocus Disequilibrium Analysis of Multiallelic Markers in Outcrossing Populations
- Sparse Partial Least Squares Classification for High Dimensional Data
- Reconstructability Analysis as a Tool for Identifying Gene-Gene Interactions in Studies of Human Diseases
- Sub-Modular Resolution Analysis by Network Mixture Models
- Space Oriented Rank-Based Data Integration
- The Generalized Odds Ratio as a Measure of Genetic Risk Effect in the Analysis and Meta-Analysis of Association Studies
- Network Enrichment Analysis in Complex Experiments
- Shrinkage Estimation of Effect Sizes as an Alternative to Hypothesis Testing Followed by Estimation in High-Dimensional Biology: Applications to Differential Gene Expression
- Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data
- A Random Coefficients Model for Regional Co-Expression Associated with DNA Copy Number
- Locating Multiple Interacting Quantitative Trait Loci with the Zero-Inflated Generalized Poisson Regression
- Classification of Genomic Sequences via Wavelet Variance and a Self-Organizing Map with an Application to Mitochondrial DNA
- Confidently Estimating the Number of DNA Replication Origins
- Generalizing Moving Averages for Tiling Arrays Using Combined P-Value Statistics
- Lasso Logistic Regression, GSoft and the Cyclic Coordinate Descent Algorithm: Application to Gene Expression Data
- Granger Causality Analysis of Human Cell-Cycle Gene Expression Profiles
- Mapping Quantitative Trait Loci in a Non-Equilibrium Population
- On the Optimal Design of Genetic Variant Discovery Studies
- On Optimal Selection of Summary Statistics for Approximate Bayesian Computation
- Assessment of LD Matrix Measures for the Analysis of Biological Pathway Association
- Optimal Tests Shrinking Both Means and Variances Applicable to Microarray Data Analysis
- The Detection of Blur in Affymetrix GeneChips
- Regression-Based Multi-Trait QTL Mapping Using a Structural Equation Model
- Spatial Clustering of Array CGH Features in Combination with Hierarchical Multiple Testing
- Predicting Patient Survival from Longitudinal Gene Expression
- Including Probe-Level Measurement Error in Robust Mixture Clustering of Replicated Microarray Gene Expression
- Reader's Reaction
- An Alternative Model of Type A Dependence in a Gene Set of Correlated Genes
- Letter to the Editor
- Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn