How to analyze many contingency tables simultaneously in genetic association studies
Thorsten Dickhaus
Abstract
We study exact tests for (2 x 2) and (2 x 3) contingency tables, in particular exact chi-squared tests and exact tests of Fisher type. In practice, these tests are typically carried out without randomization, which yields reproducible results but does not exhaust the significance level. We discuss how this can lead to methodological and practical issues in a multiple testing framework when many tables are under consideration simultaneously, as in genetic association studies. Realized randomized p-values are proposed as a solution that is especially useful for data-adaptive (plug-in) procedures. These p-values allow the proportion of true null hypotheses to be estimated much more accurately than their non-randomized counterparts do. Moreover, we address the problem of positively correlated p-values for association by considering techniques that reduce multiplicity by estimating the "effective number of tests" from the correlation structure. An algorithm is provided that bundles all these aspects, efficient computer implementations are made available, a small-scale simulation study is presented, and two real data examples are shown.
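To make the idea of a realized randomized p-value concrete, the following minimal sketch computes both the usual and the randomized p-value of a two-sided Fisher-type exact test for a single (2 x 2) table. It is an illustration under stated assumptions, not the implementation made available with the article: the function names `hypergeom_pmf` and `fisher_pvalues` are hypothetical, tables are ordered by their null probability, and the randomized p-value is taken as P(more extreme) + U * P(equally extreme), where U is uniform on [0, 1] and independent of the data.

```python
import math
import random


def hypergeom_pmf(k, row1, row2, col1):
    """Null probability that the top-left cell equals k, given fixed margins."""
    total = row1 + row2
    return math.comb(row1, k) * math.comb(row2, col1 - k) / math.comb(total, col1)


def fisher_pvalues(table, u=None, rng=random):
    """Return (non_randomized_p, randomized_p) for a 2 x 2 table.

    Assumption of this sketch: a table counts as "more extreme" when its
    null probability is strictly smaller than that of the observed table.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = hypergeom_pmf(a, row1, row2, col1)
    if u is None:
        u = rng.random()  # the realized uniform variate
    strictly_less = 0.0  # mass of tables less probable than the observed one
    tied = 0.0           # mass of tables as probable as the observed one
    for k in range(lowest, highest + 1):
        p = hypergeom_pmf(k, row1, row2, col1)
        if p < p_obs * (1 - 1e-9):
            strictly_less += p
        elif p <= p_obs * (1 + 1e-9):
            tied += p
    non_randomized = strictly_less + tied
    randomized = strictly_less + u * tied
    return non_randomized, randomized


if __name__ == "__main__":
    # Hypothetical table; u is fixed here only to make the run reproducible.
    print(fisher_pvalues([[3, 1], [1, 3]], u=0.5))
```

With U = 1 the randomized p-value coincides with the non-randomized one; for U < 1 it is smaller. Averaged over U, this removes the conservativeness that comes from the discreteness of the test statistic.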
©2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Articles in the same Issue
- Article
- A New Explained-Variance Based Genetic Risk Score for Predictive Modeling of Disease Risk
- Hessian Calculation for Phylogenetic Likelihood based on the Pruning Algorithm and its Applications
- Cluster-Localized Sparse Logistic Regression for SNP Data
- How to analyze many contingency tables simultaneously in genetic association studies
- Incorporating the Empirical Null Hypothesis into the Benjamini-Hochberg Procedure
- Estimating the Number of One-step Beneficial Mutations
- Testing clonality of three and more tumors using their loss of heterozygosity profiles
- Correction for Founder Effects in Host-Viral Association Studies via Principal Components
- A Non-Homogeneous Dynamic Bayesian Network with Sequentially Coupled Interaction Parameters for Applications in Systems and Synthetic Biology
- An Integrated Hierarchical Bayesian Model for Multivariate eQTL Mapping
- A Novel and Fast Normalization Method for High-Density Arrays
- Performance of MAX Test and Degree of Dominance Index in Predicting the Mode of Inheritance
- A Bayesian autoregressive three-state hidden Markov model for identifying switching monotonic regimes in Microarray time course data
- QTL Mapping Using a Memetic Algorithm with Modifications of BIC as Fitness Function
- Computing Posterior Probabilities for Score-based Alignments Using ppALIGN