Detecting Sample Misidentifications in Genetic Association Studies
Claus T. Ekstrøm
Genetic association studies require that the genotype data from a given person can be correctly linked to the phenotype data from the same person. However, sample misidentification errors sometimes happen, whereby the link becomes invalid for some of the subjects in a study. This can have substantial consequences in terms of power to detect truly associated variants. In family-based studies, Mendelian inconsistencies can be used to detect sample misidentification. Genome-wide association studies (GWAS), however, typically use unrelated individuals, making error detection more problematic.

Here we present a method for identifying potential sample misidentifications in GWAS and other genetic association studies building on ideas from forensic sciences. A widely used ad-hoc method for error detection is to check if the sex of an individual matches its X-linked genotype. We generalize this idea to less stringent associations between known genotypes and phenotypes, and show that if several known associations are combined, the power to detect misidentifications increases substantially. Individuals with an unlikely set of phenotypes given their genotypes are flagged as potential errors.

We provide analytical and simulation results comparing the odds that the genotype and phenotype are both from the same individual for different numbers of available genotype-phenotype associations and for different information content of the associations. Our method has good sensitivity and specificity with as few as ten moderately informative genotype-phenotype associations. We apply the method to GWAS data from the Danish National Birth Cohort.
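The idea of combining several known genotype-phenotype associations into odds that a genotype and a phenotype record belong to the same person can be illustrated with a small sketch. This is not the article's implementation: it assumes binary phenotypes, independent associations, known penetrances P(trait | genotype), and a mismatched phenotype drawn from a random member of the population. The function name `log_match_odds` and all numbers are hypothetical.

```python
import math

def log_match_odds(genotypes, phenotypes, penetrance, genotype_freqs):
    """Log likelihood ratio: records from the same person vs. a random mismatch.

    Assumes independent associations and binary phenotypes (illustrative only).
    """
    total = 0.0
    for g, y, pen, freqs in zip(genotypes, phenotypes, penetrance, genotype_freqs):
        # Probability of the observed phenotype if the records belong together
        p_match = pen[g] if y == 1 else 1.0 - pen[g]
        # Marginal probability of the trait if the phenotype is from a random person
        p_trait = sum(f * p for f, p in zip(freqs, pen))
        p_mismatch = p_trait if y == 1 else 1.0 - p_trait
        total += math.log(p_match) - math.log(p_mismatch)
    return total

# Ten hypothetical, moderately informative associations (made-up numbers).
K = 10
penetrance = [(0.2, 0.5, 0.8)] * K        # P(trait | 0, 1, 2 risk alleles)
genotype_freqs = [(0.25, 0.5, 0.25)] * K  # Hardy-Weinberg, allele frequency 0.5

genotypes = [2] * K  # individual carries two risk alleles at every marker
consistent = log_match_odds(genotypes, [1] * K, penetrance, genotype_freqs)
inconsistent = log_match_odds(genotypes, [0] * K, penetrance, genotype_freqs)
```

Under these assumptions, a record whose phenotypes agree with its genotypes gets positive log odds and one whose phenotypes all disagree gets negative log odds. Each additional association adds one term to the sum, which is why combining several associations separates matched from mismatched records more clearly than any single one.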
©2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Articles in this issue
- Article
- Exploring Multicollinearity Using a Random Matrix Theory Approach
- The Beta-Binomial SGoF method for multiple dependent tests
- Detecting Sample Misidentifications in Genetic Association Studies
- Borrowing Information Across Genes and Experiments for Improved Error Variance Estimation in Microarray Data Analysis
- Hierarchical Bayes Model for Predicting Effectiveness of HIV Combination Therapies
- The practical effect of batch on genomic prediction
- Normalization, bias correction, and peak calling for ChIP-seq
- Combining Multiple Laser Scans of Spotted Microarrays by Means of a Two-Way ANOVA Model
- Empirical Bayes Interval Estimates that are Conditionally Equal to Unadjusted Confidence Intervals or to Default Prior Credibility Intervals
- Detection of Differentially Expressed Gene Sets in a Partially Paired Microarray Data Set
- Non-Iterative, Regression-Based Estimation of Haplotype Associations with Censored Survival Outcomes
- Graph Selection with GGMselect
- Sample Size Calculations for Designing Clinical Proteomic Profiling Studies Using Mass Spectrometry
- A New Approach for the Joint Analysis of Multiple Chip-Seq Libraries with Application to Histone Modification
- Software Communication
- GENOVA: Gene Overlap Analysis of GWAS Results