Large-scale Parentage Inference with SNPs: an Efficient Algorithm for Statistical Confidence of Parent Pair Allocations
Eric C. Anderson
Abstract
Advances in genotyping that allow tens of thousands of individuals to be genotyped at a moderate number of single nucleotide polymorphisms (SNPs) permit parentage inference to be pursued on a very large scale. The intergenerational tagging this capacity allows is revolutionizing the management of cultured organisms (cows, salmon, etc.) and is poised to do the same for scientific studies of natural populations. Currently, however, there are no likelihood-based methods of parentage inference which are implemented in a manner that allows them to quickly handle a very large number of potential parents or parent pairs. Here we introduce an efficient likelihood-based method applicable to the specialized case of cultured organisms in which both parents can be reliably sampled. We develop a Markov chain representation for the cumulative number of Mendelian incompatibilities between an offspring and its putative parents and we exploit it to develop a fast algorithm for simulation-based estimates of statistical confidence in SNP-based assignments of offspring to pairs of parents. The method is implemented in the freely available software SNPPIT. We describe the method in detail, then assess its performance in a large simulation study using known allele frequencies at 96 SNPs from ten hatchery salmon populations. The simulations verify that the method is fast and accurate and that 96 well-chosen SNPs can provide sufficient power to identify the correct pair of parents from amongst millions of candidate pairs.
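To make the central idea concrete, the following is a minimal sketch rather than the SNPPIT implementation. It assumes biallelic SNPs with genotypes coded as the count (0, 1 or 2) of one allele, no genotyping error, and independent loci. The function names, the per-locus incompatibility probabilities `p_incompat` and the cap `max_count` are all hypothetical and introduced only for illustration. The first part counts Mendelian incompatibilities in an offspring-mother-father trio. The second part treats the cumulative count across loci as a Markov chain whose state is the number of incompatibilities seen so far, with one extra absorbing state for "more than `max_count`".

```python
from typing import List, Sequence

# Alleles (0 or 1) that a parent with a given genotype can transmit.
# Genotypes are coded as the count of one allele: 0, 1 or 2 (an assumption).
GAMETES = {0: (0,), 1: (0, 1), 2: (1,)}


def is_incompatible(kid: int, ma: int, pa: int) -> bool:
    """True if the offspring genotype cannot arise from this parent pair."""
    possible = {a + b for a in GAMETES[ma] for b in GAMETES[pa]}
    return kid not in possible


def count_incompatibilities(kid: Sequence[int],
                            ma: Sequence[int],
                            pa: Sequence[int]) -> int:
    """Number of loci at which the trio shows a Mendelian incompatibility."""
    return sum(is_incompatible(k, m, p) for k, m, p in zip(kid, ma, pa))


def cumulative_incompat_distribution(p_incompat: Sequence[float],
                                     max_count: int) -> List[float]:
    """Forward pass of a Markov chain over loci.

    State i (0 <= i <= max_count) means "i incompatibilities so far";
    state max_count + 1 is absorbing and means "more than max_count".
    p_incompat[l] is a hypothetical probability that locus l is
    incompatible, with loci assumed independent.
    """
    n_states = max_count + 2
    dist = [1.0] + [0.0] * (n_states - 1)  # start with zero incompatibilities
    for p in p_incompat:
        new = [0.0] * n_states
        for i, mass in enumerate(dist):
            if i == max_count + 1:
                new[i] += mass              # absorbing state keeps its mass
            else:
                new[i] += mass * (1.0 - p)  # locus is compatible
                new[i + 1] += mass * p      # locus is incompatible
        dist = new
    return dist
```

For example, `count_incompatibilities([0, 1, 2], [0, 0, 0], [0, 2, 1])` reports a single incompatibility, at the third locus, where a mother with genotype 0 cannot contribute to an offspring with genotype 2. Capping the state space at `max_count + 1` keeps the cost of the forward pass proportional to the number of loci times the cap, which is one plausible reason such a representation lends itself to fast computation.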
© 2012 This article is a U.S. Government work and not subject to copyright protection in the United States. All foreign rights are reserved by De Gruyter.
Articles in this issue
- Article
- Large-scale Parentage Inference with SNPs: an Efficient Algorithm for Statistical Confidence of Parent Pair Allocations
- ExactDAS: An Exact Test Procedure for the Detection of Differential Alternative Splicing in Microarray Experiments
- Incorporating Genomic Annotation into a Hidden Markov Model for DNA Methylation Tiling Array Data
- Variational Bayes Procedure for Effective Classification of Tumor Type with Microarray Gene Expression Data
- Detecting Differential Expression in RNA-sequence Data Using Quasi-likelihood with Shrunken Dispersion Estimates
- Empirical Bayesian Selection of Hypothesis Testing Procedures for Analysis of Sequence Count Expression Data
- Analyzing Genetic Association Studies with an Extended Propensity Score Approach
- Genotype Copy Number Variations using Gaussian Mixture Models: Theory and Algorithms
- Estimators of the local false discovery rate designed for small numbers of tests
- A PAUC-based Estimation Technique for Disease Classification and Biomarker Selection
- Comparison of Targeted Maximum Likelihood and Shrinkage Estimators of Parameters in Gene Networks
- DNA Pooling and Statistical Tests for the Detection of Single Nucleotide Polymorphisms