DNA Pooling and Statistical Tests for the Detection of Single Nucleotide Polymorphisms

David M. Ramsey; Andreas Futschik

doi:10.1515/1544-6115.1763

Published/Copyright: September 25, 2012

From the journal Statistical Applications in Genetics and Molecular Biology, Volume 11, Issue 5

Abstract

The development of next-generation genome sequencers makes it possible to learn more about the genetic make-up of human and other populations. One important question concerns the location of sites at which variation occurs within a population. Our focus is on the detection of rare variants. Such variants will often not be present in smaller samples and are hard to distinguish from sequencing errors in larger samples. This is particularly true for pooled samples, which are often used as part of a cost-saving strategy. This article therefore considers experiments that involve DNA pooling. We derive experimental designs that optimize the power of statistical tests for detecting single nucleotide polymorphisms (SNPs, sites at which there is variation within a population). We also present a new, simple test that calls a SNP if the maximum number of reads of a prospective variant across lanes exceeds a certain threshold. The value of this threshold is determined by the number of available lanes, the parameters of the genome sequencer, and a specified probability of accepting that there is variation at a site when no variation is present. On the basis of this test, we derive pool sizes that are optimal for the detection of rare variants. This test is compared with a likelihood ratio test, which takes into account the number of reads of a prospective variant from all the lanes. It is shown that the threshold-based rule achieves power comparable to that of the likelihood ratio test and may well be a useful tool in determining near-optimal pool sizes for the detection of rare alleles in practical applications.

Keywords: genome sequencing; optimal DNA pooling; statistical inference; single nucleotide polymorphism
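
The threshold rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' error model, so the following rests on simplifying assumptions that are not taken from the article: every lane yields the same number of reads at the site, each read shows the prospective variant through sequencing error independently with a fixed probability, and lanes are independent. Under these assumptions the variant read count per lane is binomial when no variation is present, and the threshold is the smallest value for which the probability that the maximum count exceeds it is at most the specified false positive probability. The function and parameter names (`max_read_threshold`, `reads_per_lane`, `error_rate`, `alpha`) are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def max_read_threshold(n_lanes, reads_per_lane, error_rate, alpha):
    """Smallest t with P(max variant count over lanes > t | no variation) <= alpha.

    Assumption (not from the article): per-lane counts are i.i.d.
    Binomial(reads_per_lane, error_rate) when the site is not variable.
    """
    for t in range(reads_per_lane + 1):
        # P(max > t) = 1 - P(every lane has count <= t)
        p_exceed = 1.0 - binom_cdf(t, reads_per_lane, error_rate) ** n_lanes
        if p_exceed <= alpha:
            return t
    return reads_per_lane


def call_snp(variant_reads_per_lane, threshold):
    """Call a SNP if the maximum variant read count across lanes exceeds the threshold."""
    return max(variant_reads_per_lane) > threshold


# Illustrative values only; they are not taken from the article.
t = max_read_threshold(n_lanes=8, reads_per_lane=100, error_rate=0.01, alpha=0.05)
print("threshold:", t)
print("SNP called:", call_snp([0, 1, 2, 0, 1, t + 1, 0, 1], t))
```

In this sketch the threshold grows with the number of lanes, because taking a maximum over more lanes makes a large error-induced count more likely. The likelihood ratio test mentioned in the abstract uses the counts from all lanes rather than only the maximum and is not sketched here.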

Published Online: 2012-9-25


