Accounting for Dependence in Similarity Data from DNA Fingerprinting

Graham Hepworth; Ian R Gordon; Michael J McCullough

doi:10.2202/1544-6115.1212


Published/Copyright: January 15, 2007


From the journal Statistical Applications in Genetics and Molecular Biology Volume 6 Issue 1


Differentiating strains of a pathogen is often central to investigating its epidemiological aspects. The genetic similarity of a group of strains can be assessed by calculating a matrix of dissimilarities from their DNA fingerprinting profiles. The mean dissimilarity for each strain across other strains within the group is then used as an observation in a statistical analysis. These observations are not independent of each other, and so standard analysis techniques such as the t-test are inappropriate, because they underestimate the variance of the group means, and hence overstate the statistical significance of any differences. By examining the correlation between elements of the dissimilarity matrix, it is shown that the variance is underestimated by a factor of between about 2 and 4. Permutation tests are proposed as a way of addressing the problem of dependence, and are applied to a study of fluconazole resistance in Candida albicans.
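The idea of a permutation test on dissimilarity data can be illustrated with a minimal sketch. The abstract does not spell out the permutation scheme used in the paper, so everything below is an assumption made for illustration: two groups of strains, a test statistic equal to the difference in mean within-group dissimilarity, and relabelling of whole strains (so that each permutation carries the full dependence structure of the matrix with it). The function names, the toy data, and the use of Euclidean distance as a stand-in for a fingerprint dissimilarity are all hypothetical.

```python
import numpy as np

def mean_within_group_dissimilarity(D, idx):
    """Average off-diagonal dissimilarity among the strains indexed by idx.

    This equals the mean of the per-strain mean dissimilarities within the
    group, because every strain is compared with the same k - 1 others.
    """
    sub = D[np.ix_(idx, idx)]
    k = len(idx)
    return (sub.sum() - np.trace(sub)) / (k * (k - 1))

def permutation_test(D, labels, n_perm=9999, seed=0):
    """Two-sided permutation test for a difference in within-group mean
    dissimilarity between two groups (illustrative scheme, not the paper's)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two groups")

    def stat(lab):
        a = np.flatnonzero(lab == groups[0])
        b = np.flatnonzero(lab == groups[1])
        return (mean_within_group_dissimilarity(D, a)
                - mean_within_group_dissimilarity(D, b))

    observed = stat(labels)
    tol = 1e-12  # guard against floating-point ties
    exceed = 0
    for _ in range(n_perm):
        # Shuffle strain labels; the dissimilarity matrix itself is untouched.
        if abs(stat(rng.permutation(labels))) >= abs(observed) - tol:
            exceed += 1
    # Count the observed labelling as one of the permutations.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value

# Toy data: 12 hypothetical strains described by 5 made-up features.
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 5))
X[6:] *= 2.0  # make the second group more spread out
D_toy = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
labels_toy = np.array([0] * 6 + [1] * 6)
observed_toy, p_toy = permutation_test(D_toy, labels_toy, n_perm=999)
```

Because the labels are shuffled at the level of strains rather than individual matrix entries, no independence between the per-strain means is assumed, which is the property a t-test on those means would lack.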

Keywords: Candida albicans; dependence; DNA fingerprinting; permutation test; similarity; variance inflation factor

Published Online: 2007-1-15


Articles in the same Issue

Accounting for Dependence in Similarity Data from DNA Fingerprinting
Normalization of Dye Bias in Microarray Data Using the Mixture of Splines Model
A Generalized Sidak-Holm Procedure and Control of Generalized Error Rates under Independence
Using Duplicate Genotyped Data in Genetic Analyses: Testing Association and Estimating Error Rates
Likelihood-Based Inference for Multi-Color Optical Mapping
Sparse Logistic Regression with Lp Penalty for Biomarker Identification
Super Learning: An Application to the Prediction of HIV-1 Drug Resistance
Supervised Detection of Conserved Motifs in DNA Sequences with Cosmo
Accurate Ranking of Differentially Expressed Genes by a Distribution-Free Shrinkage Approach
Statistical Inference for Quantitative Polymerase Chain Reaction Using a Hidden Markov Model: A Bayesian Approach
A Bayesian Model of AFLP Marker Evolution and Phylogenetic Inference
Sequential Quantitative Trait Locus Mapping in Experimental Crosses
Case-Control Inference of Interaction between Genetic and Nongenetic Risk Factors under Assumptions on Their Distribution
Inference on the Limiting False Discovery Rate and the P-value Threshold Parameter Assuming Weak Dependence between Gene Expression Levels within Subject
Reconstructing Gene Regulatory Networks with Bayesian Networks by Combining Expression Data with Multiple Sources of Prior Knowledge
Cox Survival Analysis of Microarray Gene Expression Data Using Correlation Principal Component Regression
A Method for Meta-Analysis of Case-Control Genetic Association Studies Using Logistic Regression
Approximating the Variance of the Conditional Probability of the State of a Hidden Markov Model
Using Linear Mixed Models for Normalization of cDNA Microarrays
Experimental Design for Two-Color Microarrays Applied in a Pre-Existing Split-Plot Experiment
The Cyclohedron Test for Finding Periodic Genes in Time Course Expression Studies
H-Tuple Approach to Evaluate Statistical Significance of Biological Sequence Comparison with Gaps
Multiple Testing Issues in Discriminating Compound-Related Peaks and Chromatograms from High Frequency Noise, Spikes and Solvent-Based Noise in LC - MS Data Sets
A Bayesian Approach to Estimation and Testing in Time-course Microarray Experiments
Super Learner
Testing for Trends in Dose-Response Microarray Experiments: A Comparison of Several Testing Procedures, Multiplicity and Resampling-Based Inference
On the Operational Characteristics of the Benjamini and Hochberg False Discovery Rate Procedure
A Comparison of Methods to Control Type I Errors in Microarray Studies
Selection of Biologically Relevant Genes with a Wrapper Stochastic Algorithm
T-BAPS: A Bayesian Statistical Tool for Comparison of Microbial Communities Using Terminal-restriction Fragment Length Polymorphism (T-RFLP) Data
Population Structure and Covariate Analysis Based on Pairwise Microsatellite Allele Matching Frequencies
Estimating the Arm-Wise False Discovery Rate in Array Comparative Genomic Hybridization Experiments
An Expectation Maximization Approach to Estimate Malaria Haplotype Frequencies in Multiply Infected Children
Estimation of Expression Levels in Spotted Microarrays with Saturated Pixels
Improving Divergence Time Estimation in Phylogenetics: More Taxa vs. Longer Sequences
Fully Bayesian Mixture Model for Differential Gene Expression: Simulations and Model Checks
Multiple Testing for SNP-SNP Interactions
