Comparing Bacterial DNA Microarray Fingerprints
Alan Willse
Epidemiologic and forensic investigations often require assays to detect subtle genetic differences between closely related microorganisms. Typically, gel electrophoresis is used to compare randomly amplified DNA fragments between microbial samples, where the patterns of DNA fragment sizes are viewed as genotype fingerprints. The limited genomic sample captured on a gel, however, is not always sufficient to discriminate closely related strains. This paper examines the application of microarray technology to DNA fingerprinting as a high-resolution alternative to gel-based methods. The so-called universal microarray, which uses short oligonucleotide probes that do not target specific genes or species, is intended to be applicable to all microorganisms because it does not require prior knowledge of genomic sequence. In principle, closely related strains can be distinguished if enough independent oligonucleotide probes are used on the microarray, i.e., if the genome is sufficiently sampled. In practice, we confront noisy data, imperfectly matched hybridizations, and a high-dimensional inference problem. We describe the statistical problems of microarray fingerprinting, outline similarities with and differences from more conventional microarray applications, and illustrate the use of a statistical measurement error model to fingerprint 10 closely related strains from three Bacillus species and 3 strains from non-Bacillus species.
©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Articles in this issue
- Article
- Estimating Motifs Under Order Restrictions
- Reproducible Research: A Bioinformatics Case Study
- Generalized Rank Tests for Replicated Microarray Data
- Stepwise Normalization of Two-Channel Spotted Microarrays
- Comparing Automatic and Manual Image Processing in FLARE Assay Analysis for Colon Carcinogenesis
- Pixel-level Signal Modelling with Spatial Correlation for Two-Colour Microarrays
- Empirical Bayes Microarray ANOVA and Grouping Cell Lines by Equal Expression Levels
- Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data.
- Early Diagnostic Marker Panel Determination for Microarray Based Clinical Studies
- Prediction of Missing Values in Microarray and Use of Mixed Models to Evaluate the Predictors
- Combined Association and Linkage Analysis for General Pedigrees and Genetic Models
- Incorporating Biological Information as a Prior in an Empirical Bayes Approach to Analyzing Microarray Data
- The Relative Inefficiency of Sequence Weights Approaches in Determining a Nucleotide Position Weight Matrix
- A Simple Loglinear Model for Haplotype Effects in a Case-Control Study Involving Two Unphased Genotypes
- Extension of the SIMLA Package for Generating Pedigrees with Complex Inheritance Patterns: Environmental Covariates, Gene-Gene and Gene-Environment Interaction
- Error Distribution for Gene Expression Data
- A General Framework for Weighted Gene Co-Expression Network Analysis
- Statistical Inference in Evolutionary Models of DNA Sequences via the EM Algorithm
- Comparing Bacterial DNA Microarray Fingerprints
- Continuous Covariates in Genetic Association Studies of Case-Parent Triads: Gene and Gene-Environment Interaction Effects, Population Stratification, and Power Analysis
- Robust Remote Homology Detection by Feature Based Profile Hidden Markov Models
- Empirical Bayes Estimation of a Sparse Vector of Gene Expression Changes
- Hierarchical Inverse Gaussian Models and Multiple Testing: Application to Gene Expression Data
- FADO: A Statistical Method to Detect Favored or Avoided Distances between Occurrences of Motifs using the Hawkes' Model
- Prediction of Genomewide Conserved Epitope Profiles of HIV-1: Classifier Choice and Peptide Representation
- Fold-Change Estimation of Differentially Expressed Genes using Mixture Mixed-Model
- Test on the Structure of Biological Sequences via Chaos Game Representation
- Reverse Engineering Galactose Regulation in Yeast through Model Selection
- Empirical Bayes and Resampling Based Multiple Testing Procedure Controlling Tail Probability of the Proportion of False Positives.
- Weighted Analysis of Paired Microarray Experiments
- A Probabilistic Approach to Large-Scale Association Scans: A Semi-Bayesian Method to Detect Disease-Predisposing Alleles
- A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics
- Structured Antedependence Models for Functional Mapping of Multiple Longitudinal Traits
- Correlation Between Gene Expression Levels and Limitations of the Empirical Bayes Methodology for Finding Differentially Expressed Genes
- Bayesian Statistical Studies of the Ramachandran Distribution
- On Reference Designs For Microarray Experiments
- Computing Asymptotic Power and Sample Size for Case-Control Genetic Association Studies in the Presence of Phenotype and/or Genotype Misclassification Errors