Estimating Motifs Under Order Restrictions

Erik W van Zwet; Katherina J Kechris; Peter J Bickel; Michael B. Eisen

doi:10.2202/1544-6115.1100


Published/Copyright: January 10, 2005


From the journal Statistical Applications in Genetics and Molecular Biology Volume 4 Issue 1


Transcription factors and many other DNA-binding proteins recognize more than one specific sequence. Among sequences recognized by a given DNA-binding protein, different positions exhibit varying degrees of conservation. The reason is that base pairs that are more extensively contacted by the protein tend to be more conserved. This observation can be used in the discovery of transcription factor binding sites. Here we present a rigorous means to accomplish this. In particular, we constrain the order of the information (entropy) in the columns of the position-specific weight matrix (PWM) which characterizes the motif being sought. We then show how to compute the maximum likelihood estimate of a PWM under such order restrictions. This computation is easily integrated with the EM algorithm or the Gibbs sampler to enhance performance in the search for motifs in unaligned sequences. We demonstrate our method on a well-known data set of binding sites of the transcription factor Crp in E. coli.
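To make the notion of ordering PWM columns by entropy concrete, the following is a minimal sketch, not the authors' algorithm. The abstract does not describe the constrained maximum likelihood computation, so the code only forms the ordinary (unconstrained) frequency estimate from already aligned sites and checks whether its column entropies respect a given order. The function names, the toy sites, and the example order are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def column_frequencies(sites):
    """Unconstrained ML estimate of a PWM from aligned sites of equal length.

    Returns one list of four base frequencies (A, C, G, T) per motif position.
    """
    width = len(sites[0])
    pwm = []
    for j in range(width):
        counts = [sum(1 for s in sites if s[j] == b) for b in BASES]
        total = sum(counts)
        pwm.append([c / total for c in counts])
    return pwm

def column_entropy(column):
    """Shannon entropy of one PWM column in bits (0 * log 0 is taken as 0)."""
    return -sum(p * math.log2(p) for p in column if p > 0)

def satisfies_order(pwm, order):
    """True if entropy is non-decreasing along the column indices in `order`."""
    h = [column_entropy(pwm[j]) for j in order]
    return all(a <= b + 1e-12 for a, b in zip(h, h[1:]))

# Hypothetical toy alignment: position 0 is fully conserved, position 3 is uniform.
sites = ["ACGT", "ACGA", "ACTC", "AGAG"]
pwm = column_frequencies(sites)
entropies = [column_entropy(col) for col in pwm]

# Hypothesized restriction: conservation decreases from left to right.
ok_left_to_right = satisfies_order(pwm, [0, 1, 2, 3])
ok_right_to_left = satisfies_order(pwm, [3, 2, 1, 0])
```

Low entropy corresponds to high information content, so a restriction that one position is more conserved than another is an inequality between column entropies. When the unconstrained estimate violates the hypothesized order, a constrained estimate would have to be computed instead; that step is the subject of the article and is not reproduced here.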



