Unsupervised Classification for Tiling Arrays: ChIP-chip and Transcriptome

Caroline Bérard; Marie-Laure Martin-Magniette; Véronique Brunaud; Sébastien Aubourg; Stéphane Robin

doi:10.2202/1544-6115.1692


Published/Copyright: November 1, 2011


From the journal Statistical Applications in Genetics and Molecular Biology, Volume 10, Issue 1


Tiling arrays enable large-scale exploration of the genome thanks to probes that cover the whole genome at very high density (up to 2,000,000 probes). The biological questions usually addressed are either the difference in expression between two conditions or the detection of transcribed regions. In this work, we propose to treat both questions simultaneously as an unsupervised classification problem by modeling the joint distribution of the two conditions. In contrast to previous methods, we account for all available information on the probes as well as biological knowledge such as annotation and spatial dependence between probes. Since probes are not biologically relevant units, we propose a classification rule for non-connected regions covered by several probes. Applications to transcriptomic and ChIP-chip data of Arabidopsis thaliana obtained with a NimbleGen tiling array highlight the importance of precise modeling and of region classification. The "TAHMMAnnot" package is implemented in R and C and is freely available from CRAN.

Keywords: bivariate Gaussian mixture; hidden Markov model; tiling arrays; unsupervised classification
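To make the idea of classifying probes from the joint distribution of two conditions more concrete, here is a minimal sketch of a bivariate Gaussian mixture fitted by EM on simulated data. It is an illustration only and is not the TAHMMAnnot implementation: the number of groups (two), the unconstrained covariances, the simulated intensities, and all function names (`simulate_probes`, `fit_mixture`, `classify`) are assumptions made for this example, and the sketch ignores the annotation and the spatial dependence between probes (the hidden Markov part) that the abstract mentions.

```python
import math
import random


def gaussian_pdf(x, y, mean, cov):
    """Bivariate Gaussian density; cov is (sxx, sxy, syy)."""
    mx, my = mean
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x - mx, y - my
    # Quadratic form (p - m)^T Sigma^{-1} (p - m) for a 2x2 covariance.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def posteriors(points, weights, means, covs):
    """Posterior probability of each group for each probe (E-step)."""
    post = []
    for x, y in points:
        dens = [w * gaussian_pdf(x, y, m, c)
                for w, m, c in zip(weights, means, covs)]
        total = max(sum(dens), 1e-300)  # guard against numerical underflow
        post.append([d / total for d in dens])
    return post


def fit_mixture(points, init_means, n_iter=50, ridge=1e-6):
    """EM for a mixture of bivariate Gaussians (hypothetical helper)."""
    k = len(init_means)
    weights = [1.0 / k] * k
    means = list(init_means)
    covs = [(1.0, 0.0, 1.0)] * k  # start from identity covariances
    for _ in range(n_iter):
        post = posteriors(points, weights, means, covs)
        for j in range(k):  # M-step: weighted moments of each group
            nj = sum(p[j] for p in post)
            mx = sum(p[j] * x for p, (x, y) in zip(post, points)) / nj
            my = sum(p[j] * y for p, (x, y) in zip(post, points)) / nj
            sxx = sum(p[j] * (x - mx) ** 2
                      for p, (x, y) in zip(post, points)) / nj + ridge
            syy = sum(p[j] * (y - my) ** 2
                      for p, (x, y) in zip(post, points)) / nj + ridge
            sxy = sum(p[j] * (x - mx) * (y - my)
                      for p, (x, y) in zip(post, points)) / nj
            weights[j] = nj / len(points)
            means[j] = (mx, my)
            covs[j] = (sxx, sxy, syy)
    return weights, means, covs


def classify(points, weights, means, covs):
    """Assign each probe to the group with the highest posterior."""
    post = posteriors(points, weights, means, covs)
    return [max(range(len(p)), key=p.__getitem__) for p in post]


def simulate_probes(n_per_group=100, seed=0):
    """Simulated (condition 1, condition 2) log-intensities, two groups."""
    rng = random.Random(seed)
    centers = [(7.0, 7.0), (11.0, 11.0)]  # assumed "low" and "high" signal
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centers for _ in range(n_per_group)]


points = simulate_probes()
weights, means, covs = fit_mixture(points, [(6.0, 6.0), (12.0, 12.0)])
labels = classify(points, weights, means, covs)
```

Each probe is represented by its pair of intensities, so a group is a region of the plane rather than a threshold on a single condition. Passing the initial means explicitly keeps the sketch deterministic; a real analysis would need a more careful initialization and a choice of the number of groups.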


