Statistical Significance Threshold Criteria For Analysis of Microarray Gene Expression Data

Cheng Cheng; Stanley B. Pounds; James M. Boyett; Deqing Pei; Mei-Ling Kuo; Martine F. Roussel

doi:10.2202/1544-6115.1064

Published/Copyright: December 19, 2004

From the journal Statistical Applications in Genetics and Molecular Biology Volume 3 Issue 1

Methodological advances in microarray data analysis based on false discovery rate (FDR) control, such as q-value plots, allow the investigator to examine the FDR from several perspectives. However, when FDR control at the "customary" levels of 0.01, 0.05, or 0.1 does not yield fruitful findings, there is little guidance for trading off the significance threshold against the FDR level on sound statistical or biological grounds. Meaningful statistical significance criteria that complement the existing FDR methods for large-scale multiple tests are therefore desirable. This research develops three such criteria: the profile information criterion, the total error proportion, and the guide-gene driven selection. The first two are general significance threshold criteria for large-scale multiple tests. The profile information criterion is related to recent theoretical studies of the connection between FDR control and minimax estimation, and the total error proportion is closely related to the asymptotic properties of FDR control in terms of the total error risk. The guide-gene driven selection is an approach to combining statistical significance with existing biological knowledge of the study at hand. Error properties of these criteria are investigated theoretically and by simulation. The proposed methods are illustrated and compared using an example of genomic screening for novel Arf gene targets. Operating characteristics of the q-value and the proposed significance threshold criteria are investigated and compared in a simulation study that employs a model mimicking a gene regulatory pathway. A guideline for using these criteria is provided. Splus/R code is available from the corresponding author upon request.

Keywords: multiple tests; significance threshold selection; profile information criterion; total error proportion; false discovery rate; q-value; microarray; gene expression
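
To make the baseline concrete, the following is a minimal Python sketch of FDR control at the customary levels, which is the starting point the abstract describes. It is not the authors' code and does not implement the profile information criterion, the total error proportion, or the guide-gene driven selection, none of which are defined in the abstract. It computes Benjamini-Hochberg adjusted p-values, which coincide with q-values only under the conservative assumption that the proportion of true null hypotheses is 1. The function names `bh_adjust` and `discoveries_by_level` and the example p-values are hypothetical, chosen for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: the null proportion is taken to be 1, so these are a
    conservative stand-in for q-values, not an estimate of them.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values
    # stay monotone in the rank of the raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def discoveries_by_level(pvalues, levels=(0.01, 0.05, 0.1)):
    """Count tests declared significant at each nominal FDR level."""
    adjusted = bh_adjust(pvalues)
    return {level: sum(a <= level for a in adjusted) for level in levels}


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
                 0.060, 0.074, 0.205, 0.212, 0.216]
    for level, count in discoveries_by_level(example_p).items():
        print(f"FDR level {level}: {count} discoveries")
```

Tabulating the number of discoveries per level in this way shows the situation the abstract is concerned with: when few or no tests pass at 0.01, 0.05, or 0.1, the table alone gives no principled way to choose a looser threshold.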

https://doi.org/10.2202/1544-6115.1064
