Abstract
MicroRNAs (miRNAs) are short non-coding RNAs that play critical roles in numerous cellular processes through post-transcriptional functions. Aberrant roles of miRNAs have been reported in a number of diseases. A robust computational method is vital for discovering novel miRNAs when the level of noise varies dramatically across miRNAs. In this paper, we propose a flexible rank-based procedure for estimating a weighted log partial area under the receiver operating characteristic (ROC) curve statistic for selecting differentially expressed miRNAs. The statistic combines partial area under the curve (pAUC) estimates with their corresponding variances. The proposed method does not involve complicated formulas and does not require advanced programming skills. Two real datasets are analyzed to illustrate the method, and a simulation study is carried out to assess the performance of different miRNA ranking statistics. We conclude that the proposed method offers robust results with large samples for miRNA expression data, and that it can be used as an alternative analytical tool for identifying a list of target miRNAs for further biological and clinical investigation.
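To make the pAUC building block concrete, the following is a minimal sketch of a rank-based empirical pAUC for a single miRNA, written in Python with NumPy. It is an illustration under stated assumptions, not the weighted log statistic proposed in the paper: the function name `empirical_pauc`, the `max_fpr` parameter, and the convention of counting ties as one half are all choices made here, and the weighting and log transformation mentioned in the abstract are not reproduced because their formulas are not given in this section.

```python
import numpy as np


def empirical_pauc(cases, controls, max_fpr=0.1):
    """Empirical partial AUC over false-positive rates in [0, max_fpr].

    Hypothetical helper for illustration only. It counts (case, control)
    pairs in which the case value exceeds the control value, restricted
    to the k = floor(max_fpr * n_controls) largest controls, and divides
    by the total number of pairs. Ties count one half.
    """
    cases = np.asarray(cases, dtype=float)
    controls = np.sort(np.asarray(controls, dtype=float))

    # Number of controls that fall inside the allowed false-positive range.
    k = int(np.floor(max_fpr * controls.size))
    if k == 0:
        raise ValueError("max_fpr is too small for this number of controls")

    top_controls = controls[-k:]

    # Pairwise comparison of every case against every retained control.
    diff = cases[:, None] - top_controls[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()

    return wins / (cases.size * controls.size)
```

With this definition the value lies between 0 and roughly `max_fpr`; a miRNA whose case values all exceed every control value attains the maximum, so larger values indicate stronger separation at high specificity.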
Acknowledgements
AH acknowledges post-doctoral fellowship funding from the Drug Safety and Effectiveness Cross-Disciplinary Training (DSECT) Program. JB would like to acknowledge Discovery Grant funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) (grant number 293295-2009) and Canadian Institutes of Health Research (CIHR) (grant number 84392). JB holds the John D. Cameron Endowed Chair in the Genetic Determinants of Chronic Diseases, Department of Clinical Epidemiology and Biostatistics, McMaster University. We would like to thank two anonymous reviewers and the editor for insightful comments that improved the presentation and clarity of our manuscript.
©2013 by Walter de Gruyter Berlin Boston