Outlier reset CUSUM for the exploration of copy number alteration data

Yinglei Lai; Joseph L. Gastwirth

doi:10.1515/sagmb-2014-0027


Published/Copyright: June 18, 2015


From the journal Statistical Applications in Genetics and Molecular Biology, Volume 14, Issue 4

Abstract

Copy number alteration (CNA) data have been collected to study disease-related chromosomal amplifications and deletions. The CUSUM procedure and related plots have been used to explore CNA data. In practice, outliers may be observed, in which case modifications of the CUSUM procedure may be required. An outlier reset modification of the CUSUM (ORCUSUM) procedure is developed in this paper. The threshold value for detecting outliers or significant CUSUMs can be derived using results for sums of independent truncated normal random variables. Bartels' non-parametric test for autocorrelation is also introduced to the analysis of copy number variation data. Our simulation results indicate that the ORCUSUM procedure can still be used even in the situation where the degree of autocorrelation is low. Furthermore, the results show the impact of outliers on the performance of the traditional CUSUM and illustrate the advantage of the outlier reset feature of the ORCUSUM. Additionally, we discuss how the ORCUSUM can be applied to examine CNA data with a simulated data set. To illustrate the procedure, recently collected single nucleotide polymorphism (SNP) based CNA data from The Cancer Genome Atlas (TCGA) Research Network are analyzed. The method is applied to a data set collected in an ovarian cancer study. Three cytogenetic bands (cytobands) are considered to illustrate the method. The cytobands 11q13 and 9p21, which have been shown to be related to ovarian cancer, are presented as positive examples. The cytoband 3q22, which is less likely to be disease related, is presented as a negative example. These results illustrate the usefulness of the ORCUSUM procedure as an exploratory tool for the analysis of SNP-based CNA data.
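The abstract names two standard building blocks without giving formulas: the plain CUSUM of a sequence and Bartels' rank version of von Neumann's ratio. The sketch below illustrates both under stated assumptions. It is not the authors' ORCUSUM procedure: the outlier reset rule and its thresholds are not given in the abstract, so they are deliberately left out. The function names are hypothetical, and the normal approximation with mean 2 and variance 4/n is only the large-sample form of the test.

```python
import numpy as np


def plain_cusum(x):
    """Cumulative sum of deviations from the sample mean (traditional CUSUM).

    Illustrative only; no outlier reset is applied here.
    """
    x = np.asarray(x, dtype=float)
    return np.cumsum(x - x.mean())


def average_ranks(x):
    """Ranks 1..n, with tied values sharing the mean of their positions."""
    x = np.asarray(x, dtype=float)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last_position = np.cumsum(counts)  # 1-based end position of each tie group
    group_rank = last_position - (counts - 1) / 2.0
    return group_rank[inverse]


def rank_von_neumann_ratio(x):
    """Bartels' statistic: squared successive rank differences over
    the squared deviations of the ranks from their mean."""
    r = average_ranks(x)
    return np.sum(np.diff(r) ** 2) / np.sum((r - r.mean()) ** 2)


def bartels_z(x):
    """Large-sample z-score, assuming RVN is roughly N(2, 4/n) under randomness.

    Values well below zero suggest positive autocorrelation.
    """
    n = len(x)
    return (rank_von_neumann_ratio(x) - 2.0) / np.sqrt(4.0 / n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.normal(size=200)   # simulated, not real CNA data
    spiked = noise.copy()
    spiked[100] += 15.0            # a single artificial outlier
    # One outlier shifts every later value of the plain CUSUM.
    print(np.abs(plain_cusum(noise)).max(), np.abs(plain_cusum(spiked)).max())
    print(bartels_z(noise))
```

A single large value moves the whole tail of the plain CUSUM, which is the sensitivity that motivates a reset-type modification. The R package lawstat cited in the reference list provides an implementation of Bartels' test for actual analyses.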

Keywords: copy number alteration; CUSUM; outlier

Corresponding author: Yinglei Lai, Department of Statistics, The George Washington University, Rome Hall, Room 553, 801 22nd St. NW, Washington, DC 20052, USA, e-mail: ylai@gwu.edu

References

Aravidis, C., A. D. Panani, Z. Kosmaidou, N. Thomakos, A. Rodolakis and A. Antsaklis (2012): “Detection of numerical abnormalities of chromosome 9 and p16/CDKN2A gene alterations in ovarian cancer with FISH analysis,” Anticancer Res., 32, 5309–5313.

Bartels, R. (1982): “The rank version of von Neumann’s ratio test for randomness,” J. Am. Stat. Assoc., 77, 40–46.

Birnbaum, Z. W. and F. C. Andrews (1949): “On sums of symmetrically truncated normal random variables,” Ann. Math. Stat., 20, 458–461.

Chen, H., H. Xing and N. R. Zhang (2011): “Estimation of parent specific DNA copy number in tumors using high-density genotyping arrays,” PLoS Comput. Biol., 7:e1001060.

Chiang, D. Y., G. Getz, D. B. Jaffe, M. J. T. O’Kelly, X. Zhao, S. L. Carter, C. Russ, C. Nusbaum, M. Meyerson and E. S. Lander (2009): “High-resolution mapping of copy-number alterations with massively parallel sequencing,” Nat. Methods, 6, 99–103.

Hawkins, D. M. and D. H. Olwell (1998): Cumulative sum charts and charting for quality improvement, New York, NY, USA: Springer. doi:10.1007/978-1-4612-1686-5

Hui, W., Y. R. Gel and J. L. Gastwirth (2008): “Lawstat: an R package for law, public policy and biostatistics,” J. Stat. Software, 28. http://www.jstatsoft.org/v28/i03

Li, W., A. Lee and P. K. Gregersen (2009): “Copy-number-variation and copy-number-alteration region detection by cumulative plots,” BMC Bioinformatics, 10(Suppl 1):S67. doi:10.1186/1471-2105-10-S1-S67

Lockhart, D. J., H. Dong, M. C. Byrne, M. T. Follettie, M. V. Gallo, M. S. Chee, M. Mittmann, C. Wang, M. Kobayashi, H. Norton and E. L. Brown (1996): “Expression monitoring by hybridization to high-density oligonucleotide arrays,” Nat. Biotechnol., 14, 1675–1680.

McDaniel, S., J. Minnier, R. A. Betensky, G. Mohapatra, Y. Shen, J. F. Gusella, D. N. Louis and T. Cai (2010): “Assessing population level genetic instability via moving average,” Stat. Biosci., 2, 120–136.

McLachlan, G. and D. Peel (2000): Finite mixture models. Wiley series in probability and statistics, New York, NY, USA: John Wiley & Sons, Inc. doi:10.1002/0471721182

Niu, Y. S. and H. Zhang (2012): “The screening and ranking algorithm to detect DNA copy number variations,” Ann. Appl. Stat., 6, 1306–1326.

Olshen, A. B., H. Bengtsson, P. Neuvial, P. T. Spellman, R. A. Olshen and V. E. Seshan (2011): “Parent-specific copy number in paired tumor-normal studies using circular binary segmentation,” Bioinformatics, 27, 2038–2046. doi:10.1093/bioinformatics/btr329

Pejovic, T. (1995): “Genetic changes in ovarian cancer,” Ann. Med., 27, 73–78.

Schena, M., D. Shalon, R. W. Davis and P. O. Brown (1995): “Quantitative monitoring of gene expression patterns with a complementary DNA microarray,” Science, 270, 467–470. doi:10.1126/science.270.5235.467

The Cancer Genome Atlas Network (2008): “Comprehensive genomic characterization defines human glioblastoma genes and core pathways,” Nature, 455, 1061–1068. doi:10.1038/nature07385

Tukey, J. W. (1960): A survey of sampling from contaminated distributions. In Contributions to Probability and Statistics, Stanford, California: Stanford University Press.

Weitzel, J. N., J. Patel, D. M. Smith, A. Goodman, H. Safaii and H. G. Ball (1994): “Molecular genetic changes associated with ovarian cancer,” Gynecol. Oncol., 55, 245–252.

Zhao, X., C. Li, J. G. Paez, K. Chin, P. A. Janne, T.-H. Chen, L. Girard, J. Minna, D. Christiani, C. Leo, J. W. Gray, W. R. Sellers and M. Meyerson (2004): “An integrated view of copy number and allelic alterations in the cancer genome using single nucleotide polymorphism arrays,” Cancer Res., 64, 3060–3071.

Published Online: 2015-6-18

Published in Print: 2015-8-1


https://doi.org/10.1515/sagmb-2014-0027
