HiPerMAb: a tool for judging the potential of small sample size biomarker pilot studies

Amani Al-Mekhlafi; Frank Klawonn

doi:10.1515/ijb-2022-0063

Article

Published/Copyright: March 6, 2023

From the journal The International Journal of Biostatistics Volume 20 Issue 1

Abstract

Common statistical approaches are not designed to deal with so-called “short fat data” in biomarker pilot studies, where the number of biomarker candidates exceeds the sample size by orders of magnitude. High-throughput technologies for omics data enable the measurement of tens of thousands or more biomarker candidates for specific diseases or states of a disease. Because of the limited availability of study participants, ethical reasons, and the high costs of sample processing and analysis, researchers often prefer to start with a small sample size pilot study in order to judge the potential of finding biomarkers that enable (usually in combination) a sufficiently reliable classification of the disease state under consideration. We developed a user-friendly tool, called HiPerMAb, that allows pilot studies to be evaluated based on performance measures such as multiclass AUC, entropy, area above the cost curve, hypervolume under manifold, and misclassification rate, using Monte-Carlo simulations to compute the p-values and confidence intervals. The number of “good” biomarker candidates is compared to the number of “good” biomarker candidates expected in a data set with no association to the considered disease states. This allows the potential of the pilot study to be judged even if statistical tests with correction for multiple testing fail to provide any hint of significance.
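The comparison described in the abstract can be illustrated with a minimal sketch. This is not the HiPerMAb implementation: it is written in Python rather than R, it covers only the two-class AUC, and every function name is hypothetical. It estimates by Monte-Carlo simulation how often a biomarker unrelated to the class labels reaches a given AUC, converts that into an expected number of “good” candidates, and, under the added assumption that candidates are independent, computes a binomial tail probability for the observed count.

```python
import random
from math import exp, lgamma, log


def auc(values, labels):
    """Two-class AUC as P(case value > control value); ties count 1/2."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


def folded_auc(values, labels):
    """A marker counts as good whether high or low values indicate a case."""
    a = auc(values, labels)
    return max(a, 1.0 - a)


def null_exceedance_prob(labels, threshold, n_sim=2000, seed=0):
    """Monte-Carlo estimate of P(folded AUC >= threshold) for pure noise."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        noise = [rng.random() for _ in labels]  # no association with labels
        if folded_auc(noise, labels) >= threshold:
            hits += 1
    return hits / n_sim


def binomial_upper_tail(k, m, p):
    """P(X >= k) for X ~ Binomial(m, p), summed in log space.

    Assumption of this sketch: the m candidates behave independently.
    """
    if p <= 0.0:
        return 1.0 if k <= 0 else 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(max(k, 0), m + 1):
        log_term = (lgamma(m + 1) - lgamma(i + 1) - lgamma(m - i + 1)
                    + i * log(p) + (m - i) * log(1.0 - p))
        total += exp(log_term)
    return min(total, 1.0)


if __name__ == "__main__":
    labels = [0] * 5 + [1] * 5   # hypothetical pilot study: 5 controls, 5 cases
    m = 10000                    # hypothetical number of biomarker candidates
    observed_good = 450          # hypothetical count, not taken from the paper
    p_null = null_exceedance_prob(labels, threshold=0.9)
    print("expected good candidates under no association:", m * p_null)
    print("tail probability of the observed count:",
          binomial_upper_tail(observed_good, m, p_null))
```

If the observed number of “good” candidates clearly exceeds the expected number, the pilot study shows potential even when no single candidate survives a multiple-testing correction. Real omics measurements are typically correlated, so the binomial tail here should be read as a rough guide only.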

Keywords: biomarker candidates; Monte-Carlo simulation; multi-class problem; pilot study evaluation; R Shiny application; short fat data

Corresponding authors: Amani Al-Mekhlafi, Department of Biostatistics, Helmholtz Centre for Infection Research, Braunschweig, Germany; and PhD Programme “Epidemiology” Hannover Medical School (MHH), Hannover, Germany, E-mail: amani.al-mekhlafi@helmholtz-hzi.de; and Frank Klawonn, Department of Computer Science, Ostfalia University of Applied Sciences, Wolfenbuettel, Germany, E-mail: frank.klawonn@helmholtz-hzi.de

Funding source: LEGaTO Project (legato-project.eu)

Award Identifier / Grant number: 780681

Funding source: Lower Saxony Ministry of Science and Culture within the programme Big Data in Modern Life Science, project i.Vacc.

Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: This work was partly funded by European Union Horizon 2020 research and innovation programme under the LEGaTO Project (legato-project.eu), grant agreement No 780681, and the Lower Saxony Ministry of Science and Culture within the programme Big Data in Modern Life Science, project i.Vacc.
Institutional Review Board Statement: The study protocol was approved by the local ethics committee (Bayerische Landesärztekammer, Munich, Germany).
Informed Consent Statement: Written consent was obtained from all participants.
Conflict of interest statement: The authors declare that they have no competing interests.

Received: 2022-05-29

Accepted: 2023-02-01

Published Online: 2023-03-06
