
Robustness of steroidomics-based machine learning for diagnosis of primary aldosteronism: a laboratory medicine perspective

  • Graeme Eisenhofer, Mirko Peitzsch, Kevin Mantik, Manuel Schulze, Georgiana Constantinescu, Zhong Lu, Hanna Remde, Carmina T. Fuss, Tracy Ann Williams, Sven Gruber, Jacques W. M. Lenders, Andrea Horvath and Christina Pamporaki
Published/Copyright: July 25, 2025
Abstract

Objectives

Use of machine learning (ML) in diagnostics offers promise to optimise interpretation of laboratory data and guide clinical decision-making. For this, ML-based outputs should provide robustly reproducible results at least as good as the underlying laboratory data. The objective of this study was to assess robustness of ML-based steroid-probability-scores for diagnosis of primary aldosteronism (PA).

Methods

Reproducibility of ML-based steroid-probability-scores was assessed from coefficients of variation (CVs) for pools of quality control plasma from selected groups of patients with and without PA. Intra-patient measurement variability was assessed from CVs of three consecutive plasma specimens obtained on different days from 77 patients. Inter-laboratory reproducibility was assessed from 47 duplicate plasma specimens analysed in two different laboratories.
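The CV used throughout is the standard deviation expressed as a percentage of the mean. The following is a minimal sketch of that calculation for repeated specimens from the same patient, not the authors' analysis code; the function name, the patient identifiers and the score values are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: 100 * sample SD / mean."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero, so the CV is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical probability scores: three specimens per patient,
# each collected on a different day (values are illustrative only).
scores_by_patient = {
    "patient_A": [0.80, 0.84, 0.82],
    "patient_B": [0.10, 0.12, 0.11],
}

# One CV per patient, then the average across patients.
cv_by_patient = {
    pid: coefficient_of_variation(vals)
    for pid, vals in scores_by_patient.items()
}
average_cv = mean(list(cv_by_patient.values()))
```

The same function could be applied to repeated measurements of a quality control pool, or to a steroid concentration instead of a probability score, so that the two kinds of CV are compared on a like-for-like basis.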

Results

For seven sets of quality control plasma pools, support vector machine-derived steroid-probability-scores for diagnosis of PA yielded an averaged CV of 2.5 % (CI 0.4–4.4 %), which was lower (p=0.0078) than the averaged CV of 12.0 % (CI 7.4–16.6 %) for the seven steroids employed in that model. For three sets of plasma samples from 77 patients, the CV for intra-patient measurement variability of steroid-probability-scores was 7 % (CI 5–9 %), lower (p<0.0001) than the CVs for measurements of aldosterone (38 %, CI 32–42 %), 18-oxocortisol (36 %, CI 29–43 %), 18-hydroxycortisol (25 %, CI 21–28 %) and the aldosterone:renin ratio (46 %, CI 38–55 %). ML-derived probability scores for 47 duplicate plasma samples analysed at two separate laboratories displayed excellent agreement and negligible bias.
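One common way to quantify agreement and bias between paired results from two laboratories is a Bland-Altman style analysis of the differences. The abstract does not state which method was used, so the sketch below is an assumption rather than the authors' procedure; the function name and the paired scores are hypothetical.

```python
from statistics import mean, stdev


def bias_and_limits_of_agreement(lab_a, lab_b):
    """Return (bias, lower limit, upper limit) for paired results.

    bias   = mean of the paired differences (lab_a - lab_b)
    limits = bias -/+ 1.96 * sample SD of the differences
    """
    if len(lab_a) != len(lab_b):
        raise ValueError("both laboratories must report the same samples")
    if len(lab_a) < 2:
        raise ValueError("need at least two paired samples")
    diffs = [a - b for a, b in zip(lab_a, lab_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical probability scores for the same three samples
# analysed in two laboratories (values are illustrative only).
scores_lab_a = [0.10, 0.50, 0.90]
scores_lab_b = [0.12, 0.49, 0.89]

bias, lower, upper = bias_and_limits_of_agreement(scores_lab_a, scores_lab_b)
```

A bias close to zero with narrow limits of agreement would correspond to the kind of result summarised above as excellent agreement and negligible bias.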

Conclusions

ML-based steroid-probability-scores for diagnosis of PA display remarkably high robustness according to reproducibility of measurements within and between laboratories as well as within patients.


Corresponding author: Graeme Eisenhofer, Department of Medicine III, Technische Universität Dresden, Fetscherstraße 74, 01307, Dresden, Germany, E-mail:

Award Identifier / Grant number: 314061271-TRR/CRC 205-1/2

Acknowledgments

The authors gratefully acknowledge the contributions of Catleen Conrad, Denise Kaden, Carola Kunath, Ramona Walter, James Doery, Peta Nuttall, Catherine He, Tina Yen, George Mangos and Sradha Kotwal for support with patient recruitment, clinical procedures or sample processing at Dresden, Würzburg, Melbourne and Sydney. Immunohistochemistry and KCNJ5 genotyping by Carolin Ellerbrock at Munich are also gratefully acknowledged.

  1. Research ethics: Dresden 24/10/2018 EK 386102018; Munich 09/04/2018 No 18–117; Würzburg 10/02/2021 42/19; Zurich 27/08/2018 # 2018-01292; three Australian centers (Monash Medical Center, Melbourne, Prince of Wales and St George Hospitals): umbrella agreement 30/10/2019, ERM Reference Number: 53636, Monash Health Ref: RES-19-0000480A.

  2. Informed consent: Written informed consent was obtained from all individuals included in this study.

  3. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  4. Use of Large Language Models, AI and Machine Learning Tools: None declared.

  5. Conflict of interest: The authors state no conflict of interest.

  6. Research funding: The Deutsche Forschungsgemeinschaft (314061271-TRR/CRC 205-1/2) to GE, MP, CTF, TAW and CP.

  7. Data availability: The underlying data can be made available on reasonable request from the corresponding author.

Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/cclm-2025-0200).


Received: 2025-02-19
Accepted: 2025-07-09
Published Online: 2025-07-25

© 2025 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 19.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/cclm-2025-0200/html