Machine learning algorithms with body fluid parameters: an interpretable framework for malignant cell screening in cerebrospinal fluid
Abstract
Objectives
This study aimed to develop and validate a machine learning (ML) model utilizing cerebrospinal fluid (CSF) body fluid parameters from hematology analyzers to screen for malignant cells.
Methods
We analyzed 643 consecutive CSF samples from patients with central nervous system symptoms, with 191 samples classified as positive for malignant cells based on cytological examination, for model derivation. Body fluid parameters were measured using the body fluid mode of a hematology analyzer. Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied to identify predictive biomarkers, followed by performance evaluations of six ML algorithms. Model interpretability was assessed using SHapley Additive exPlanations (SHAP). The selected model was also externally validated with an additional 136 CSF samples.
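The derivation workflow described above (LASSO-type feature selection followed by a classifier evaluated by cross-validation) can be sketched as follows. This is a minimal illustration and not the authors' code: the data are synthetic, and the number of features, the L1-penalized logistic regression used as the LASSO step, the penalty strength `C=0.1`, the RBF kernel and the 5-fold scheme are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for body fluid parameters (NOT study data):
# 643 samples, roughly 30% positive, 12 hypothetical features.
X, y = make_classification(
    n_samples=643, n_features=12, n_informative=4,
    weights=[0.7, 0.3], random_state=0,
)

# Assumed LASSO step: L1-penalized logistic regression. Features whose
# absolute coefficient exceeds the threshold are passed on to the SVM.
lasso_selector = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
    threshold=1e-5,
)

# Scaling and selection sit inside the pipeline so that they are refitted
# on each training fold and never see the held-out fold.
model = make_pipeline(StandardScaler(), lasso_selector, SVC(kernel="rbf"))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(f"AUC: {auc_scores.mean():.3f} (SD: {auc_scores.std(ddof=1):.3f})")
```

Keeping the selection step inside the pipeline is a design choice for the sketch: selecting features on the full data set before cross-validation would leak information from the validation folds into the model.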
Results
The median leukocyte (WBC) and total nucleated cell (TNC) counts in the cytology-positive samples were significantly lower than those in the cytology-negative samples (5.4 vs. 31.8 and 7.4 vs. 32.6, respectively, p<0.001). The support vector machine (SVM) model achieved the highest area under the curve (AUC) of 0.899 (SD: 0.035) and the highest sensitivity of 0.827 (SD: 0.059) in internal validation. SHAP analysis identified the percentage of high fluorescence cells and the percentage of monocytes as the two most influential predictors, both positively associated with malignant cell outcomes. External validation demonstrated a comparable AUC and sensitivity, confirming the model’s generalizability.
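For readers less familiar with the metrics reported above, the following sketch shows how sensitivity and AUC can be computed for one validation fold and how per-fold values can be summarized as mean (SD). All numbers are hypothetical toy values, not study data; the 0.5 decision threshold and the use of the sample standard deviation are assumptions, since the abstract does not state them.

```python
from statistics import mean, stdev


def sensitivity(y_true, y_pred):
    """Fraction of cytology-positive samples flagged as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half (the rank-based definition of the ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical fold: 1 = malignant cells on cytology, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 cut-off

fold_sens = sensitivity(y_true, y_pred)
fold_auc = auc(y_true, scores)

# Hypothetical per-fold AUCs summarized as mean (SD).
fold_aucs = [0.90, 0.85, 0.95, 0.88, 0.92]
auc_mean, auc_sd = mean(fold_aucs), stdev(fold_aucs)
print(f"fold sensitivity={fold_sens:.3f}, fold AUC={fold_auc:.4f}")
print(f"AUC across folds: {auc_mean:.3f} (SD: {auc_sd:.3f})")
```

In a screening setting, sensitivity is the metric to watch, because a missed malignant sample (false negative) is more costly than an unnecessary cytology review.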
Conclusions
We developed an ML model that predicts cytological outcomes in CSF using routinely available body fluid parameters. The model demonstrated consistent performance during external validation.
Acknowledgments
We would like to thank the Extreme Smart Analysis platform for its analysis assistance.
- Research ethics: This study was approved by the Ethics Committee of the First Affiliated Hospital of Zhejiang University, ethics approval number: (2024) IIT consent letter No. (0811).
- Informed consent: The requirement for informed consent was waived for this study.
- Author contributions: Xianfei Ye collected the samples, analyzed the data, and wrote the draft. Xinfeng Zhao contributed to data analysis and figure creation. Yinyu Lou assisted with data collection, sample analysis, and morphological evaluation. Hanqi Pan was responsible for the cellular pathological diagnosis and reviewed the results of the discordant samples. Yunying Chen developed the research idea, designed the study, and established the machine learning models. All authors reviewed and approved the final manuscript.
- Use of Large Language Models, AI and Machine Learning Tools: None declared.
- Conflict of interest: The authors state no conflict of interest.
- Research funding: This work was supported by the Hangzhou Municipal Health Commission Project (20241029Y063).
- Data availability: The data that support the findings of this study are available from the corresponding author upon reasonable request.
Supplementary Material
This article contains supplementary material (https://doi.org/10.1515/cclm-2025-0302).
© 2025 Walter de Gruyter GmbH, Berlin/Boston