Article Open Access

Explainable AI for gut microbiome-based diagnostics: colorectal cancer as a case study

  • Animesh Acharjee
Published/Copyright: June 19, 2023

To the Editor,

A growing number of studies have found a correlation between changes in gut microbiome composition and colorectal cancer (CRC) [1], [2], [3]. Researchers have used machine learning (ML) algorithms to identify potential CRC biomarkers and to distinguish CRC patients from healthy individuals [4]. Identifying diagnostic microbes would help considerably in developing a non-invasive biomarker for CRC patients. Recent microbiome studies frequently employ random forest [5], [6] because of its predictive ability and its capacity to generate feature importance. Random forest is a decision tree-based algorithm that has proven sophisticated enough to capture bacteria typically associated with CRC; however, its built-in feature importance is unable to identify species that are significant only for a subset of patients.

Recently, Rynazal et al. published an exciting paper [7] on explainable AI and its application to microbiome data sets derived from CRC patients. The study investigated the feasibility of using the “Shapley Additive Explanations (SHAP)” technique to analyse gastrointestinal microbiome data [7]. The difference between conventional ML methods and explainable AI methods is interpretability. For example, with the random forest method we can rank the important variables (here, microbes) by the so-called “Gini importance”, but we then rely solely on global explanation techniques. Local explanations, by contrast, enable us to observe distinct bacterial contribution patterns among CRC patients. One example provided by Rynazal et al. is that Clostridium symbiosum was found to be the most influential bacterium for the patient with subject ID SAMD00114911. Figure 1 shows the application of explainable AI and the discovery of interpretable microbes for therapeutic intervention.

Figure 1: Application of explainable AI and the discovery of interpretable microbes for therapeutic intervention.

In contrast, Eubacterium eligens displays the opposite pattern, indicating that a higher abundance of this species is associated with a lower CRC risk. This understanding of the direction of effects cannot be obtained using commonly employed global explanation methods, such as the random forest’s built-in feature importance.
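To make the idea of a signed, per-patient contribution concrete, the following minimal Python sketch computes exact Shapley values for a single hypothetical sample. It is neither the code of Rynazal et al. nor the SHAP library: the scoring function `toy_risk_score`, the three taxa, their coefficients and the all-zero reference sample are assumptions chosen purely for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names; they do not refer to real species.
TAXA = ["taxon_A", "taxon_B", "taxon_C"]


def toy_risk_score(x):
    """Hypothetical model: taxon_A raises the score, taxon_B lowers it,
    and taxon_C only matters in combination with taxon_A."""
    return 0.5 + 0.3 * x[0] - 0.2 * x[1] + 0.1 * x[0] * x[2]


def exact_shapley(predict, sample, reference):
    """Exact Shapley values of one sample relative to one reference sample.

    Features outside a coalition take their reference value. Enumerating
    all coalitions is only feasible for a handful of features.
    """
    n = len(sample)

    def value(coalition):
        mixed = [sample[j] if j in coalition else reference[j] for j in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi_i = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi_i += weight * (value(coalition | {i}) - value(coalition))
        contributions.append(phi_i)
    return contributions


sample = [1.0, 1.0, 1.0]      # hypothetical abundances for one patient
reference = [0.0, 0.0, 0.0]   # hypothetical baseline sample
phi = exact_shapley(toy_risk_score, sample, reference)

for name, contribution in zip(TAXA, phi):
    direction = "raises" if contribution > 0 else "lowers"
    print(f"{name}: {contribution:+.3f} ({direction} the score)")
```

In this toy setting the contribution of taxon_A is positive and that of taxon_B is negative, and the contributions sum to the difference between the score of the sample and the score of the reference. A single importance ranking would show only magnitudes, not these signs.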

The study further investigated the benefits of using SHAP for individual feature contribution analyses and CRC subtyping. In addition, a Python library was developed to assist microbiome researchers with comparable analyses. The study found that SHAP can generate feature contributions for every single ML prediction and, based on the contribution of each bacterial species to the classifier, divide the disease group into subgroups of CRC patients. Overall, the study examined the feasibility of using explainable AI for CRC classification based on the gastrointestinal microbiome, with SHAP used to obtain more individualised feature importance that can help identify potential bacterial CRC biomarkers.
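As a rough illustration of how per-patient contributions could be turned into subgroups, the sketch below groups patients by the taxon with the largest positive contribution. This is a deliberately simple stand-in and not the subtyping procedure of Rynazal et al.; the patient identifiers, taxon names and contribution values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical taxa and per-patient contribution vectors (one value per taxon),
# standing in for the local explanations produced by a method such as SHAP.
taxa = ["taxon_A", "taxon_B", "taxon_C"]
contributions = {
    "patient_1": [0.40, 0.05, -0.02],
    "patient_2": [0.35, 0.10, 0.01],
    "patient_3": [0.02, 0.30, 0.04],
    "patient_4": [-0.01, 0.03, 0.25],
}


def dominant_taxon(values):
    """Return the name of the taxon with the largest contribution."""
    best_index = max(range(len(values)), key=lambda i: values[i])
    return taxa[best_index]


# Group patients that share the same dominant taxon.
grouped = defaultdict(list)
for patient, values in contributions.items():
    grouped[dominant_taxon(values)].append(patient)
subgroups = dict(grouped)

for taxon, patients in subgroups.items():
    print(taxon, patients)
```

A real analysis would more plausibly apply a clustering algorithm to the full contribution vectors; the grouping rule here is chosen only because it is easy to follow by eye.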

This method will benefit the microbiome community and encourage researchers to use local explanations for a more personalised identification of feature importance and subtyping, and it holds considerable potential for microbiome-based diagnostics [8]. CRC is just one example of such applications; the approach can also be applied to other cancer types or to other diseases, such as inflammatory bowel diseases [9] or metabolic diseases, to build interpretable models and hence open a new avenue towards precision medicine [10].


Corresponding author: Dr. Animesh Acharjee, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK; Institute of Translational Medicine, University of Birmingham, Birmingham, UK; MRC Health Data Research UK (HDR UK), Midlands Site, Birmingham, UK; and University of Birmingham, B15 2TT, Birmingham, UK, Phone: +4407403642022, E-mail:

  1. Research funding: None declared.

  2. Author contributions: Animesh Acharjee is responsible for the writing, conceptualization, acquisition of data, and interpretation of the data.

  3. Competing interests: The author declares no conflicts of interest regarding this article.

  4. Informed consent: Not applicable.

  5. Ethical approval: Not applicable.

References

1. Roelands, J, Kuppen, PJK, Ahmed, EI, Mall, R, Masoodi, T, Singh, P, et al. An integrated tumor, immune and microbiome atlas of colon cancer. Nat Med 2023;29:1273–86. https://doi.org/10.1038/s41591-023-02324-5.

2. Rebersek, M. Gut microbiome and its role in colorectal cancer. BMC Cancer 2021;21:1325. https://doi.org/10.1186/s12885-021-09054-2.

3. Ternes, D, Karta, J, Tsenkova, M, Wilmes, P, Haan, S, Letellier, E. Microbiome in colorectal cancer: how to get from meta-omics to mechanism? Trends Microbiol 2020;28:401–23. https://doi.org/10.1016/j.tim.2020.05.013.

4. Bosch, S, Acharjee, A, Quraishi, MN, Bijnsdorp, IV, Rojas, P, Bakkali, A, et al. Integration of stool microbiota, proteome and amino acid profiles to discriminate patients with adenomas and colorectal cancer. Gut Microb 2022;14:2139979. https://doi.org/10.1080/19490976.2022.2139979.

5. Acharjee, A, Larkman, J, Xu, Y, Cardoso, VR, Gkoutos, GV. A random forest based biomarker discovery and power analysis framework for diagnostics research. BMC Med Genom 2020;13:178. https://doi.org/10.1186/s12920-020-00826-6.

6. Breiman, L. Random forests. Mach Learn 2001;45:5–32. https://doi.org/10.1023/A:1010933404324.

7. Rynazal, R, Fujisawa, K, Shiroma, H, Salim, F, Mizutani, S, Shiba, S, et al. Leveraging explainable AI for gut microbiome-based colorectal cancer classification. Genome Biol 2023;24:21. https://doi.org/10.1186/s13059-023-02858-4.

8. Acharjee, A, Singh, U, Choudhury, SP, Gkoutos, GV. The diagnostic potential and barriers of microbiome based therapeutics. Diagnosis 2022;9:411–20. https://doi.org/10.1515/dx-2022-0052.

9. Shah, P, Kendall, F, Khozin, S, Goosen, R, Hu, J, Laramie, J, et al. Artificial intelligence and machine learning in clinical development: a translational perspective. NPJ Digit Med 2019;2:69. https://doi.org/10.1038/s41746-019-0148-3.

10. Petrosino, JF. The microbiome in precision medicine: the way forward. Genome Med 2018;10:12. https://doi.org/10.1186/s13073-018-0525-6.

Received: 2023-05-30
Accepted: 2023-06-04
Published Online: 2023-06-19

© 2023 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
