Collective intelligence improves probabilistic diagnostic assessments

Nathan R. Stehouwer; Keith W. Torrey; Michael S. Dell

doi:10.1515/dx-2022-0090


Published/Copyright: February 20, 2023


From the journal Diagnosis, Volume 10, Issue 2

Abstract

Objectives

Collective intelligence, the “wisdom of the crowd,” seeks to improve the quality of judgments by aggregating multiple individual inputs. Here, we evaluate the success of collective intelligence strategies applied to probabilistic diagnostic judgments.

Methods

We compared the performance of individual and collective intelligence judgments on two series of clinical cases requiring probabilistic diagnostic assessments, or “forecasts.” We assessed the quality of forecasts using Brier scores, which compare forecasts to observed outcomes.
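As a minimal sketch of how such a score can be computed, the Python below assumes the common binary form of the Brier score: the mean squared difference between each forecast probability and the observed outcome coded as 0 or 1. The abstract does not state which variant the authors used, and the function name and example data are hypothetical.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect forecast, and always answering 0.5
    yields 0.25 under this (assumed) binary form of the score.
    """
    if len(forecasts) != len(outcomes) or len(forecasts) == 0:
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: one forecaster's probabilities that a diagnosis is present,
# and whether it turned out to be present (1) or absent (0) in each case.
example_forecasts = [0.9, 0.2, 0.5, 0.7]
example_outcomes = [1, 0, 1, 0]
example_score = brier_score(example_forecasts, example_outcomes)
print(f"Brier score: {example_score:.4f}")
```

Under this form, a confident wrong forecast (0.7 for a diagnosis that was absent) is penalized far more heavily than a confident correct one (0.9 for a diagnosis that was present).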

Results

On both sets of cases, the collective intelligence answers outperformed nearly every individual forecaster or team. The improved performance of collective intelligence was mediated by improvements in both the resolution and the calibration of probabilistic assessments. A secondary analysis examined how the number of individual inputs affects collective intelligence answers, using two different data sources. The two data sets produced nearly identical curves: an 11–12% improvement when averaging two independent inputs, a 15% improvement when averaging four independent inputs, and small incremental improvements with further increases in the number of individual inputs.
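The sketch below illustrates the two ideas in these results under stated assumptions. It forms a collective answer as the unweighted mean of independent forecasts, and it splits a Brier score into calibration (reliability), resolution and uncertainty terms using Murphy's (1973) partition. The abstract does not say that this exact aggregation or decomposition was used in the primary analysis. All names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def brier_score(forecasts, outcomes):
    """Binary Brier score: mean squared error against 0/1 outcomes."""
    return mean((f - o) ** 2 for f, o in zip(forecasts, outcomes))


def average_forecasts(forecasters):
    """Unweighted mean of the forecasters' probabilities, case by case."""
    return [mean(case) for case in zip(*forecasters)]


def murphy_decomposition(forecasts, outcomes):
    """Return (reliability, resolution, uncertainty).

    Cases are grouped by identical forecast value, so that
    Brier score = reliability - resolution + uncertainty.
    Lower reliability means better calibration; higher resolution is better.
    """
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)
    reliability = 0.0
    resolution = 0.0
    for f, observed in groups.items():
        observed_rate = mean(observed)
        reliability += len(observed) * (f - observed_rate) ** 2
        resolution += len(observed) * (observed_rate - base_rate) ** 2
    return reliability / n, resolution / n, base_rate * (1 - base_rate)


# Hypothetical data: three forecasters, four cases, outcomes coded 0/1.
outcomes = [1, 0, 1, 0]
forecasters = [
    [0.9, 0.4, 0.6, 0.1],
    [0.6, 0.1, 0.9, 0.5],
    [0.7, 0.3, 0.4, 0.3],
]

individual_bs = [brier_score(f, outcomes) for f in forecasters]
crowd = average_forecasts(forecasters)
crowd_bs = brier_score(crowd, outcomes)

print("individual Brier scores:", [round(b, 4) for b in individual_bs])
print("mean individual Brier score:", round(mean(individual_bs), 4))
print("collective Brier score:", round(crowd_bs, 4))
print("reliability, resolution, uncertainty:", murphy_decomposition(crowd, outcomes))
```

Because squared error is convex, the Brier score of an averaged forecast can never be worse than the average of the individual Brier scores. It can still be worse than the single best forecaster, which is consistent with the collective outperforming "nearly every" individual rather than all of them.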

Conclusions

Our results suggest that the application of collective intelligence strategies to probabilistic diagnostic forecasts is a promising approach to improve diagnostic accuracy and reduce diagnostic error.

Keywords: collective intelligence; diagnostic forecasting; probabilistic diagnostic reasoning

Corresponding author: Nathan R. Stehouwer, MD, Internal Medicine and Pediatrics, University Hospitals Rainbow Babies & Children’s Hospital, Cleveland, OH, USA; University Hospitals Cleveland Medical Center, Cleveland, OH, USA; and Case Western Reserve University School of Medicine, Cleveland, OH, USA, E-mail: nathan.stehouwer@uhhospitals.org

Funding source: University Hospitals Graduate Medical Education

Award Identifier / Grant number: Innovation Award #P0478

Acknowledgments

The authors thank Lukasz Weiner, MD, Paul Shaniuk, MD, Lauren Sackett, MD, and Collin Swafford, MD, for authoring cases used in the study.

Research funding: This work was supported in part by University Hospitals Graduate Medical Education Innovation Award #P0478.
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Competing interests: Dr. Stehouwer reports receiving a stipend to serve on the editorial board of the New England Journal of Medicine Healer application, which teaches diagnostic reasoning. Dr. Dell and Dr. Torrey report no relevant conflicts of interest.
Informed consent: Not applicable. No identifiable information was utilized for this study.
Ethical approval: The local Institutional Review Board deemed the study exempt from review.


Received: 2022-08-16

Accepted: 2023-01-18

Published Online: 2023-02-20
