Abstract
Objectives
Clinical prediction requires formalizing uncertainty within a statistical model. However, persistent confusion between prediction and inference, and between traditional (stepwise) and modern (penalized) development strategies, often produces unstable, poorly calibrated, and overfit models. A structured statistical framework is therefore essential.
Methods
This article is a structured, didactic tutorial that explains the core concepts of clinical prediction models. It covers the definition of a prediction model, the fundamental strategies for its construction, and the essential framework for its evaluation, illustrated through an applied example using real-world clinical data.
Results
The tutorial illustrates model development using the GUSTO-I dataset (N = 40,830). Penalized methods (LASSO and Elastic Net) successfully identified clinical signals while eliminating engineered noise variables. The LASSO model (λ1se) achieved excellent discrimination (AUC 0.818; 95% CI 0.803–0.832) and overall accuracy (Brier score 0.058). Calibration analysis revealed a slope of 1.28 and an intercept of 0.63, indicating the conservative bias and systematic risk underestimation inherent to λ1se selection. Decision curve analysis confirmed positive net benefit across clinically relevant probability thresholds.
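To make these metrics concrete, the short sketch below walks through the same evaluation steps on synthetic data: an L1-penalized logistic model, AUC, Brier score, calibration slope and intercept via logistic recalibration, and net benefit at a single decision threshold. It is a minimal sketch assuming Python with scikit-learn; the dataset, the fixed penalty strength, and the 0.10 threshold are illustrative assumptions, not the GUSTO-I analysis reported above.

```python
# Minimal illustrative sketch (NOT the article's GUSTO-I analysis).
# Assumes Python with scikit-learn; the fixed penalty C stands in for a
# cross-validated lambda_1se.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: a rare outcome with informative and noise predictors.
X, y = make_classification(n_samples=5000, n_features=20, n_informative=6,
                           weights=[0.93], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=1)

# LASSO (L1-penalized) logistic regression: shrinks coefficients and sets
# some exactly to zero, performing variable selection during fitting.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
lasso.fit(X_tr, y_tr)
p = np.clip(lasso.predict_proba(X_te)[:, 1], 1e-12, 1 - 1e-12)

print("AUC:        ", roc_auc_score(y_te, p))     # discrimination
print("Brier score:", brier_score_loss(y_te, p))  # overall accuracy

# Weak calibration (logistic recalibration): regress the observed outcome
# on the model's log-odds. A slope > 1 with a positive intercept indicates
# over-shrunken, conservative predictions that underestimate risk.
logit = np.log(p / (1 - p)).reshape(-1, 1)
cal = LogisticRegression(penalty=None).fit(logit, y_te)  # scikit-learn >= 1.2
print("Calibration slope:    ", cal.coef_[0, 0])
print("Calibration intercept:", cal.intercept_[0])

# Decision curve analysis at one threshold t:
#   net benefit = TP/n - (FP/n) * t / (1 - t)
t = 0.10
pos = p >= t
tp, fp = np.sum(pos & (y_te == 1)), np.sum(pos & (y_te == 0))
print("Net benefit at t=0.10:", tp / len(y_te) - fp / len(y_te) * t / (1 - t))
```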
Conclusions
This guide equips clinicians with a rigorous methodological framework for the critical appraisal and interpretation of modern clinical prediction models.
Research ethics: Not applicable.
Informed consent: Not applicable.
Author contributions: Javier Arredondo Montero (JAM): Conceptualization; Methodology; Software; Validation; Formal analysis; Investigation; Resources; Data Curation; Writing – Original Draft; Writing – Review & Editing; Visualization; Supervision; Project administration. The author has accepted responsibility for the entire content of this manuscript and approved its submission.
Use of Large Language Models, AI and Machine Learning Tools: Artificial intelligence (ChatGPT-4, OpenAI) was used for language editing and for generating a synthetic dataset under the author's explicit instructions. It did not influence the scientific content, analysis, or interpretation.
Conflict of interest: The author states no conflict of interest.
Research funding: None declared.
Data availability: Not applicable.
Supplementary Material
This article contains supplementary material (https://doi.org/10.1515/dx-2025-0152).