Abstract
While expected goals (xG) are increasingly used as a metric for assessing soccer performance, the uncertainty associated with their estimates is often overlooked. This work fills that gap by providing easy-to-implement methods for uncertainty quantification in xG estimates derived from Bayesian models. Based on a convenient posterior approximation, we devise an online prior-to-posterior update scheme, which aligns with the typical in-season model training in soccer. Additionally, we present a novel framework to assess and compare the performance dynamics of two teams during a match, while accounting for evolving match scores. Our approach is well-suited for graphical representation and improves interpretability. We validate the accuracy of our methods through simulations, and provide a real-world illustration using data from the Italian Serie A league.
- Research ethics: Not applicable.
- Informed consent: Not applicable.
- Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
- Use of Large Language Models, AI and Machine Learning Tools: None declared.
- Conflict of interest: The authors state no conflict of interest.
- Research funding: Bernardo Nipoti acknowledges support of MUR - Prin 2022 - Grant no. 2022CLTYP4, funded by the European Union – Next Generation EU.
- Data availability: The simulated datasets used in this study are available from the corresponding author upon reasonable request.
A closed-form approximation of the credible intervals for
In logistic regression, for instance, one may approximate the function
with
We rely on the multivariate Gaussian approximation for the posterior of the vector η resulting at the end of the online learning procedure, specifying
where

Serie A league dataset. Comparison between the shifted cumulative scoring probability functions for Juventus in the match Genoa-Juventus, played on 27 November 2016, computed via Gibbs sampler and by applying a closed-form approximation. Shaded areas denote 95% credible intervals.
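The kind of comparison described in the caption above, between sampling-based and closed-form credible intervals, can be illustrated on a toy case. The sketch below is not the authors' procedure: it assumes a single shot with a logistic link and a Gaussian approximation N(mean, cov) for the coefficient vector η, and it uses two standard facts. First, because the logistic function is increasing, quantiles of the linear predictor map directly to quantiles of the scoring probability. Second, the expected logistic of a Gaussian variable is commonly approximated through the probit function, which gives a rescaling by sqrt(1 + πv/8). The covariates, mean, and covariance are hypothetical values chosen only for illustration, and plain Monte Carlo draws stand in for sampler output.

```python
import math
import random
from statistics import NormalDist


def sigmoid(t):
    """Logistic function 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def linear_predictor_moments(x, mean, cov):
    """Mean and variance of x' eta when eta ~ N(mean, cov)."""
    d = len(x)
    m = sum(x[i] * mean[i] for i in range(d))
    v = sum(x[i] * cov[i][j] * x[j] for i in range(d) for j in range(d))
    return m, v


def closed_form_summary(x, mean, cov, level=0.95):
    """Closed-form credible interval and approximate mean of sigmoid(x' eta)."""
    m, v = linear_predictor_moments(x, mean, cov)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(v)
    # The logistic function is monotone, so interval endpoints transform directly.
    lower = sigmoid(m - half_width)
    upper = sigmoid(m + half_width)
    # Probit-based approximation of E[sigmoid(x' eta)].
    approx_mean = sigmoid(m / math.sqrt(1.0 + math.pi * v / 8.0))
    return lower, upper, approx_mean


def monte_carlo_summary(x, mean, cov, level=0.95, n_draws=50_000, seed=1):
    """Same quantities from simulated draws of the linear predictor."""
    m, v = linear_predictor_moments(x, mean, cov)
    rng = random.Random(seed)
    s = math.sqrt(v)
    probs = sorted(sigmoid(rng.gauss(m, s)) for _ in range(n_draws))
    tail = (1.0 - level) / 2.0
    lower = probs[int(tail * n_draws)]
    upper = probs[int((1.0 - tail) * n_draws) - 1]
    return lower, upper, sum(probs) / n_draws


# Hypothetical inputs: an intercept plus one shot covariate.
x = [1.0, 0.5]
mean = [-2.0, 0.8]
cov = [[0.09, 0.01], [0.01, 0.04]]

closed = closed_form_summary(x, mean, cov)
simulated = monte_carlo_summary(x, mean, cov)
print("closed form (lower, upper, mean):", closed)
print("Monte Carlo (lower, upper, mean):", simulated)
```

In this single-shot setting the interval endpoints are exact under the Gaussian assumption, while the mean relies on the probit approximation; for quantities that aggregate several shots, such as a cumulative scoring probability, further approximation would be needed and is not attempted here.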
© 2024 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Frontmatter
- Editorial
- Thoughts from the Editor
- Research Articles
- European football player valuation: integrating financial models and network theory
- On the efficiency of trading intangible fixed assets in Major League Baseball
- Expected goals under a Bayesian viewpoint: uncertainty quantification and online learning
- Bayesian bivariate Conway–Maxwell–Poisson regression model for correlated count data in sports
- Success factors in national team football: an analysis of the UEFA EURO 2020