A Skellam regression model for quantifying positional value in soccer

  • Konstantinos Pelechrinis and Wayne Winston
Published/Copyright: January 6, 2021

Abstract

Soccer is undeniably the most popular sport worldwide, and everyone from general managers and coaching staff to fans and media is interested in evaluating players’ performance. Metrics applied successfully in other sports, such as the (adjusted) +/− that allows for division of credit among a basketball team’s players, face several challenges when applied to soccer due to severe collinearities. Recently, a number of player evaluation metrics have been developed using optical tracking data, but these data are proprietary. In this work, our objective is to develop an open framework that can estimate the expected contribution of a soccer player to his team’s winning chances using publicly available data. In particular, using (i) data from approximately 20,000 games from 11 European leagues over eight seasons, and (ii) player ratings from the FIFA video game, we estimate through a Skellam regression model the importance of every line (attackers, midfielders, defenders and goalkeeping) in winning a soccer game. We then translate the model to expected league points added above a replacement player (eLPAR). This model can further be used as a guide for allocating a team’s salary budget to players based on their expected contributions on the pitch. We showcase such applications using annual salary data from the English Premier League and identify evidence that, in our dataset, the market appears to undervalue defensive line players relative to goalkeepers.
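To make the link between a Skellam model and league points concrete, the sketch below computes win, draw and loss probabilities from a Skellam-distributed goal differential (the difference of two independent Poisson goal counts) and converts them to expected league points under the standard 3/1/0 scoring. This is an illustration only, not the regression fitted in the article: the scoring rates `1.6` and `1.1`, the function names, and the truncation at `max_goals` are all assumptions chosen for the example, and the step that maps line ratings to these rates is not reproduced here.

```python
import math


def poisson_pmf(k, rate):
    """P(N = k) for N ~ Poisson(rate)."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def skellam_pmf(diff, rate_home, rate_away, max_goals=40):
    """P(X - Y = diff) for independent X ~ Poisson(rate_home), Y ~ Poisson(rate_away).

    Computed by summing over scorelines, truncated at max_goals per team
    (an assumption; the neglected tail is tiny for soccer-like rates).
    """
    total = 0.0
    for away_goals in range(max_goals + 1):
        home_goals = away_goals + diff
        if 0 <= home_goals <= max_goals:
            total += poisson_pmf(home_goals, rate_home) * poisson_pmf(away_goals, rate_away)
    return total


def outcome_probabilities(rate_home, rate_away, max_goals=40):
    """Return (win, draw, loss) probabilities for the first team."""
    win = sum(skellam_pmf(d, rate_home, rate_away, max_goals)
              for d in range(1, max_goals + 1))
    draw = skellam_pmf(0, rate_home, rate_away, max_goals)
    loss = sum(skellam_pmf(d, rate_home, rate_away, max_goals)
               for d in range(-max_goals, 0))
    return win, draw, loss


def expected_league_points(rate_home, rate_away):
    """Expected points for the first team: 3 for a win, 1 for a draw, 0 for a loss."""
    win, draw, _ = outcome_probabilities(rate_home, rate_away)
    return 3.0 * win + 1.0 * draw


if __name__ == "__main__":
    # Hypothetical scoring rates, chosen only for illustration.
    win, draw, loss = outcome_probabilities(1.6, 1.1)
    print(f"win={win:.3f} draw={draw:.3f} loss={loss:.3f}")
    print(f"expected league points: {expected_league_points(1.6, 1.1):.3f}")
```

In a setting like the one the abstract describes, the difference between the expected league points of two line-ups that differ by one player would play the role of points added by that player; how the article defines the replacement level is not shown in this sketch.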


Corresponding author: Konstantinos Pelechrinis, School of Computing and Information, University of Pittsburgh, Pittsburgh, USA

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Received: 2019-11-26
Accepted: 2020-11-29
Published Online: 2021-01-06
Published in Print: 2021-09-27

© 2020 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 14.12.2025 from https://www.degruyterbrill.com/document/doi/10.1515/jqas-2019-0122/pdf