European football player valuation: integrating financial models and network theory

Albert Cohen and Jimmy Risk
Published/Copyright: January 7, 2025

Abstract

This paper presents a new framework for player valuation in European football, by fusing principles from financial mathematics and network theory. The valuation model leverages a “passing matrix” to encapsulate player interactions on the field, utilizing centrality measures to quantify individual influence. Unlike traditional approaches, such as regressing on past performance-salary data, this model focuses on in-game performance as a player’s contributions evolve over time. Consequently, our model provides a dynamic and individualized framework for ascertaining a player’s fair market value. The methodology is empirically validated through a case study in European football, employing real-world match and financial data. This cross-disciplinary mechanism for player valuation adapts the effect of connecting pay with performance, first seen in Scully ((1974). Pay and performance in major league baseball. Am. Econ. Rev. 64: 915–930), to include in-game contributions as well as expected present valuation of stochastic variables.


Corresponding author: Jimmy Risk, Department of Mathematics & Statistics, Cal Poly Pomona, Pomona CA 91676, USA, E-mail: 

  1. Research Ethics: Not applicable.

  2. Informed consent: Not applicable.

  3. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  4. Use of Large Language Models, AI and Machine Learning Tools: None declared.

  5. Conflict of interest: The authors state no conflict of interest.

  6. Research funding: None declared.

  7. Data availability: Not applicable.

  8. Software availability: Software used in this study is available at the soccer-player-valuation github repository (https://github.com/jimmyrisk/soccer-player-valuation).

Appendix A Details on conditional expectation and filtrations

The notion of a filtration is a technical point that allows conditional expectations such as the one in (4) to be defined properly, and allows “information flow” to be discussed rigorously in terms of what information is available at time $t$. In particular, one can think of $\mathcal{F}_t$ as the information known at time $t$. This information is not lost, so $\mathcal{F}_s \subseteq \mathcal{F}_t$ for $s < t$; that is, the future holds more information. This is made rigorous through the concept of a filtration $\mathbb{F} = (\mathcal{F}_t)_{t \ge 0}$, the collection of all information sets. The interested reader is referred to (Shreve et al. 2004) for more on this technical point. There are two intuitive properties of conditional expectations with respect to filtrations used in this paper:

  1. $\mathbb{E}[\xi_s \eta_s \mid \mathcal{F}_t] = \xi_s\, \mathbb{E}[\eta_s \mid \mathcal{F}_t]$ whenever $\xi_s \in \mathcal{F}_t$. This means that a stochastic process is treated as deterministic with respect to the $\mathcal{F}_t$-conditional expectation as long as it is known at time $t$. Intuitively, one can think of $\xi_s$ as being a “constant” with respect to $\mathcal{F}_t$, since its value is known.

  2. $\mathbb{E}[\xi_s \mid \mathcal{F}_0] = \mathbb{E}[\xi_s]$, meaning that conditioning on the time $t = 0$ information is the same as applying the traditional expectation operator. Intuitively, this is the situation where there is no additional information known from the stochastic evolution.
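As a small illustration of how the two properties are used together (a sketch, not part of the original derivation), take $s < t$ and the revenue process $R$, for which Appendix B uses $\mathbb{E}[R_t \mid \mathcal{F}_s] = R_s e^{\mu(t-s)}$. The performance share $\pi_s$ is known at time $s$, so property 1 pulls it out of the conditional expectation, while property 2 recovers the ordinary expectation at time zero, assuming $R_0$ is a known constant:

```latex
\mathbb{E}[\pi_s R_t \mid \mathcal{F}_s]
  = \pi_s \, \mathbb{E}[R_t \mid \mathcal{F}_s]
  = \pi_s R_s e^{\mu (t - s)},
\qquad
\mathbb{E}[R_t \mid \mathcal{F}_0] = \mathbb{E}[R_t] = R_0 e^{\mu t}.
```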

Appendix B Calculations for salary equation

Here we outline the derivation of Equation (9). This involves finding $\mathbb{E}[\pi_{t+h} R_{t+h} \mid \mathcal{F}_t]$, after which $S_{t,t+h} = e^{-\lambda h} a_{t+h}\, \mathbb{E}[\pi_{t+h} R_{t+h} \mid \mathcal{F}_t]$ follows from Equation (4). Apply Itô’s formula (Shreve et al. 2004) to $d(\pi_t R_t)$ using the dynamics in (6), then apply the conditional expectation operator $\mathbb{E}[\,\cdot \mid \mathcal{F}_s]$. Note that the $dW$ terms vanish since $dW_t$ is independent of $\mathcal{F}_s$. This results in the following equation for $t > s$:

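(18) $\mathbb{E}[d(\pi_t R_t) \mid \mathcal{F}_s] = (\mu - \theta)\, \mathbb{E}[\pi_t R_t \mid \mathcal{F}_s]\, dt + \theta \pi^*\, \mathbb{E}[R_t \mid \mathcal{F}_s]\, dt + \rho \sigma_\pi \sigma_R\, \mathbb{E}\big[R_t \sqrt{\pi_t (1 - \pi_t)} \mid \mathcal{F}_s\big]\, dt.$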

Now, write $f(t) = \mathbb{E}[\pi_t R_t \mid \mathcal{F}_s]$. Assuming sufficient regularity so that $\mathbb{E}[d(\pi_t R_t) \mid \mathcal{F}_s] = d\, \mathbb{E}[\pi_t R_t \mid \mathcal{F}_s]$ and denoting $f' = \frac{d}{dt} f$, this yields

(19) $f'(t) = (\mu - \theta) f(t) + \theta \pi^* R_s e^{\mu (t - s)} + c\, \mathbb{E}\big[R_t \sqrt{\pi_t (1 - \pi_t)} \mid \mathcal{F}_s\big],$

using the fact that $\mathbb{E}[R_t \mid \mathcal{F}_s] = R_s e^{\mu (t - s)}$ and denoting $c = \rho \sigma_\pi \sigma_R$.

This is a first-order linear non-homogeneous ordinary differential equation (ODE) with initial condition $f(s) = \pi_s R_s$, which is well studied. However, the $\mathbb{E}\big[R_t \sqrt{\pi_t (1 - \pi_t)} \mid \mathcal{F}_s\big]$ term makes this difficult to solve. We use a first-order Taylor series approximation of $v(\pi_t) = \sqrt{\pi_t (1 - \pi_t)}$ about its mean reversion parameter $\pi^*$:

(20) $v(\pi_t) = v(\pi^*) + v'(\pi^*)(\pi_t - \pi^*) + O\big((\pi_t - \pi^*)^2\big), \qquad v'(\pi^*) = \dfrac{1 - 2\pi^*}{2 \sqrt{\pi^* (1 - \pi^*)}},$

which is reasonable since $\pi_t$ is mean-reverting to $\pi^*$. Truncating the $O\big((\pi_t - \pi^*)^2\big)$ terms and substituting $v(\pi^*) + v'(\pi^*)(\pi_t - \pi^*)$ for $v(\pi_t)$ in $\mathbb{E}[R_t v(\pi_t) \mid \mathcal{F}_s]$, the ODE becomes

(21) $f'(t) = \big(\mu - \theta + c\, v'(\pi^*)\big) f(t) + \big(\theta \pi^* + c\, v(\pi^*) - c\, \pi^* v'(\pi^*)\big) R_s e^{\mu (t - s)}.$

This is straightforward to solve and has solution

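(22) $f(t) = e^{\mu (t - s)} R_s \left[ \pi^* + (\pi_s - \pi^*)\, e^{(c v' - \theta)(t - s)} + c \sqrt{\pi^* (1 - \pi^*)}\; \dfrac{1 - e^{(c v' - \theta)(t - s)}}{\theta - c v'} \right], \qquad t \ge s,$ where $v' = v'(\pi^*)$.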

Note that this nicely reduces to the initial condition f(s) = π s R s when plugging in t = s.
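A quick numerical check can make the last two steps concrete. The following is a minimal sketch, not the authors' code: it evaluates the closed form (22), reading the initial state as $\pi_s$ and $R_s$, and compares it with a Runge-Kutta integration of the linearized ODE (21). The function names and all parameter values are illustrative assumptions.

```python
import math

def v(p):
    """Diffusion shape v(p) = sqrt(p (1 - p)) from the pi dynamics."""
    return math.sqrt(p * (1.0 - p))

def v_prime(p):
    """Derivative of v, as given alongside Equation (20)."""
    return (1.0 - 2.0 * p) / (2.0 * math.sqrt(p * (1.0 - p)))

def f_closed_form(t, s, pi_s, R_s, mu, theta, pi_star, c):
    """Closed-form solution (22), with v' evaluated at pi_star."""
    k = c * v_prime(pi_star) - theta          # exponent rate (c v' - theta)
    decay = math.exp(k * (t - s))
    bracket = (pi_star
               + (pi_s - pi_star) * decay
               + c * v(pi_star) * (1.0 - decay) / (theta - c * v_prime(pi_star)))
    return math.exp(mu * (t - s)) * R_s * bracket

def f_rk4(t, s, pi_s, R_s, mu, theta, pi_star, c, steps=2000):
    """Integrate ODE (21) from f(s) = pi_s * R_s with classical RK4."""
    a = mu - theta + c * v_prime(pi_star)
    b = theta * pi_star + c * v(pi_star) - c * pi_star * v_prime(pi_star)

    def rhs(u, f):
        return a * f + b * R_s * math.exp(mu * (u - s))

    h = (t - s) / steps
    u, f = s, pi_s * R_s
    for _ in range(steps):
        k1 = rhs(u, f)
        k2 = rhs(u + h / 2, f + h * k1 / 2)
        k3 = rhs(u + h / 2, f + h * k2 / 2)
        k4 = rhs(u + h, f + h * k3)
        f += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        u += h
    return f

# Illustrative (assumed) parameters, not values from the paper.
params = dict(s=0.0, pi_s=0.3, R_s=1000.0, mu=0.07, theta=2.0, pi_star=0.25, c=0.006)
for horizon in (1 / 50, 1.0, 2.0, 3.0):
    print(horizon, f_closed_form(horizon, **params), f_rk4(horizon, **params))
```

Under these assumptions the two columns should agree to many digits, and setting `t = s` returns `pi_s * R_s`, which is the initial condition noted above.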

B.1 Alternative methods for value calculation

Denote $f(x, y, t; T) = \mathbb{E}[\pi_T R_T \mid \pi_t = x, R_t = y] = \mathbb{E}[\pi_T R_T \mid \mathcal{F}_t]$. The Feynman–Kac formula (Shreve et al. 2004) gives the partial differential equation

(23) $f_t - \theta (x - \pi^*) f_x + \mu y f_y + \tfrac{1}{2} \sigma_\pi^2\, x (1 - x) f_{xx} + \tfrac{1}{2} \sigma_R^2\, y^2 f_{yy} + \rho \sigma_\pi \sigma_R\, y \sqrt{x (1 - x)}\, f_{xy} = 0$

with terminal condition $f(x, y, T; T) = xy$, using the shorthand $\partial_x f = f_x$, and similarly for other partial derivatives. This can be solved numerically to obtain $\mathbb{E}[\pi_T R_T \mid \pi_t = x, R_t = y]$ for any $T$, $t$, $x$ and $y$ (for example, $f(\pi_0, R_0, 0; T)$).

Alternatively, the dynamics in Equation (6) can be used to obtain a Monte Carlo estimate of $S_{t,t+h}$. This involves simulating the joint evolution of $\pi_{t+h}$ and $R_{t+h}$ given $\mathcal{F}_t$ a large number of times and computing an average. The number of simulations $L$ can be increased arbitrarily to decrease the standard error of the Monte Carlo estimate (which scales as $1/\sqrt{L}$ (Glasserman 2004)). However, the joint pdf is unknown, so there is an additional time-step discretization error.
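The Monte Carlo idea can be sketched in a few lines of NumPy. This is a hedged illustration rather than the yuima-based setup used in the paper: it applies an Euler-Maruyama step to $\pi$, an exact log-space step to $R$, and returns the sample mean of $\pi_T R_T$ with its standard error. The discount factor and player share that turn this expectation into $S_{t,t+h}$ are not applied, and the function name, default arguments and clipping of the square-root argument are assumptions of the sketch.

```python
import numpy as np

def mc_expected_product(pi0, R0, T, mu, sigma_R, pi_star, theta, sigma_pi, rho,
                        n_paths=20_000, dt=0.01, seed=0):
    """Monte Carlo estimate of E[pi_T R_T | pi_0, R_0] and its standard error."""
    rng = np.random.default_rng(seed)
    n_steps = max(1, int(round(T / dt)))
    h = T / n_steps                       # actual step so that n_steps * h = T
    pi = np.full(n_paths, pi0, dtype=float)
    log_R = np.full(n_paths, np.log(R0), dtype=float)
    for _ in range(n_steps):
        z_pi = rng.standard_normal(n_paths)
        z_perp = rng.standard_normal(n_paths)
        # Correlate the two innovations: corr(z_pi, z_R) = rho.
        z_R = rho * z_pi + np.sqrt(1.0 - rho**2) * z_perp
        # Clip guards against tiny negative arguments caused by discretization.
        diffusion = np.sqrt(np.clip(pi * (1.0 - pi), 0.0, None))
        pi = pi - theta * (pi - pi_star) * h + sigma_pi * diffusion * np.sqrt(h) * z_pi
        log_R = log_R + (mu - 0.5 * sigma_R**2) * h + sigma_R * np.sqrt(h) * z_R
    payoff = pi * np.exp(log_R)
    return payoff.mean(), payoff.std(ddof=1) / np.sqrt(n_paths)

# Illustrative (assumed) inputs; quadrupling n_paths should roughly halve the standard error.
estimate, std_err = mc_expected_product(0.3, 1000.0, 1.0, 0.07, 0.1, 0.25, 2.0, 0.2, 0.5)
print(estimate, std_err)
```

The standard error shrinks with the number of paths, but the bias from the time step `dt` does not, which is the additional discretization error mentioned above.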

To illustrate the accuracy of our method, we use the setup corresponding to the example in Section 4.1 and compare our first-order approximation $S_{0,T}$ as in Equation (9) with a zeroth-order approximation $S^{0}_{0,T}$ (obtained by replacing $\pi_t$ with $\pi^*$ in the approximation). We also compare these with a numerical PDE estimate $S^{\mathrm{PDE}}_{0,T}$ based on Equation (23) and a Monte Carlo estimate $S^{\mathrm{MC}}_{0,T}$ for $T = \tfrac{1}{50}, 1, 2, 3$. The short time horizon $T = \tfrac{1}{50}$ (approximately one week) is included to assess approximation quality over shorter durations.

The PDE estimate is obtained using Wolfram Mathematica 13.3 with default settings, a minimum grid size of 100, and accuracy and precision goals set to 6. The Monte Carlo estimate employs the yuima package in R (Iacus and Yoshida 2018) with $M = 10^6$ independent sample paths of $\pi_t$ and $R_t$, using a timestep of $\Delta t = 0.001$. Table 5 presents the results.

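Table 3: Estimates of $\mu$, $\sigma_R$ and calibrated values for each team’s player share ($a_k$) process.

| Team name | $\hat{\mu}$ | $\hat{\sigma}_R$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ |
|---|---|---|---|---|---|---|---|
| Liverpool | 0.073 | 0.116 | 0.219 | 0.233 | 0.255 | 0.208 | 0.239 |
| Arsenal | −0.009 | 0.097 | 0.307 | 0.298 | 0.307 | 0.216 | 0.264 |
| Brighton | 0.072 | 0.174 | 0.236 | 0.292 | 0.285 | 0.214 | 0.182 |

Table 4: Estimates of risk premium $\hat{\lambda}$ over players. Trent refers to Trent Alexander-Arnold.

| Team | Player | $\hat{\lambda}$ |
|---|---|---|
| Liverpool | Salah | 0.011 |
| Liverpool | Trent | 0.043 |
| Liverpool | van Dijk | 0.207 |
| Arsenal | Nketiah | 0.051 |
| Arsenal | Xhaka | 0.136 |
| Arsenal | Holding | 0.203 |
| Brighton | Gross | 0.075 |
| Brighton | March | 0.169 |
| Brighton | Dunk | 0.055 |

Table 5: Comparing valuation approximations for $T = \tfrac{1}{50}, 1, 2, 3$.

| $T$ | $S_{0,T}$ | $S^{0}_{0,T}$ | $S^{\mathrm{PDE}}_{0,T}$ | $S^{\mathrm{MC}}_{0,T}$ | $\mathrm{SE}(S^{\mathrm{MC}}_{0,T})$ |
|---|---|---|---|---|---|
| 1/50 | 4,088.192 | 4,088.561 | 4,088.162 | 4,083.497 | 0.4994 |
| 1 | 5,511.127 | 5,511.251 | 5,510.577 | 5,507.710 | 2.0395 |
| 2 | 6,053.327 | 6,053.006 | 6,052.819 | 6,051.692 | 2.5964 |
| 3 | 6,623.869 | 6,623.499 | 6,623.427 | 6,621.244 | 3.1936 |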

First, observe that the Monte Carlo standard errors increase with $T$, reflecting greater variability over longer time horizons. This is why the short horizon $T = 1/50$ is included: its standard error is relatively low (0.4994), allowing for a more precise comparison. At $T = 1/50$, the first-order $S_{0,T}$ is in close agreement with $S^{\mathrm{PDE}}_{0,T}$, differing by only 0.03. In contrast, the zeroth-order $S^{0}_{0,T}$ differs from $S_{0,T}$ by 0.369 and from $S^{\mathrm{PDE}}_{0,T}$ by 0.399, indicating that including the first-order term significantly enhances the approximation’s accuracy.

At longer time horizons ($T = 2$ and $T = 3$), an interesting pattern emerges: the PDE estimates $S^{\mathrm{PDE}}_{0,T}$ are closer to the zeroth-order $S^{0}_{0,T}$ than to the first-order $S_{0,T}$. This reversal is unexpected, as the first-order approximation is an objective improvement over the zeroth-order approximation. A likely explanation is that the PDE solutions at longer time horizons suffer from discretization inaccuracies. Even with a finer grid and higher accuracy settings than the default, these inaccuracies gave estimates closer to the zeroth-order approximation. Hence, the first-order $S_{0,T}$ is doing quite well.

The Monte Carlo estimates have been mostly excluded from this analysis as the T = 1/50 case reveals a clear issue with discretization error, and longer periods will lead to more significant biases. Additionally, the standard errors increase, effectively reducing the accuracy of the estimates.
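The differences quoted in this discussion can be recomputed directly from Table 5. The short script below is only an arithmetic check on the published numbers; the dictionary layout and the helper name `gaps` are choices made for this sketch.

```python
# Table 5 values copied from the text:
# T -> (first-order, zeroth-order, PDE, Monte Carlo, Monte Carlo standard error)
table5 = {
    "1/50": (4088.192, 4088.561, 4088.162, 4083.497, 0.4994),
    "1": (5511.127, 5511.251, 5510.577, 5507.710, 2.0395),
    "2": (6053.327, 6053.006, 6052.819, 6051.692, 2.5964),
    "3": (6623.869, 6623.499, 6623.427, 6621.244, 3.1936),
}

def gaps(row):
    """Absolute gaps between estimates, plus the MC-PDE gap in standard-error units."""
    first, zeroth, pde, mc, se = row
    return {
        "first_vs_pde": abs(first - pde),
        "zeroth_vs_pde": abs(zeroth - pde),
        "zeroth_vs_first": abs(zeroth - first),
        "mc_vs_pde_in_se": abs(mc - pde) / se,
    }

summary = {T: gaps(row) for T, row in table5.items()}
for T, g in summary.items():
    print(T, {name: round(value, 3) for name, value in g.items()})
```

The last entry expresses the gap between the Monte Carlo and PDE estimates in units of the Monte Carlo standard error, which is one way to see the bias at $T = 1/50$ referred to above.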

Appendix C On passing matrix Markov chains

C.1 Derivation of Markov chain probability

Let $(X_n)_{n=0}^{\infty}$ represent the team possession process, indicating which player has the ball at step $n$ for a given team possession, governed by the Markov chain transition probabilities $P$ (Grimmett and Stirzaker 2020), where a “step” refers to a transition in ball possession (pass, shot, etc.). In addition, we treat the initial distribution $\alpha_i = P(X_0 = i)$ as the probability of player $i$ beginning a team possession (through start of half, steal, penalty, etc.). From this Markov chain, the metric of interest is

(24) $q_i = P(A_i \mid X_\infty = S), \qquad A_i = \{X_\ell = i \text{ for some } \ell = 0, 1, \ldots\}$

Here, $X_\infty = S$ means the team possession ended in $S$ instead of $U$, and $A_i$ is the event that player $i$ had the ball at some point during that team possession. Note that $P$ is a finite Markov chain. For simplicity, assume that all states communicate with one another except for the two absorbing states $S$ and $U$. It is well known that one can find the probability of absorption into one of these states starting in state $j$. Now,

$q_i = P(A_i \mid X_\infty = S) = \dfrac{P(A_i \cap \{X_\infty = S\})}{P(X_\infty = S)}.$

For the numerator

$P(A_i \cap \{X_\infty = S\}) = \sum_j P(A_i \cap \{X_\infty = S\} \mid X_0 = j)\, P(X_0 = j) = \sum_j \big[ P(X_\infty = S \mid X_0 = j) - P(\{X_\infty = S\} \setminus A_i \mid X_0 = j) \big]\, P(X_0 = j)$

where $\{X_\infty = S\} \setminus A_i$ is the set difference, i.e. the case of absorption into $S$ but player $i$ never having possession. This probability can be computed by considering a new Markov chain in which state $i$ is treated as absorbing, and by determining the probability of reaching state $S$ before reaching state $i$ (refer to Grimmett and Stirzaker 2020 for more details). For the denominator,

$P(X_\infty = S) = \sum_j P(X_\infty = S \mid X_0 = j)\, P(X_0 = j).$

Note that the initial distribution $\alpha_j = P(X_0 = j)$ appears in both the numerator and denominator, weighting each term according to the probability that player $j$ begins a team possession.
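The calculation above reduces to a handful of linear solves. The sketch below is one possible implementation, not the authors' code. It assumes the augmented matrix orders its states as players $0, \ldots, M-1$, then $S$, then $U$, and it uses the standard absorbing-chain result that absorption probabilities solve $(I - Q) h = r$, where $Q$ is the player-to-player block and $r$ is the column of one-step probabilities into $S$. The function name `success_involvement` is hypothetical.

```python
import numpy as np

def success_involvement(P, alpha):
    """q_i = P(player i touched the ball | possession ended in S).

    P     : (M+2) x (M+2) transition matrix, states ordered players, S, U (assumed).
    alpha : length-M initial distribution over players.
    """
    P = np.asarray(P, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    M = len(alpha)
    Q = P[:M, :M]                      # player-to-player transitions
    r_S = P[:M, M]                     # one-step probabilities into S
    # h[j] = P(absorbed in S | X_0 = j)
    h = np.linalg.solve(np.eye(M) - Q, r_S)
    denom = alpha @ h
    q = np.empty(M)
    for i in range(M):
        keep = [j for j in range(M) if j != i]
        # g[j] = P(absorbed in S without ever visiting i | X_0 = j); g[i] = 0.
        g = np.zeros(M)
        if keep:
            Q_i = Q[np.ix_(keep, keep)]            # chain with state i made absorbing
            g[keep] = np.linalg.solve(np.eye(M - 1) - Q_i, r_S[keep])
        q[i] = alpha @ (h - g) / denom
    return q

# Toy two-player example (made-up probabilities); player 0 starts every possession.
P_toy = [[0.0, 0.5, 0.2, 0.3],
         [0.4, 0.0, 0.3, 0.3],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
print(success_involvement(P_toy, [1.0, 0.0]))
```

In the toy example player 0 starts every possession, so the first entry is 1; working the linear systems by hand gives 19/35 for the second player.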

Appendix D Likelihoods and estimation

Considering observed trajectories of the univariate processes $(\pi_{t_1}, \ldots, \pi_{t_N})$ and $(R_{t_1}, \ldots, R_{t_N})$, maximum likelihood estimation (MLE) (Casella and Berger 2024) estimates their unknown parameters by maximizing the joint probability density function of the observations:

(25) $(\hat{\pi}^*, \hat{\theta}, \hat{\sigma}_\pi) = \underset{\pi^*, \theta, \sigma_\pi}{\arg\max}\; f(\pi_{t_1}, \ldots, \pi_{t_N}; \pi^*, \theta, \sigma_\pi),$

and similarly for $\mu$, $\sigma_R$. Since the processes in question are Markov, one can instead write (again, in the $\pi$ case)

$f(\pi_{t_1}, \ldots, \pi_{t_N}; \pi^*, \theta, \sigma_\pi) = f(\pi_{t_1}; \pi^*, \theta, \sigma_\pi) \times \prod_{i=2}^{N} f(\pi_{t_i} \mid \pi_{t_{i-1}}; \pi^*, \theta, \sigma_\pi)$

where the $f(\pi_{t_i} \mid \pi_{t_{i-1}}; \pi^*, \theta, \sigma_\pi)$ are the transition densities and $f(\pi_{t_1}; \pi^*, \theta, \sigma_\pi)$ is the stationary pdf. Since $R$ is a GBM, the MLEs for $\mu$ and $\sigma_R$ are available in closed form

(26) $\hat{\sigma}_R^2 = \dfrac{1}{N} \sum_{i=1}^{N} \Big( \log\big(R_{t_i} / R_{t_{i-1}}\big) - \hat{y} \Big)^2, \qquad \hat{\mu} = \tfrac{1}{2} \hat{\sigma}_R^2 + \hat{y},$

where $\hat{y} = \frac{1}{N} \sum_{i=1}^{N} \log\big(R_{t_i} / R_{t_{i-1}}\big)$. For $\hat{\pi}^*$, $\hat{\theta}$ and $\hat{\sigma}_\pi$, the transition densities $f(\pi_{t_i} \mid \pi_{t_{i-1}}; \pi^*, \theta, \sigma_\pi)$ have no closed form. The Python package pymle (Kirkby et al. 2024) supports several approximations and obtains the MLEs. Our analysis used the Kessler (1997) density estimate, which assumes a Gaussian transition density using a second-order mean and variance approximation according to the dynamics in (6).
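The closed-form estimators in (26) translate directly into NumPy. The sketch below takes (26) at face value: no step size appears in the formula, so observations are assumed to be equally spaced with the spacing used as the time unit. The function name `gbm_mle` and the sample series are illustrative.

```python
import numpy as np

def gbm_mle(R):
    """Closed-form GBM estimates (mu_hat, sigma_R_hat) from Equation (26).

    R : sequence R_{t_0}, ..., R_{t_N}, assumed equally spaced at unit intervals.
    """
    y = np.diff(np.log(np.asarray(R, dtype=float)))   # log returns log(R_i / R_{i-1})
    y_hat = y.mean()
    sigma2_hat = np.mean((y - y_hat) ** 2)            # 1/N normalization, as in (26)
    mu_hat = 0.5 * sigma2_hat + y_hat
    return mu_hat, np.sqrt(sigma2_hat)

# Made-up revenue series for illustration only.
mu_hat, sigma_hat = gbm_mle([100.0, 104.0, 101.0, 110.0, 118.0])
print(mu_hat, sigma_hat)
```

If the data are sampled at a different spacing, the log-return mean and variance would first need to be rescaled by the step length, which (26) as written does not show.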

D.1 Bivariate dynamics

Estimating ρ in addition to other parameters involves the maximization

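(27) $(\hat{\mu}, \hat{\sigma}_R, \hat{\pi}^*, \hat{\theta}, \hat{\sigma}_\pi, \hat{\rho}) = \underset{\mu, \sigma_R, \pi^*, \theta, \sigma_\pi, \rho}{\arg\max}\; \prod_{i=1}^{N} f(\pi_{t_i}, R_{t_i} \mid \pi_{t_{i-1}}, R_{t_{i-1}}; \mu, \sigma_R, \pi^*, \theta, \sigma_\pi, \rho).$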

Note that some of the parameters may be fixed first, for example according to a univariate process estimate as was done in the paper. In this case one simply plugs in that estimate prior to maximizing (e.g. $\mu = \hat{\mu}$ from (26)). The aforementioned package does not handle multi-process dynamics, so we outline an approach using the Euler–Maruyama method (Kloeden et al. 1992), similar to the Kessler method. Working with $L_t = \log(R_t)$ is easier, in which case Itô’s lemma shows $dL_t = (\mu - \sigma_R^2/2)\, dt + \sigma_R\, dW_t^R$. Euler–Maruyama discretizes the dynamics in (6) so that

(28) $\pi_{t_i} = \pi_{t_{i-1}} - \theta \big(\pi_{t_{i-1}} - \pi^*\big) h_i + \sigma_\pi \sqrt{\pi_{t_{i-1}} (1 - \pi_{t_{i-1}})}\; Z_i^\pi,$
(29) $L_{t_i} = L_{t_{i-1}} + \big(\mu - \sigma_R^2/2\big) h_i + \sigma_R Z_i^R,$

where $h_i = t_i - t_{i-1}$, and the $Z_i^\pi$, $Z_i^R$ come from the Brownian motion innovations. These are normally distributed with mean zero and variance $h_i$, are independent of $Z_j^\pi$, $Z_j^R$ for $i \neq j$, and have $\mathrm{corr}(Z_i^\pi, Z_i^R) = \rho$. Consequently, the bivariate transition densities $f(\pi_{t_i}, L_{t_i} \mid \pi_{t_{i-1}}, L_{t_{i-1}})$ are bivariate normal with mean vector and covariance matrix

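(30) $\begin{pmatrix} \pi_{t_{i-1}} - \theta \big(\pi_{t_{i-1}} - \pi^*\big) h_i \\ L_{t_{i-1}} + \big(\mu - \sigma_R^2/2\big) h_i \end{pmatrix}, \qquad \begin{pmatrix} \sigma_\pi^2\, \pi_{t_{i-1}} (1 - \pi_{t_{i-1}})\, h_i & \rho \sigma_R \sigma_\pi \sqrt{\pi_{t_{i-1}} (1 - \pi_{t_{i-1}})}\; h_i \\ \rho \sigma_R \sigma_\pi \sqrt{\pi_{t_{i-1}} (1 - \pi_{t_{i-1}})}\; h_i & \sigma_R^2 h_i \end{pmatrix}.$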

Here, $\sigma_R, \sigma_\pi, \theta > 0$, $-1 \le \rho \le 1$, $0 \le \pi^* \le 1$, and $\min(\pi^*, 1 - \pi^*) \ge \sigma_\pi^2 / (2\theta)$. These are used in tandem with Equation (27) to estimate any unknown parameters.
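To make the estimation step concrete, the following sketch evaluates the negative log-likelihood implied by the Gaussian transition densities with mean and covariance (30). It is an illustration under assumptions, not the authors' implementation: the function name and parameter ordering are hypothetical, $|\rho| < 1$ is required so that the covariance matrix is invertible, and the parameter constraints listed above are left to the caller.

```python
import numpy as np

def euler_neg_loglik(params, t, pi, R):
    """Negative Euler-Maruyama log-likelihood for the bivariate (pi, log R) process.

    params : (mu, sigma_R, pi_star, theta, sigma_pi, rho), with |rho| < 1.
    t, pi, R : observation times and observed paths of equal length.
    """
    mu, sigma_R, pi_star, theta, sigma_pi, rho = params
    t = np.asarray(t, dtype=float)
    pi = np.asarray(pi, dtype=float)
    L = np.log(np.asarray(R, dtype=float))
    h = np.diff(t)
    p0 = pi[:-1]
    # Residuals relative to the conditional means in (30).
    e_pi = pi[1:] - (p0 - theta * (p0 - pi_star) * h)
    e_L = L[1:] - (L[:-1] + (mu - 0.5 * sigma_R**2) * h)
    # Entries of the 2x2 covariance matrix in (30).
    var_pi = sigma_pi**2 * p0 * (1.0 - p0) * h
    var_L = sigma_R**2 * h
    cov = rho * sigma_R * sigma_pi * np.sqrt(p0 * (1.0 - p0)) * h
    det = var_pi * var_L - cov**2
    # Quadratic form e^T Sigma^{-1} e written out for the 2x2 case.
    quad = (var_L * e_pi**2 - 2.0 * cov * e_pi * e_L + var_pi * e_L**2) / det
    loglik = -np.log(2.0 * np.pi) - 0.5 * np.log(det) - 0.5 * quad
    return -loglik.sum()

# Made-up observations for illustration only.
value = euler_neg_loglik((0.07, 0.1, 0.25, 2.0, 0.2, 0.3),
                         t=[0.0, 0.1, 0.2], pi=[0.30, 0.28, 0.27], R=[1000.0, 1010.0, 1005.0])
print(value)
```

A function of this shape could be passed to a general-purpose optimizer such as `scipy.optimize.minimize`, with any pre-estimated parameters (for example $\hat{\mu}$ from (26)) held fixed.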

D.2 Estimation of player performance shares

For a game that occurred at time $t$, the procedure to obtain $\pi_{j,t}$ for all players $j = 1, \ldots, M$ on the team is as follows:

  1. Construct the passing frequency matrix, i.e. the $M \times M$ matrix where the $i, j$ entry is the number of passes from player $i$ to player $j$. Denote the $i, j$ entry of this matrix as $n_{i,j}$.

  2. Add a row and column associated with each of the $S$ and $U$ states. The added rows should be zeroes except for the diagonal entries, which equal 1.

  3. The $i$th entry of the $S$ column is a convex combination of the number of shots made and shots missed for player $i$:

    (31) $n_{i,S} = w_S\, n_{i,\mathrm{score}} + (1 - w_S)\, n_{i,\mathrm{miss}},$

    where $n_{i,\mathrm{score}}$ is the number of times player $i$ scored (implicitly for the game at time $t$), and $n_{i,\mathrm{miss}}$ is the number of missed shots. We follow the weighting scheme of Opta (Stats Perform 2024) with $w_S = 5/6$, which counts a missed shot as 20 % of a score.

  4. The $i$th entry of the $U$ column is the number of missed passes by that player.

  5. Divide each row of the resulting $(M + 2) \times (M + 2)$ matrix by its row sum to obtain $P_t$, the empirical augmented passing matrix for the game at time $t$.

  6. Let $\alpha_t \in \mathbb{R}^M$ be the vector whose $i$th entry, $i = 1, \ldots, M$, is the proportion of times player $i$ began possession for the game at time $t$.

  7. Use the methods described in Appendix C.1 to obtain $q_{j,t}$, $j = 1, \ldots, M$, and consequently use Equation (12) to obtain $\pi_{j,t}$, $j = 1, \ldots, M$, using a six-game moving average.

We remark that step 4 could be adjusted to include turnovers, or even a weighted approach like in (31) to more heavily penalize egregious mistakes. Step 6 could also use a weighting scheme, e.g. to add weight to steals.
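Steps 1 to 5 of the procedure can be sketched as follows. This is a minimal illustration with made-up counts, not the code from the authors' repository. The function name and argument names are hypothetical, the state ordering (players, then $S$, then $U$) is an assumption, and every player is assumed to have at least one recorded action so that no row sum is zero.

```python
import numpy as np

def augmented_passing_matrix(passes, scores, misses, missed_passes, w_S=5 / 6):
    """Build the row-normalized (M+2) x (M+2) matrix P_t from per-game counts.

    passes        : M x M array, passes[i, j] = passes from player i to player j.
    scores/misses : length-M shot counts used in Equation (31).
    missed_passes : length-M counts feeding the U column (step 4).
    """
    passes = np.asarray(passes, dtype=float)
    M = passes.shape[0]
    counts = np.zeros((M + 2, M + 2))
    counts[:M, :M] = passes                                    # step 1
    counts[:M, M] = (w_S * np.asarray(scores, dtype=float)     # step 3, Equation (31)
                     + (1.0 - w_S) * np.asarray(misses, dtype=float))
    counts[:M, M + 1] = np.asarray(missed_passes, dtype=float) # step 4
    counts[M, M] = 1.0                                         # step 2: S is absorbing
    counts[M + 1, M + 1] = 1.0                                 # step 2: U is absorbing
    return counts / counts.sum(axis=1, keepdims=True)          # step 5

# Made-up two-player game for illustration only.
P_t = augmented_passing_matrix(passes=[[0, 3], [2, 0]],
                               scores=[1, 0], misses=[0, 6], missed_passes=[1, 1])
print(P_t)
```

The resulting matrix, together with the initial distribution from step 6, is the input needed for the Appendix C.1 calculation in step 7; the six-game moving average and Equation (12) are not covered by this sketch.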

References

Bakosi, J. and Ristorcelli, J. (2010). Exploring the beta distribution in variable-density turbulent mixing. J. Turbul.: N37. https://doi.org/10.1080/14685248.2010.510843.

Brown, K.H. and Jepsen, L.K. (2009). The impact of team revenues on MLB salaries. J. Sports Econ. 10: 192–203. https://doi.org/10.1177/1527002508329858.

Casella, G. and Berger, R. (2024). Statistical inference. CRC Press, Boca Raton, FL, USA. https://doi.org/10.1201/9781003456285.

Coluccia, D., Fontana, S., and Solimene, S. (2018). An application of the option-pricing model to the valuation of a football player in the ‘Serie A League’. Int. J. Sport Manag. Mark. 18: 155–168. https://doi.org/10.1504/ijsmm.2018.091345.

Deloitte Sports Business Group. (2019–2023). Annual review of football finance 2019–2023, Available at: https://www2.deloitte.com/content/dam/Deloitte/uk/Documents/sports-business-group/deloitte-uk-annual-review-of-football-finance-2023.pdf.

Duch, J., Waitzman, J.S., and Amaral, L.A.N. (2010). Quantifying the performance of individual players in a team activity. PLoS One 5: e10937. https://doi.org/10.1371/journal.pone.0010937.

Forman, J.L. and Sørensen, M. (2008). The Pearson diffusions: a class of statistically tractable diffusion processes. Scand. J. Stat. 35: 438–465. https://doi.org/10.1111/j.1467-9469.2007.00592.x.

Glasserman, P. (2004). Monte Carlo methods in financial engineering, 53. Springer, New York, NY, USA. https://doi.org/10.1007/978-0-387-21617-1.

Grimmett, G. and Stirzaker, D. (2020). Probability and random processes. Oxford University Press, Oxford, UK.

Hull, J.C. and Basu, S. (2016). Options, futures, and other derivatives. Pearson Education India, Noida, India.

Iacus, S.M. and Yoshida, N. (2018). Simulation and inference for stochastic processes with yuima. In: A comprehensive R framework for SDEs and other stochastic processes. Use R. Springer Nature, Cham, Switzerland. https://doi.org/10.1007/978-3-319-55569-0.

Késenne, S. (2000). Revenue sharing and competitive balance in professional team sports. J. Sports Econ. 1: 56–65. https://doi.org/10.1177/152700250000100105.

Kessler, M. (1997). Estimation of an ergodic diffusion from discrete observations. Scand. J. Stat. 24: 211–229. https://doi.org/10.1111/1467-9469.00059.

Kirkby, J., Nguyen, D., Nguyen, D., and Nguyen, N.N. (2024). pymle: a Python package for maximum likelihood estimation and simulation of stochastic differential equations, Available at SSRN 4826948. https://doi.org/10.2139/ssrn.4826948.

Kloeden, P.E. and Platen, E. (1992). Stochastic differential equations. Springer, Berlin, Germany. https://doi.org/10.1007/978-3-662-12616-5.

Krautmann, A.C. (1999). What’s wrong with Scully-estimates of a player’s marginal revenue product. Econ. Inq. 37: 369–381. https://doi.org/10.1111/j.1465-7295.1999.tb01435.x.

Leeds, M.A. and Kowalewski, S. (2001). Winner take all in the NFL: the effect of the salary cap and free agency on the compensation of skill position players. J. Sports Econ. 2: 244–256. https://doi.org/10.1177/152700250100200304.

MacDonald, D.N. and Reynolds, M.O. (1994). Are baseball players paid their marginal products? Manag. Decis. Econ. 15: 443–457. https://doi.org/10.1002/mde.4090150507.

Pena, J.L. and Touchette, H. (2012). A network theory analysis of football strategies. arXiv preprint arXiv:1206.6904.

R Core Team. (2021). R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, Available at: https://www.R-project.org/.

Revuz, D. and Yor, M. (2013). Continuous martingales and Brownian motion, 293. Springer Science & Business Media, Berlin, Germany.

Rockerbie, D.W. (2010). Marginal revenue product and salaries: moneyball redux. MPRA Paper No. 21410.

Rockerbie, D.W. and Easton, S.T. (2013). The run to the pennant: a multiple equilibria approach to professional sports leagues, 6. Springer, New York, NY, USA. https://doi.org/10.1007/978-1-4614-7885-0.

Rockerbie, D.W. and Easton, S.T. (2020). Contract options for buyers and sellers of talent in professional sports. Springer Nature, Cham, Switzerland. https://doi.org/10.1007/978-3-030-49513-8.

Rosen, S. and Sanderson, A. (2001). Labour markets in professional sports. Econ. J. 111: 47–68. https://doi.org/10.1111/1468-0297.00598.

Scully, G.W. (1974). Pay and performance in major league baseball. Am. Econ. Rev. 64: 915–930.

Shreve, S.E. (2004). Stochastic calculus for finance II: continuous-time models, 11. Springer, New York, NY, USA. https://doi.org/10.1007/978-1-4757-4296-1.

Stats Perform. (2024). The opta points – explained, Available at: https://optaplayerstats.statsperform.com/en_GB/soccer/opta-points.

Tottenham Football Club. (2002). What is the opta index, Available at: https://www.tottenhamhotspur.com/news-archive-1/what-is-the-opta-index/.

Tunaru, R., Clark, E., and Viney, H. (2005). An option pricing framework for valuation of football players. Rev. Financ. Econ. 14: 281–295. https://doi.org/10.1016/j.rfe.2004.11.002.

Wilders, R.J. (2020). Financial mathematics for actuarial science: the theory of interest. CRC Press, Boca Raton, FL, USA. https://doi.org/10.1201/9780429287107.

Zimbalist, A. (2001). Salaries and performance: beyond the Scully model. Int. Libr. Crit. Writ. Econ. 135: 311–335.

Zimbalist, A. (2010). Reflections on salary shares and salary caps. J. Sports Econ. 11: 17–28. https://doi.org/10.1177/1527002509354890.

Received: 2024-01-11
Accepted: 2024-11-30
Published Online: 2025-01-07
Published in Print: 2025-03-26

© 2024 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 25.10.2025 from https://www.degruyterbrill.com/document/doi/10.1515/jqas-2024-0006/html