Estimating individual contributions to team success in women’s college volleyball

  • Scott Powers, Luke Stancil and Naomi Consiglio
Published/Copyright: February 24, 2025

Abstract

The progression of a single point in volleyball starts with a serve and then alternates between teams, each team allowed up to three contacts with the ball. Using charted data from the 2022 NCAA Division I women’s volleyball season (4,147 matches, 600,000+ points, more than 5 million recorded contacts), we model the progression of a point as a Markov chain with the state space defined by the sequence of contacts in the current possession. We estimate the probability of each team winning the point, which changes on each contact. We attribute changes in point probability to the player(s) responsible for each contact, facilitating measurement of performance on the same point scale for different skills. Traditional volleyball statistics do not allow apples-to-apples comparisons across skills, and they do not measure the impact of the performances on team success. For adversarial contact groups (serve/reception and set/attack/block/dig), we estimate a hierarchical linear model for the outcome, with random effects for the players involved; and we adjust performance for strength of schedule not only on the conference/team level but on the individual player level. We can use the results to answer practical questions for volleyball coaches.
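As a rough illustration of the idea described in the abstract, the sketch below computes point win probabilities for an absorbing Markov chain and credits each contact with the change in win probability. The state names, the transition probabilities and the helper `contact_credits` are hypothetical and chosen only for illustration; they are not the state space or the estimates used in the paper.

```python
import numpy as np

# Hypothetical toy chain: states and probabilities are illustrative only,
# not the paper's state space or its estimated transition probabilities.
transient = ["serve", "good_pass", "poor_pass", "attack"]
absorbing = ["serving_team_wins", "receiving_team_wins"]

# Q[i, j]: probability of moving from transient state i to transient state j.
Q = np.array([
    [0.00, 0.55, 0.30, 0.00],  # serve -> good pass / poor pass
    [0.00, 0.00, 0.00, 0.95],  # good pass -> attack
    [0.00, 0.00, 0.00, 0.80],  # poor pass -> attack
    [0.00, 0.00, 0.00, 0.00],  # attack ends the point in this toy chain
])
# R[i, k]: probability of moving from transient state i to absorbing state k.
R = np.array([
    [0.05, 0.10],  # ace / service error
    [0.05, 0.00],  # good pass followed by a lost point
    [0.20, 0.00],  # poor pass followed by a lost point
    [0.45, 0.55],  # attack lost / attack won
])

# Absorption probabilities B = (I - Q)^{-1} R, computed with a linear solve.
B = np.linalg.solve(np.eye(len(transient)) - Q, R)

# Receiving team's point win probability in every state (0 or 1 when terminal).
win_prob = dict(zip(transient, B[:, 1]))
win_prob.update({"serving_team_wins": 0.0, "receiving_team_wins": 1.0})

def contact_credits(path):
    """Change in the receiving team's win probability at each transition."""
    return [(a, b, win_prob[b] - win_prob[a]) for a, b in zip(path, path[1:])]

credits = contact_credits(["serve", "good_pass", "attack", "receiving_team_wins"])
for before, after, delta in credits:
    print(f"{before} -> {after}: {delta:+.4f}")
```

The credits along a path telescope: they sum to the final outcome minus the win probability at the serve, so every point is fully allocated across its contacts. Credits here are expressed from the receiving team's perspective; a contact by the serving team would receive the same change with the opposite sign.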


Corresponding author: Scott Powers, Department of Sport Management, Rice University, Houston, TX, USA; and Department of Statistics, Rice University, Houston, TX, USA

Acknowledgments

We are grateful to an anonymous associate editor and an anonymous reviewer, whose comments led to significant improvements to this paper.

  1. Research ethics: Not applicable.

  2. Informed consent: Not applicable.

  3. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  4. Use of Large Language Models, AI and Machine Learning Tools: None declared.

  5. Conflict of interest: The authors state no conflict of interest.

  6. Research funding: None declared.

  7. Data availability: The data that support the findings of this study are available from Volleymetrics. Restrictions apply to the availability of these data, which were used under license for this study.

Appendix

This appendix provides more diagnostic details on the model fit, as well as details on the dataset. Table 7 is a glossary of common attack codes. Figure 12 shows a histogram of the frequency with which each state of the Markov chain is observed, for estimating the one-step transition probabilities in (1). Because of the sheer volume of data, the empirical transition probabilities have reasonably low standard errors.
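To make the link between sample size and standard error concrete, here is a minimal sketch of empirical one-step transition probabilities with binomial standard errors. The function name `transition_estimates`, the state labels and the toy observations are hypothetical; only the sample size of 188,810 is taken from the Figure 12 caption, and the probability of 0.5 paired with it is an assumed value.

```python
from collections import Counter
from math import sqrt

def transition_estimates(transitions):
    """Map (state, next_state) -> (estimate, standard error, sample size)."""
    pair_counts = Counter(transitions)
    state_counts = Counter(state for state, _ in transitions)
    estimates = {}
    for (state, nxt), n_pair in pair_counts.items():
        n_state = state_counts[state]  # transitions observed out of this state
        p_hat = n_pair / n_state       # empirical transition probability
        se = sqrt(p_hat * (1.0 - p_hat) / n_state)  # binomial standard error
        estimates[(state, nxt)] = (p_hat, se, n_state)
    return estimates

# Hypothetical observations: state "A" is left four times.
toy = [("A", "B"), ("A", "B"), ("A", "B"), ("A", "C")]
estimates = transition_estimates(toy)
print(estimates[("A", "B")])

# Illustrative only: an assumed probability of 0.5 out of a state observed
# 188,810 times (the largest sample size reported for Figure 12).
se_large = sqrt(0.5 * 0.5 / 188_810)
print(round(se_large, 4))
```

Because the standard error shrinks with the square root of the number of transitions observed from a state, the rarely visited states in the left tail of the histogram are the ones where the empirical probabilities are least precise.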

Table 7:

A glossary of the most common attack codes, excluding those which occur with frequency less than 1 %. These 13 common attack codes cover 95 % of attacks, with 15 more attack codes covering the remaining 5 %.

| Code | Name | Description | Frequency |
|---|---|---|---|
| X5 | Go | Attack in zone 4 by outside hitter following an in-system pass | 30 % |
| V5 | Hut | Attack in zone 4 by outside hitter following an out-of-system pass | 16 % |
| X6 | X | Attack in zone 2 by opposite hitter following an in-system pass | 15 % |
| X1 | Front quick | Quick ball to middle blocker in zone 3/4, in front of the setter | 9 % |
| V6 | Red | Attack in zone 2 by opposite hitter following an in-system pass | 5 % |
| CF | Slide | Attack in zone 2 by middle blocker off of one foot, near antenna | 5 % |
| PP | Dump | Attack by setter | 4 % |
| X7 | Gap | Quick attack between zone 3 and 4 | 3 % |
| VP | Pipe | Attack in zone 6 by outside/opposite hitter following an out-of-system pass | 3 % |
| XM | Quick center | Quick ball to middle blocker in zone 3 (setter off net; MB doesn’t follow) | 2 % |
| XP | Bic | Attack in zone 6 by outside/opposite hitter following an in-system pass | 2 % |
| X3 | Wing mid | Attack in zone 3 by outside/opposite hitter | 1 % |
| X2 | Back quick | Quick ball to middle blocker or opposite hitter, behind setter | 1 % |

Figure 12: Distribution of sample size across game states for the Markov chain model detailed in Section 3.1. The sample size is the number of transitions observed from each state. We exclude the terminal states and the initial state (S, Sv) from this histogram. The state with the greatest sample size is (R, R+), with 188,810 observed transitions.

Table 8 provides a summary of the fit for the serve/reception model detailed in Section 3.2.1, and Table 9 provides a summary of the fit for the set/attack/block/dig models detailed in Section 3.2.2. All eight LMMs converged without error. The error distribution for each of these models violates the normal distribution assumption because the response variable in each case only takes on a handful of values. The error distribution is therefore discrete-valued and multi-modal. Schielzeth et al. (2020) provide a recent analysis of the consequences of violating the normal-error assumption for LMMs. Based on their findings, we can expect a slight upward bias in the random effect standard deviations reported in Tables 8 and 9. In other words, the predicted player random effects will be less regularized toward the mean than would be optimal.
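The connection between an inflated random effect standard deviation and weaker regularization can be sketched with the textbook shrinkage weight for a one-way random-intercept model. This is a simplification assumed for illustration only: the models in Sections 3.2.1 and 3.2.2 have several crossed random effects, so the weight below is not the one they apply. The server and residual standard deviations (0.018 and 0.161) are taken from Table 8; the number of serves and the smaller comparison value of 0.016 are hypothetical.

```python
def shrinkage_weight(sd_effect, sd_residual, n_obs):
    """Weight on a player's observed mean under a one-way random-intercept model.

    The predicted effect is weight * (player mean - overall mean), so a weight
    closer to 1 means less regularization toward the mean.
    """
    var_effect = sd_effect ** 2
    return var_effect / (var_effect + sd_residual ** 2 / n_obs)

n_serves = 300  # hypothetical number of serves for one player

# Server and residual standard deviations as reported in Table 8.
w_reported = shrinkage_weight(0.018, 0.161, n_serves)

# Hypothetical slightly smaller server standard deviation, for comparison.
w_smaller = shrinkage_weight(0.016, 0.161, n_serves)

print(round(w_reported, 3), round(w_smaller, 3))
```

Under this simplified model, an upward-biased standard deviation yields a larger weight on the player's own data, which is the sense in which the predictions are pulled toward the mean less than would be optimal.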

Table 8:

Estimated variance components (expressed as standard deviations) from the linear mixed-effects model for serve/reception outcomes detailed in Section 3.2.1.

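| Group | Random effect | Symbol | Estimated standard deviation |
|---|---|---|---|
| Serving team | Conference | $\hat{\sigma}_{\gamma}$ | 0.005 |
| Serving team | Team | $\hat{\sigma}_{\tau}$ | 0.007 |
| Serving team | Server | $\hat{\sigma}_{\pi}$ | 0.018 |
| Receiving team | Conference | $\hat{\sigma}_{\tilde{\gamma}}$ | 0.009 |
| Receiving team | Team | $\hat{\sigma}_{\tilde{\tau}}$ | 0.011 |
| Receiving team | Receiver | $\hat{\sigma}_{\rho}$ | 0.014 |
| Residual | | $\hat{\sigma}_{\epsilon}$ | 0.161 |
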
Table 9:

Estimated variance components (expressed as standard deviations) from the seven linear mixed-effects models for set/attack/block/dig outcomes detailed in Section 3.2.2.

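Columns, left to right: Model; random effect standard deviations for the attacking team: Conference ($\hat{\sigma}_{\gamma}$), Team ($\hat{\sigma}_{\tau}$), Setter ($\hat{\sigma}_{\theta}$), Attacker ($\hat{\sigma}_{\psi}$); random effect standard deviations for the defending team: Conference ($\hat{\sigma}_{\tilde{\gamma}}$), Team ($\hat{\sigma}_{\tilde{\tau}}$), Blocker ($\hat{\sigma}_{\beta}$), Digger ($\hat{\sigma}_{\delta}$); residual standard deviation ($\hat{\sigma}_{\epsilon}$).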
(1) attack error indicator 0.004 0.006 0.011 0.001 0.181
(2) clean attack indicator 0.002 0.003 0.006 0.002 0.002 0.003 0.007 0.060
(3) block error indicator 0.006 0.008 0.014 0.003 0.006 0.005 0.019 0.265
(4) block-through indicator 0.004 0.007 0.013 0.005 0.007 0.007 0.025 0.165
(5) block-return outcome 0.021 0.022 0.025 0.011 0.017 0.016 0.007 0.023 0.246
(6) block-through outcome 0.003 0.005 0.010 0.003 0.006 0.008 0.010 0.026 0.213
(7) clean attack outcome 0.007 0.011 0.015 0.002 0.005 0.008 0.025 0.252

To test the sensitivity of the results to the manner in which blocker/digger identities were inferred (when there is no recorded block/dig touch), as detailed in Section 3.2.3, we performed an experiment in which we randomly assigned responsibility. For blocking opportunities with no block touch recorded, we randomly selected one front-row player (with equal 1/3 probability) to bear the responsibility. For digging opportunities with no dig touch recorded, we randomly selected one back-row player (with equal 1/3 probability) to bear the responsibility. We repeated the full analysis with this random assignment and evaluated the correlation between these randomized results and the original results. Specifically, we focused on adjusted Points Gained per set for each skill, i.e., the columns reported in Table 5. We calculated the correlation (weighted by sets played) across all Division I players in adjusted Points Gained per set by total (0.989 ± 0.003), by serving (1.000 ± 0.000), by passing (0.983 ± 0.003), by setting (1.000 ± 0.000), by attacking (0.999 ± 0.001) and by blocking (0.933 ± 0.007). Passing and blocking Points Gained per set are the most affected by this randomized assignment because they are directly impacted by the assignment, whereas the other skill measurements are only indirectly impacted through strength of schedule (the serving skill is insulated from this impact). We observe blocker identity for 41 % of block opportunities, and we observe digger identity for 82 % of dig opportunities. The correlations reported here are a conservative underestimate of the correlations between our results and those that would be obtained using the true blocker/digger responsibilities, because the randomized assignment is a very naive alternative.
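The two ingredients of this experiment, uniform random assignment of responsibility and a correlation weighted by sets played, can be sketched as follows. The function names, player labels and numeric values are hypothetical, and the sketch does not reproduce how the reported ± intervals were obtained.

```python
import numpy as np

def assign_responsibility(candidates, rng):
    """Pick one candidate uniformly at random (probability 1/3 for three players)."""
    return candidates[rng.integers(len(candidates))]

def weighted_corr(x, y, weights):
    """Pearson correlation between x and y with observation weights."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, weights))
    w = w / w.sum()                      # normalize weights to sum to 1
    mean_x, mean_y = np.sum(w * x), np.sum(w * y)
    cov_xy = np.sum(w * (x - mean_x) * (y - mean_y))
    var_x = np.sum(w * (x - mean_x) ** 2)
    var_y = np.sum(w * (y - mean_y) ** 2)
    return cov_xy / np.sqrt(var_x * var_y)

rng = np.random.default_rng(0)

# Hypothetical front-row labels for a block opportunity with no recorded touch.
front_row = ["front_left", "front_middle", "front_right"]
blocker = assign_responsibility(front_row, rng)

# Hypothetical adjusted Points Gained per set for four players under the
# original and the randomized assignment, with their sets played.
original = np.array([0.30, -0.10, 0.05, 0.22])
randomized = np.array([0.28, -0.12, 0.09, 0.20])
sets_played = np.array([110, 95, 40, 120])

r = weighted_corr(original, randomized, sets_played)
print(blocker, round(r, 3))
```

Weighting by sets played keeps players with very little playing time, whose per-set values are noisy, from dominating the comparison.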

References

American Volleyball Coaches Association. (n.d.). 2022 women’s DI all-Americans, Available at: https://www.avca.org/award/2022-womens-di-all-americans/ (Accessed 23 September 2023).

Bagley, C. and Ware, B. (2017). Bump, set, spike: using analytics to rate volleyball teams and players. In: MIT Sloan sports analytics conference.

Bates, D., Mächler, M., Bolker, B., and Walker, S. (2015). Fitting linear mixed-effects models using lme4. J. Stat. Software 67: 1–48, https://doi.org/10.18637/jss.v067.i01.

Bukiet, B., Harold, E.R., and Palacios, J.L. (1997). A Markov chain approach to baseball. Oper. Res. 45: 14–23, https://doi.org/10.1287/opre.45.1.14.

Drikos, S. (2018). Pass level and the outcome of attack for age categories in male volleyball. J. Phys. Act. Nutr. Rehabil.: 428–437.

Drikos, S., Ntzoufras, I., and Apostolidis, N. (2019). Bayesian analysis of skills importance in world champions men’s volleyball across ages. Int. J. Comput. Sci. Sport 18, https://doi.org/10.2478/ijcss-2019-0002.

Echlin, G. (2024). The Pro Volleyball Federation for women debuts and draws a record crowd. National Public Radio, Available at: https://www.npr.org/2024/01/28/1227096824/the-pro-volleyball-federation-for-women-debuts-and-draws-a-record-crowd (Accessed 28 January 2024).

Egidi, L. and Ntzoufras, I. (2020). A Bayesian quest for finding a unified model for predicting volleyball games. J. R. Stat. Soc., C: Appl. Stat. 69: 1307–1336, https://doi.org/10.1111/rssc.12436.

Fellingham, G.W. (2022). Evaluating the performance of elite level volleyball players. J. Quant. Anal. Sports 18: 15–34, https://doi.org/10.1515/jqas-2021-0056.

Florence, L.W., Fellingham, G.W., Vehrs, P.R., and Mortensen, N.P. (2008). Skill evaluation in women’s volleyball. J. Quant. Anal. Sports 4, https://doi.org/10.2202/1559-0410.1105.

Gabrio, A. (2021). Bayesian hierarchical models for the prediction of volleyball results. J. Appl. Stat. 48: 301–321, https://doi.org/10.1080/02664763.2020.1723506.

Hass, Z. and Craig, B.A. (2018). Exploring the potential of the plus/minus in NCAA women’s volleyball via the recovery of court presence information. J. Sports Anal. 4: 285–295, https://doi.org/10.3233/JSA-180217.

Hileno, R., Arasanz, M., and García-de-Alcaraz, A. (2020). The sequencing of game complexes in women’s volleyball. Front. Psychol. 11: 739, https://doi.org/10.3389/fpsyg.2020.00739.

Laporta, L., Nikolaidis, P., Thomas, L., and Afonso, J. (2015). Attack coverage in high-level men’s volleyball: organization on the edge of chaos? J. Hum. Kinet. 47: 249–257, https://doi.org/10.1515/hukin-2015-0080.

Miskin, M.A., Fellingham, G.W., and Florence, L.W. (2010). Skill importance in women’s volleyball. J. Quant. Anal. Sports 6, https://doi.org/10.2202/1559-0410.1234.

Newton, P.K. and Aslam, K. (2009). Monte Carlo tennis: a stochastic Markov chain model. J. Quant. Anal. Sports 5, https://doi.org/10.2202/1559-0410.1169.

Ntzoufras, I., Palaskas, V., and Drikos, S. (2021). Bayesian models for prediction of the set-difference in volleyball. IMA J. Manag. Math. 32: 491–518, https://doi.org/10.1093/imaman/dpab007.

Olson, E. (2023). Nebraska volleyball stadium event draws 92,003 to set women’s world attendance record. Associated Press, Available at: https://apnews.com/article/nebraska-volleyball-attendance-record-38f103fe2100a368cddb19b75e1adb8d (Accessed 30 August 2023).

Pfeiffer, M., Zhang, H., and Hohmann, A. (2010). A Markov chain model of elite table tennis competition. Int. J. Sports Sci. Coach. 5: 205–222, https://doi.org/10.1260/1747-9541.5.2.205.

Schielzeth, H., Dingemanse, N., Nakagawa, S., Westneat, D., Allegue, H., Teplitsky, C., Réale, D., Dochtermann, N., Garamszegi, L., and Araya-Ajoy, Y. (2020). Robustness of linear mixed-effects models to violations of distributional assumptions. Methods Ecol. Evol. 11: 1141–1152, https://doi.org/10.1111/2041-210X.13434.

Winston, W.L., Nestler, S., and Pelechrinis, K. (2022). Mathletics: how gamblers, managers, and sports enthusiasts use mathematics in sports. Princeton University Press, Princeton, NJ, https://doi.org/10.2307/j.ctv1t8q8wq.

Received: 2024-03-04
Accepted: 2025-01-15
Published Online: 2025-02-24
Published in Print: 2025-06-26

© 2025 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 19.11.2025 from https://www.degruyterbrill.com/document/doi/10.1515/jqas-2024-0038/pdf