Abstract
This paper studies sports Strength of Schedule (SoS) metrics and their utility for decision-makers in creating season schedules. We identify limitations of the widely used Bowl Championship Series (BCS) SoS metric, including infrequent rank changes and reduced variance, which limit its effectiveness in assessing true schedule difficulty. In response, we introduce two novel metrics: Probabilistic Random Walk Length (PRWL) SoS and Customized-Distributional (CD) SoS. The PRWL SoS leverages random walks on a graph to provide a comprehensive measure that incorporates the difficulty of a team’s opponents’ schedules. The CD SoS offers a customizable approach, incorporating team-specific goals and preferences, for a more tailored assessment of schedule strength. Additionally, we propose an SoS variability metric to evaluate the variability of SoS among teams, promoting schedule balance. Through a detailed case study involving a large sports conference, we demonstrate the practical application and impact of our proposed metrics. The case study illustrates the superiority of the proposed scheduling format over the existing one, prompting the conference to adopt a new schedule format and eliminate its previous divisional structure. Our contributions provide more accurate tools for evaluating schedule difficulty, enhancing schedule balance, and optimizing sports schedules, ultimately promoting fairness and competitiveness in sports leagues.
Acknowledgments
The authors would like to thank the sports conference for their financial support and for providing the data necessary for this study.
- Research ethics: This study did not involve human participants or animal subjects and solely utilized simulation methods. Therefore, ethical approval was not required.
- Informed consent: As no human participants were involved in this simulation study, informed consent was not applicable.
- Author contributions: Adam DeHollander: Conceptualization, Methodology, Data Analysis, Writing – Original Draft. Mark Karwan: Supervision, Project Administration, Funding Acquisition, Conceptualization, Methodology, Writing – Review & Editing.
- Use of Large Language Models, AI and Machine Learning Tools: A large language model (LLM) was used solely to assist with wording and language refinement in portions of the manuscript. The model was not used for generating original content, conducting analyses, or contributing to the conceptual or methodological aspects of the work.
- Conflict of interest: The authors declare no conflicts of interest related to this study.
- Research funding: This research was supported by funding from a sports conference. The development of the Strength of Schedule (SoS) metrics was conducted independently from this funding; the funding facilitated the application of these metrics to the conference’s data.
- Data availability: The data generated and analyzed during this study are available from the corresponding author upon reasonable request.
© 2025 Walter de Gruyter GmbH, Berlin/Boston