Abstract
Over the last decade, the offensive and defensive philosophies employed by teams in the National Basketball Association (NBA) have changed substantially. As a result, most players can no longer be assigned to a single one of the five traditional positions (PG, SG, SF, PF, C); instead, they split their playing time across multiple positions, which makes positional data compositional. Further, given the demand for versatile players, it can be argued that the traditional positions themselves are archaic. Using data from the 2016–17, 2017–18, and 2018–19 seasons, I explore how Bayesian hierarchical models can be used to estimate team defensive strength in three ways: first, by classifying each player by their majority traditional position; second, by using compositional data on traditional positions; and third, by using compositional data on modern positions (archetypes) defined by fuzzy k-means clustering. I find that the fuzzy k-means approach yields a modest improvement in both the root mean squared error and the median width of the 95% posterior predictive interval on the test data and, more importantly, identifies 11 modern archetypes that, when combined, are correlated with team win total and adjusted team defensive rating. Stakeholders can use these modern archetype compositions to better understand team defensive strength.
Acknowledgment
The author would like to thank Eric Callahan and Daniel Keidar for their efforts in scraping the data used in this paper. Cheers!
- Research ethics: Not applicable.
- Author contributions: The author has accepted responsibility for the entire content of this manuscript and approved its submission.
- Competing interests: The author states no conflict of interest.
- Research funding: None declared.
- Data availability: All source data files plus a Quarto document displaying the code used for the paper are available as supplementary files.
Supplementary Material
This article contains supplementary material (https://doi.org/10.1515/jqas-2024-0010).
© 2024 Walter de Gruyter GmbH, Berlin/Boston