Predicting elite NBA lineups using individual player order statistics

Susan E. Martonosi; Martin Gonzalez; Nicolas Oshiro

doi:10.1515/jqas-2022-0039

Published/Copyright: May 23, 2023

From the journal Journal of Quantitative Analysis in Sports, Volume 19, Issue 2

Abstract

NBA team managers and owners try to acquire high-performing players. An important consideration in these decisions is how well the new players will perform in combination with their teammates. Our objective is to identify elite five-person lineups, which we define as those having a positive plus-minus per minute (PMM). Using individual player order statistics, our model can identify an elite lineup even if the five players in the lineup have never played together, which can inform player acquisition decisions, salary negotiations, and real-time coaching decisions. We combine seven classification tools into a unanimous consent classifier (all-or-nothing classifier, or ANC) in which a lineup is predicted to be elite only if all seven classifiers predict it to be elite. In this way, we achieve high positive predictive value (i.e., precision), the likelihood that a lineup classified as elite will indeed have a positive PMM. We train and test the model on individual player and lineup data from the 2017–18 season and use the model to predict the performance of lineups drawn from all 30 NBA teams’ 2018–19 regular season rosters. Although the ANC is conservative and misses some high-performing lineups, it achieves high precision and recommends positionally balanced lineups.

Keywords: basketball; classification; lineups; point differential

Corresponding author: Susan E. Martonosi, Harvey Mudd College, Claremont, CA, USA, E-mail: martonosi@g.hmc.edu

Funding source: National Science Foundation

Award Identifier / Grant number: DMS-1757952

Funding source: Harvey Mudd College

Acknowledgments

The authors thank Isys Johnson, Lucius Bynum, and Robert Gonzalez for their contributions to earlier phases of this work and to the code base, portions of which were adapted and used in this paper. Lastly, the authors thank the anonymous reviewers and editors whose feedback greatly improved the analysis.

Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: This material is based upon work supported by the National Science Foundation under Grant No. DMS-1757952. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors would also like to acknowledge financial support from Harvey Mudd College.
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Appendix A: Individual player statistics used as predictors in ANC

Table 16:

Individual player statistics used as predictors in ANC.

Statistic	Description
FGM	Field goals made per minute
FGA	Field goals attempted per minute
FGPCT	Field goal percentage
FG3M	Three-point field goals made per minute
FG3A	Three-point field goals attempted per minute
FG3PCT	Three-point field goal percentage
FTM	Free throws made per minute
FTA	Free throws attempted per minute
FTPCT	Free throw percentage
OREB	Offensive rebounds per minute
DREB	Defensive rebounds per minute
AST	Assists per minute
TOV	Turnovers per minute
STL	Steals per minute
BLK	Blocks per minute
BLKA	Field goal attempts blocked by an opponent per minute
PF	Personal fouls per minute
PTS	Points earned per minute
PFD	Personal fouls drawn per minute
PMM	Plus-minus per minute
CONTESTEDSHOTS	Shots contested per minute
CONTESTEDSHOTS2PT	Two-point shots contested per minute
CONTESTEDSHOTS3PT	Three-point shots contested per minute
CHARGESDRAWN	Charges drawn per minute
DEFLECTIONS	Passes deflected per minute
LOOSEBALLSRECOVERED	Loose balls recovered per minute
SCREENASSISTS	Screens that led to baskets per minute
BOXOUTS	Box outs per minute

Appendix B: Comparison of ANC to simpler model

One might also wonder whether the complete set of five-player order statistics is required by the ANC to achieve high precision. In this section, we analyze a simpler model that uses only the first order statistics (i.e., the lineup’s minimum) of each individual player metric used by the ANC.
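To make the distinction concrete, the following is a minimal Python sketch of how lineup-level order statistics might be built from individual player metrics and then restricted to the first order statistic. The function names, the feature naming scheme (for example, `AST_1` for the lineup minimum of AST), and the numeric values are illustrative assumptions, not the authors' code or data.

```python
from typing import Dict, List


def lineup_order_statistics(players: List[Dict[str, float]]) -> Dict[str, float]:
    """Sort each metric across the five players; k=1 is the lineup minimum, k=5 the maximum."""
    if len(players) != 5:
        raise ValueError("a lineup must contain exactly five players")
    features: Dict[str, float] = {}
    for metric in players[0]:
        ranked = sorted(p[metric] for p in players)
        for k, value in enumerate(ranked, start=1):
            features[f"{metric}_{k}"] = value
    return features


def first_order_only(features: Dict[str, float]) -> Dict[str, float]:
    """Keep only the first order statistic (lineup minimum) of each metric."""
    return {name: v for name, v in features.items() if name.endswith("_1")}


# Hypothetical per-minute values for five players (illustration only).
lineup = [
    {"AST": 0.10, "STL": 0.03},
    {"AST": 0.25, "STL": 0.05},
    {"AST": 0.05, "STL": 0.02},
    {"AST": 0.18, "STL": 0.04},
    {"AST": 0.12, "STL": 0.01},
]
full = lineup_order_statistics(lineup)   # 2 metrics x 5 order statistics
simple = first_order_only(full)          # 2 metrics x 1 order statistic
print(simple)
```

Under this scheme the full feature set has five columns per metric, while the simple model sees only one, so any information carried by the strongest players in a lineup is discarded.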

We tune the simple model parameters as in Section 4.1, using ten-fold cross-validation. The parameter combination that lies on the efficient frontier of average precision and worst-case precision over the folds on the training data is given in Table 17. This combination achieved an average precision of 86.5 %, minimum precision of 57.1 % and average accuracy of 51.8 % on the training data. When the performance was insensitive to a parameter value, the value was chosen to match that used in the ANC.
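The selection criterion can be sketched as a two-objective dominance check. The Python example below is an illustration under stated assumptions: the combination names and per-fold precisions are hypothetical, only three folds are shown instead of ten, and the dominance rule is the standard Pareto definition rather than a procedure confirmed by the text.

```python
from statistics import mean
from typing import Dict, List, Tuple


def summarize(fold_precisions: List[float]) -> Tuple[float, float]:
    """Return (average precision, worst-case precision) over the folds."""
    return mean(fold_precisions), min(fold_precisions)


def efficient_frontier(candidates: Dict[str, List[float]]) -> List[str]:
    """Return the candidates that no other candidate dominates on both criteria."""
    scores = {name: summarize(p) for name, p in candidates.items()}
    frontier = []
    for name, (avg, worst) in scores.items():
        dominated = any(
            a >= avg and w >= worst and (a > avg or w > worst)
            for other, (a, w) in scores.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical per-fold precisions for three parameter combinations.
candidates = {
    "combo_A": [0.9, 0.6, 0.9],    # best average, weaker worst case
    "combo_B": [0.8, 0.7, 0.75],   # lower average, best worst case
    "combo_C": [0.7, 0.6, 0.8],    # dominated by combo_A
}
frontier = efficient_frontier(candidates)
print(frontier)
```

A frontier can contain more than one combination; choosing among them requires weighing average precision against worst-case precision.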

Table 17:

Tuned parameter values used in simple model based on first order statistics.

Subclassifier	Parameter	Chosen value
Decision tree	`cp` (cost complexity)	−1
	`loss` (misclassification penalty)	1
Random forest	`c` (cutoff)	0.7
	`ntree` (number of trees)	100
Boosting	`mfinal` (number of trees)	500
	`maxdepth` (depth of each tree)	3
	`cp` (cost complexity)	0.01
Support vector machine	`cost` (misclassification penalty)	0.1
	`gamma` (influence decay)	0.01
K-nearest neighbors	`k` (number of neighbors)	5
Logistic regression	`thresh` (1 − probability threshold)	0.25
All-or-nothing classifier (ANC)	`numVotes` (agreement required)	7
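The role of `numVotes` can be illustrated with a short Python sketch. It assumes that `numVotes` is the minimum number of subclassifiers that must vote "elite", so that a value of 7 with seven subclassifiers is unanimous consent. The function name and the vote vectors are hypothetical.

```python
from typing import Sequence


def all_or_nothing(votes: Sequence[bool], num_votes: int) -> bool:
    """Predict elite only if at least num_votes subclassifiers vote elite."""
    return sum(votes) >= num_votes


# Hypothetical votes from seven subclassifiers (True = predicted elite).
unanimous = [True] * 7
one_dissent = [True] * 6 + [False]

print(all_or_nothing(unanimous, num_votes=7))
print(all_or_nothing(one_dissent, num_votes=7))
```

Requiring every vote means a single dissenting subclassifier vetoes the prediction, which favors precision at the expense of missed elite lineups.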

Having tuned the parameters, we fit the first order statistic model to the full, standardized, training set, as described earlier, and apply the trained model to the testing data. The confusion matrix is given in Table 18.

Table 18:

Confusion matrix for the simple model based on first order statistics applied to the test data set. Of the 12 lineups predicted to be elite, nine have a true label of elite, corresponding to a precision of 75.0 %.

		Predicted class
		Elite	Not elite
True class	Elite	9	86
	Not elite	3	78

Of the twelve lineups predicted to be elite, nine have a true label of elite, that is, a strictly positive PMM. The simpler model therefore achieves a testing precision of only 75.0 %, compared with the ANC's testing precision of 86.7 %.
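The precision figure follows directly from the counts in Table 18, as the short Python check below shows. The recall and accuracy values are derived here from the same counts for context and are not reported in the text; the variable names are illustrative.

```python
# Counts from Table 18 (rows: true class, columns: predicted class).
tp, fn = 9, 86   # truly elite: predicted elite / predicted not elite
fp, tn = 3, 78   # truly not elite: predicted elite / predicted not elite

precision = tp / (tp + fp)                   # 9 / 12
recall = tp / (tp + fn)                      # 9 / 95
accuracy = (tp + tn) / (tp + fn + fp + tn)   # 87 / 176

print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
print(f"accuracy  = {accuracy:.3f}")
```

The low recall shows how conservative the unanimous vote is: most truly elite test lineups are not flagged, but those that are flagged are usually elite.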

Appendix C: Actual lineup performance for LAL and GSW case study

Table 19:

Actual lineup performance compared to ANC predictions for the Los Angeles Lakers during the 2018–19 season, for all lineups having at least 25 min of playing time. ‘−’ denotes lineups for which no ANC prediction is given.

Los Angeles Lakers
Lineup	Minutes played	Actual PMM	ANC prediction
R. Rondo, K. Caldwell-Pope, B. Ingram, I. Zubac^a, J. Hart	25	0.68	−
L. James, B. Ingram, I. Zubac, L. Ball, K. Kuzma	55	0.36	−
T. Chandler, L. James, K. Caldwell-Pope, L. Ball, K. Kuzma	39	0.36	Not elite
L. James, J. McGee, L. Ball, K. Kuzma, J. Hart	133	0.31	Not elite
L. James, R. Rondo, J. McGee, K. Caldwell-Pope, K. Kuzma	47	0.23	Not elite
T. Chandler, L. Stephenson, K. Caldwell-Pope, B. Ingram, J. Hart	37	0.21	Not elite
T. Chandler, B. Ingram, L. Ball, K. Kuzma, J. Hart	36	0.14	Not elite
T. Chandler, L. James, B. Ingram, L. Ball, K. Kuzma	61	0.13	Not elite
L. James, R. Rondo, J. McGee, K. Caldwell-Pope, B. Ingram	31	0.13	Not elite
B. Ingram, I. Zubac, L. Ball, K. Kuzma, J. Hart	39	0.13	−
L. James, J. McGee, K. Caldwell-Pope, L. Ball, K. Kuzma	34	0.12	Not elite
L. James, J. McGee, R. Bullock, B. Ingram, K. Kuzma	73	0.11	Not elite
T. Chandler, K. Caldwell-Pope, B. Ingram, K. Kuzma, J. Hart	31	0.10	Not elite
T. Chandler, K. Caldwell-Pope, B. Ingram, L. Ball, K. Kuzma	45	0.04	Not elite
T. Chandler, L. James, L. Ball, K. Kuzma, J. Hart	66	0.02	Not elite
L. James, R. Rondo, J. McGee, B. Ingram, K. Kuzma	43	0.00	Not elite
L. James, J. McGee, B. Ingram, L. Ball, K. Kuzma	234	0.00	Not elite
L. James, R. Rondo, R. Bullock, B. Ingram, K. Kuzma	62	−0.05	Not elite
J. McGee, K. Caldwell-Pope, M. Muscala, A. Caruso, J. Jones^b	31	−0.06	−
L. James, K. Caldwell-Pope, L. Ball, K. Kuzma, J. Hart	25	−0.16	Not elite
L. James, R. Rondo, J. McGee, R. Bullock, K. Kuzma	62	−0.21	Not elite
R. Rondo, K. Caldwell-Pope, B. Ingram, I. Zubac, K. Kuzma	29	−0.24	−
L. James, L. Stephenson, L. Ball, K. Kuzma, J. Hart	31	−0.25	Not elite
R. Rondo, M. Beasley, K. Caldwell-Pope, B. Ingram, I. Zubac	25	−0.28	−
J. McGee, K. Caldwell-Pope, B. Ingram, L. Ball, J. Hart	25	−0.32	Not elite
L. James, R. Rondo, B. Ingram, I. Zubac, K. Kuzma	33	−0.43	Not elite
J. McGee, B. Ingram, L. Ball, K. Kuzma, J. Hart	83	−0.47	Not elite
R. Rondo, J. McGee, K. Caldwell-Pope, A. Caruso, M. Wagner^c	27	−1.31	−

^aIvica Zubac was traded to the Los Angeles Clippers and was not included in ANC predictions for the Lakers.
^bJemerrio Jones did not have data from the 2017–18 NBA regular season.
^cMoritz Wagner did not have data from the 2017–18 NBA regular season.

Table 20:

Actual lineup performance compared to ANC predictions for the Golden State Warriors during the 2018–19 season, for all lineups having at least 25 min of playing time. ‘−’ denotes lineups for which no ANC prediction is given.

Golden State Warriors
Lineup	Minutes played	Actual PMM	ANC prediction
A. McKinnie, D. Green, K. Looney, S. Livingston, S. Curry	28	1.01	Not elite
A. Iguodala, D. Green, K. Durant, K. Looney, S. Curry	25	0.80	Elite
A. Iguodala, D. Cousins, D. Green, K. Thompson, S. Curry	29	0.77	Elite
A. Iguodala, J. Bell, K. Durant, K. Thompson, S. Curry	36	0.73	Elite
A. Iguodala, D. Green, K. Durant, K. Thompson, S. Curry	178	0.69	Elite
A. Iguodala, K. Durant, K. Looney, K. Thompson, Q. Cook	35	0.63	Not elite
A. McKinnie, J. Jerebko, K. Durant, K. Looney, S. Curry	26	0.61	Not elite
D. Green, K. Durant, K. Looney, K. Thompson, S. Curry	313	0.39	Elite
D. Cousins, D. Green, K. Durant, K. Thompson, S. Curry	268	0.29	Elite
A. Bogut, D. Green, K. Durant, K. Thompson, S. Curry	83	0.27	Not elite
A. Iguodala, D. Cousins, D. Green, K. Thompson, S. Livingston	67	0.24	Not elite
D. Jones, J. Jerebko, K. Durant, K. Thompson, Q. Cook	29	0.20	Not elite
A. McKinnie, A. Iguodala, K. Durant, K. Looney, S. Curry	48	0.19	Not elite
A. Iguodala, K. Durant, K. Looney, K. Thompson, S. Curry	141	0.17	Elite
A. Iguodala, J. Jerebko, K. Durant, K. Looney, K. Thompson	47	0.13	Not elite
A. McKinnie, A. Iguodala, J. Jerebko, K. Looney, S. Curry	27	0.11	Not elite
D. Jones, D. Green, K. Durant, K. Thompson, S. Curry	142	0.11	Not elite
A. Iguodala, D. Green, J. Jerebko, S. Livingston, S. Curry	54	0.07	Not elite
A. Iguodala, D. Cousins, K. Thompson, Q. Cook, S. Livingston	39	0.05	Not elite
A. Iguodala, D. Green, J. Jerebko, K. Thompson, S. Livingston	26	0.00	Not elite
D. Lee, J. Jerebko, K. Looney, K. Thompson, S. Livingston	30	0.00	Not elite
D. Green, J. Jerebko, K. Durant, K. Thompson, S. Curry	45	−0.22	Elite
A. Iguodala, D. Jones, K. Durant, K. Thompson, Q. Cook	77	−0.33	Not elite
D. Green, J. Bell, K. Durant, K. Thompson, S. Curry	26	−0.38	Elite
A. McKinnie, D. Cousins, D. Green, K. Durant, S. Curry	32	−0.56	Not elite
A. McKinnie, J. Evans^a, J. Jerebko, J. Bell, Q. Cook	37	−0.57	−

^aJacob Evans did not have data from the 2017–18 NBA regular season.

Received: 2021-06-08

Accepted: 2023-05-08

Published Online: 2023-05-23

Published in Print: 2023-06-27
