Abstract
Using high-resolution player tracking data made available by the National Football League (NFL) for their 2019 Big Data Bowl competition, we introduce the Expected Hypothetical Completion Probability (EHCP), an objective framework for evaluating plays. At the heart of EHCP is the question “on a given passing play, did the quarterback throw the pass to the receiver who was most likely to catch it?” To answer this question, we first built a Bayesian non-parametric catch probability model that automatically accounts for complex interactions between inputs like the receiver’s speed and distances to the ball and nearest defender. While building such a model is, in principle, straightforward, using it to reason about a hypothetical pass is challenging because many of the model inputs corresponding to a hypothetical are necessarily unobserved. For instance, it is impossible to observe how close an untargeted receiver would be to his nearest defender had the pass been thrown to him instead of the receiver who was actually targeted. To overcome this fundamental difficulty, we propose imputing the unobservable inputs and averaging our model predictions across these imputations to derive EHCP. In this way, EHCP can track how the completion probability evolves for each receiver over the course of a play in a way that accounts for the uncertainty about missing inputs.
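The impute-then-average step described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the logistic `catch_probability` function and its coefficients stand in for the Bayesian non-parametric model, the truncated normal draw in `impute_separation` stands in for whatever imputation scheme is actually used, and the names `ehcp`, `dist_to_ball`, and `speed` are hypothetical.

```python
import math
import random
from statistics import fmean


def catch_probability(separation, dist_to_ball, speed):
    """Hypothetical stand-in for a fitted catch probability model.

    The coefficients are made up for illustration; they are not
    estimates from the paper's Bayesian non-parametric model.
    """
    score = -0.5 + 0.4 * separation - 0.15 * dist_to_ball + 0.05 * speed
    return 1.0 / (1.0 + math.exp(-score))


def impute_separation(rng, mean=2.0, sd=1.0):
    """Draw one plausible receiver-to-defender separation.

    Assumed distribution: normal, truncated at zero because a
    distance cannot be negative.
    """
    return max(0.0, rng.gauss(mean, sd))


def ehcp(dist_to_ball, speed, n_imputations=1000, seed=0):
    """Average the model prediction over imputed values of the
    input that cannot be observed for a hypothetical pass."""
    rng = random.Random(seed)
    draws = [
        catch_probability(impute_separation(rng), dist_to_ball, speed)
        for _ in range(n_imputations)
    ]
    return fmean(draws)


if __name__ == "__main__":
    # Compare two hypothetical targets on the same play.
    print(ehcp(dist_to_ball=5.0, speed=6.0))
    print(ehcp(dist_to_ball=12.0, speed=6.0))
```

Averaging over many imputed values, rather than plugging in a single best guess, is what lets the resulting number reflect uncertainty about the missing input.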
©2019 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Frontmatter
- Research note
- Bigger data, better questions, and a return to fourth down behavior: an introduction to a special issue on tracking data in the National Football League
- Commentary
- What will we unlearn next? The implications of Lopez (2020)
- Research articles
- Expected hypothetical completion probability
- Extracting NFL tracking data from images to evaluate quarterbacks and pass defenses
- Route identification in the National Football League
- Template matching route classification
- Unsupervised methods for identifying pass coverage among defensive backs with NFL player tracking data
- Going deep: models for continuous-time within-play valuation of game outcomes in American football with tracking data