Riding a probabilistic support vector machine to the Stanley Cup

Simon Demers

doi:10.1515/jqas-2014-0093

Abstract:

The predictive performance of various team metrics is compared in the context of 105 best-of-seven National Hockey League (NHL) playoff series that took place between 2008 and 2014 inclusive. This analysis provides renewed support for traditional box score statistics such as goal differential, especially in the form of Pythagorean expectations. A parsimonious relevance vector machine (RVM) learning approach is compared with the more common support vector machine (SVM) algorithm. Despite the potential of the RVM approach, the SVM algorithm proved to be superior in the context of hockey playoffs. The probabilistic SVM results are used to derive playoff performance expectations for NHL teams and to identify playoff under-achievers and over-achievers. The results suggest that the Arizona Coyotes and the Carolina Hurricanes can both be considered Round 2 over-achievers, while the Nashville Predators would be Round 2 under-achievers, even after accounting for several observable team performance metrics and playoff predictors. The Vancouver Canucks came the closest to qualifying as Stanley Cup Finals under-achievers after they lost to the Boston Bruins in 2011. Overall, the results tend to support the idea that the NHL fields extremely competitive playoff teams, that chance or other intangible factors play a significant role in NHL playoff outcomes, and that playoff upsets will continue to occur regularly.
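For readers unfamiliar with the Pythagorean expectation mentioned in the abstract, the following is a minimal sketch of the general formula, which estimates a team's expected win fraction from goals scored and goals allowed. The function name and the default exponent of 2 (the classic baseball form) are assumptions made for illustration; the abstract does not state which exponent the article uses.

```python
def pythagorean_expectation(goals_for: float,
                            goals_against: float,
                            exponent: float = 2.0) -> float:
    """Expected win fraction: GF^k / (GF^k + GA^k).

    The exponent k = 2 is an assumed default (the classic baseball
    form), not a value taken from the article.
    """
    if goals_for < 0 or goals_against < 0:
        raise ValueError("goal totals must be non-negative")
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    if gf + ga == 0:
        raise ValueError("at least one goal total must be positive")
    return gf / (gf + ga)


# Hypothetical season totals, for illustration only.
expected = pythagorean_expectation(250, 200)
```

A team that scores exactly as many goals as it allows gets an expectation of 0.5 for any exponent, which is why the metric behaves like a nonlinear transform of goal differential.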

Keywords: Elo; machine learning; NHL playoffs; relevance vector; support vector

Corresponding author: Simon Demers, Vancouver Police Department – Planning, Research and Audit Section, Vancouver, British Columbia, Canada, Tel.: +604-999-5380, e-mail: simdem@outlook.com


Supplemental Material:

The online version of this article (DOI: 10.1515/jqas-2014-0093) offers supplementary material, available to authorized users.

Published Online: 2015-9-19

Published in Print: 2015-12-1
