Abstract
Previously published statistical analyses of NCAA Division I Men’s Tournament (“March Madness”) game outcomes have revealed that the relationship between tournament seed and the time-aggregated number of third-round (“Sweet 16”) appearances for the middle half of the seeds exhibits a statistically and practically significant departure from monotonicity. In particular, the 8- and 9-seeds combined appear less often than any one of seeds 10–12. In this article, we show that a similar “middle-seed anomaly” also occurs in the NCAA Division I Women’s Tournament but does not occur in two other major sports tournaments that are similar in structure to March Madness. We offer explanations for the presence of a middle-seed anomaly in the NCAA basketball tournaments, and its absence in the others, that are based on the combined effects of the functional form of the relationship between team strength and seed specific to each tournament, the degree of parity among teams, and certain elements of tournament structure. Although these explanations account for the existence of middle-seed anomalies in the NCAA basketball tournaments, their larger-than-expected magnitudes, which arise mainly from the overperformance of seeds 10–12 in the second round, remain enigmatic.
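The role of tournament structure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes a standard 16-team regional bracket, a Bradley–Terry win probability, and purely hypothetical strength values. The names `PODS`, `win_prob`, `sweet16_probabilities`, and `has_middle_seed_anomaly` are invented for this example. The sketch computes each seed's probability of reaching the third round and checks the abstract's criterion, with probabilities standing in for the aggregated appearance counts.

```python
import math
from typing import Dict, List, Tuple

# Assumed standard regional bracket: each pod holds the two first-round
# pairings whose winners meet in the second round.
PODS: List[Tuple[Tuple[int, int], Tuple[int, int]]] = [
    ((1, 16), (8, 9)),
    ((4, 13), (5, 12)),
    ((3, 14), (6, 11)),
    ((2, 15), (7, 10)),
]


def win_prob(strength: Dict[int, float], a: int, b: int) -> float:
    """Bradley-Terry probability that seed a beats seed b."""
    return strength[a] / (strength[a] + strength[b])


def sweet16_probabilities(strength: Dict[int, float]) -> Dict[int, float]:
    """Probability that each seed wins its first two games."""
    probs: Dict[int, float] = {}
    for game_a, game_b in PODS:
        for game, other in ((game_a, game_b), (game_b, game_a)):
            x, y = other
            p_x = win_prob(strength, x, y)  # chance x wins the other game
            for team in game:
                rival = game[1] if team == game[0] else game[0]
                p_first = win_prob(strength, team, rival)
                # Second-round opponent is x or y, weighted by who advances.
                p_second = (p_x * win_prob(strength, team, x)
                            + (1.0 - p_x) * win_prob(strength, team, y))
                probs[team] = p_first * p_second
    return probs


def has_middle_seed_anomaly(probs: Dict[int, float]) -> bool:
    """True if seeds 8 and 9 combined fall below each of seeds 10-12."""
    return probs[8] + probs[9] < min(probs[10], probs[11], probs[12])


if __name__ == "__main__":
    # Hypothetical strengths (an assumption, not estimated from any data):
    # log-strength decreases linearly with seed.
    strength = {seed: math.exp(-0.3 * seed) for seed in range(1, 17)}
    probs = sweet16_probabilities(strength)
    for seed in range(8, 13):
        print(seed, round(probs[seed], 4))
    print("anomaly:", has_middle_seed_anomaly(probs))
```

Changing the hypothetical strength curve (for example, making the gap between the 1-seed and the middle seeds larger or smaller) shows how the functional form of strength versus seed and the degree of parity interact with the bracket, in which the 8/9 winner meets the 1/16 winner while seeds 10–12 meet the winner of a 2–4 versus 13–15 game.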
- Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
- Research funding: None declared.
- Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
© 2021 Walter de Gruyter GmbH, Berlin/Boston