Contributions of Carl Morris in sports analytics, a memorium

Jim Albert

doi:10.1515/jqas-2023-0092

Published/Copyright: January 8, 2024

From the journal Journal of Quantitative Analysis in Sports Volume 20 Issue 1

Abstract

Carl Morris (1938–2023) was well known for his pioneering research in Bayesian multiparameter inference and prediction. Morris was also known for his development of statistical thinking and methodology in sports. This paper provides an overview of Morris’ contributions in sports. This includes Morris’ experience in sports as a youth, summaries of some of Morris’ best-known contributions using sports data, his influence working with students, and some of Morris’ thinking about the interplay of statistics and sports.

Keywords: baseball; tennis; hierarchical Bayes; empirical Bayes

1 Beginnings: growing up in San Diego

Carl Morris in Morris (2014) describes his early experience with sports growing up in San Diego. He enjoyed playing softball and touch football as a youth and later tennis as a teen and an adult. He regularly read the daily baseball box scores where he got introduced to baseball statistics. In the 1950s, there was no local professional Major League Baseball (MLB) team, but he was able to watch the San Diego Padres in the Pacific Coast League. In the early days of integration of MLB, he was able to see some of the early great black players such as Larry Doby, Minnie Minoso and Luke Easter who played in San Diego. Like the author, Morris played the baseball simulation game All-Star Baseball described in Chapter 1 of Albert and Bennett (2003). This game was based on spinners, where the probabilities of outcomes of a plate appearance for a specific player were represented by areas on the spinner.

Morris (2014) remembers that when he first heard of statistics, he thought the entire field was Allan Roth, the Brooklyn Dodger statistician. Roth, hired by Branch Rickey, was the first statistician to work full-time for a major league team. Allan Roth was mentioned by Rickey in his famous article in Life Magazine (Rickey 1954). At that time, Morris realized that wanting to be a statistician was like wanting to be a movie star. Later he understood that first impression was not correct – there were other avenues in the field of statistics.

2 The most important point in tennis

Since Carl Morris was an avid tennis player, it seems appropriate to first describe one of his early statistics papers focused on tennis. In any game, it is important to identify instances during the game that are especially important towards the ultimate goal of winning. A tennis match consists of a group of sets, where each set consists of a group of games, and a game consists of a group of points. Morris (1977) addresses an interesting question: what are the most important points in a game? Morris gives some of the common answers. Some people would say it is the first point since it’s important to get a good start. Since Morris was an experienced tennis player, he might intuitively think the deuce and advantage points were important in a game. Morris wanted a more precise measure of importance and so he formally defined importance to be the difference between two conditional probabilities, the probability the server wins the game given he wins the point and the probability the server wins the game given he loses the point. Since this is a two-person zero-sum game, each point is equally important to the server and the receiver.

Morris uses probabilistic reasoning to address this issue. Successive points are assumed to be independent, and the probability that the server wins a point is p. In professional tennis, the server win probability is about 0.6. For values of p > 0.6, Morris finds that the most and least important points in a game are 30-40 and 40-0, respectively. Morris continues this discussion by defining the most important game in a set and the most important set in a match.
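
The importance calculation can be sketched in a few lines of R. This is a minimal illustration under the independence assumption stated above, not a reproduction of the derivations in Morris (1977); the function names `game_win_prob` and `point_importance` and the value p = 0.65 are illustrative choices.

```r
# Probability the server wins the game from a score of a points to b points
# (0, 1, 2, 3 stand for 0, 15, 30, 40), given point-win probability p.
game_win_prob <- function(a, b, p) {
  if (a == 4) return(1)                                   # server has won
  if (b == 4) return(0)                                   # receiver has won
  if (a == 3 && b == 3) return(p^2 / (p^2 + (1 - p)^2))   # deuce
  p * game_win_prob(a + 1, b, p) + (1 - p) * game_win_prob(a, b + 1, p)
}

# Importance of a point: P(win game | win point) - P(win game | lose point).
# From deuce, "advantage server" is equivalent to 40-30 and
# "advantage receiver" is equivalent to 30-40.
point_importance <- function(a, b, p) {
  if (a == 3 && b == 3) {
    return(game_win_prob(3, 2, p) - game_win_prob(2, 3, p))
  }
  game_win_prob(a + 1, b, p) - game_win_prob(a, b + 1, p)
}

p <- 0.65                                   # illustrative server point-win probability
labels <- c("0", "15", "30", "40")
grid <- expand.grid(a = 0:3, b = 0:3)       # all scores, with 40-40 meaning deuce
grid$score <- paste(labels[grid$a + 1], labels[grid$b + 1], sep = "-")
grid$importance <- mapply(point_importance, grid$a, grid$b,
                          MoreArgs = list(p = p))

most_important  <- grid$score[which.max(grid$importance)]
least_important <- grid$score[which.min(grid$importance)]
print(grid[order(-grid$importance), c("score", "importance")])
```

Sorting the table by importance makes it easy to compare a chosen p against the ordering Morris reports.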

Morris makes some interesting comments in the conclusion of this article. Points played on the ad side of the court have higher average importance. So, for example, doubles teams are advised to have their stronger player receive on the ad court. Also, since a tennis match tends to have more odd-numbered service games, the player who serves first will serve under less pressure.

The issue of important, or so-called high-leverage, points continues to be an important subject in sports. Klaassen and Magnus (2014) and Kovalchik (2016) discuss important points in tennis. Using Morris’ definition, Kovalchik compares the actual point importance values with the expected values from the probability model. Currently the ATP has an “under pressure” leaderboard that records the percentage of tie breaks and deciding sets won by each player. Baumer et al. (2023) describe the general use of win probabilities in sports as one of the big ideas in sports analytics.

3 Shrinking batting averages

3.1 Baseball illustration of a two-stage Bayesian model

As motivation for multilevel modeling, Morris (1998) illustrates the familiar Bayesian two-stage model in a baseball setting. Suppose y₁ and y₂ represent, respectively, the observed winning fractions of a Major League baseball team in the first and following seasons. Let p_j denote the true winning fraction for the team in year j and assume that the team’s ability does not change between seasons, so p₁ = p₂ = p. Morris assigns a prior for p based on his beliefs about the location of p. Using familiar calculations for this two-stage Bayesian model, he illustrates the computations of the posterior for p and the predictive distribution for y₂ given an observed y₁. From this work, he computes the probability the team does worse in the second season, P(y₂ < y₁), to be 0.74 for a team that initially wins 58.5 % of its games. This calculation demonstrates the regression to the mean phenomenon. Interestingly, this probability is approximately equal to the fraction of division winners who declined the following season in data from the 1987 to 1996 seasons.
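
A rough R sketch of this kind of calculation follows. It uses normal approximations for both stages. The prior mean of 0.5, the prior standard deviation of 0.05, the 162-game season, and the function name `prob_decline` are all assumptions made for illustration; they are not the prior or the computations used in Morris (1998).

```r
# Two-stage normal sketch:
#   y_j | p ~ N(p, V), j = 1, 2,  with V = 0.25 / n_games (binomial variance near p = 0.5)
#   p ~ N(prior_mean, prior_sd^2)            (hypothetical prior)
prob_decline <- function(y1, n_games = 162, prior_mean = 0.5, prior_sd = 0.05) {
  V <- 0.25 / n_games                  # approximate sampling variance of a winning fraction
  A <- prior_sd^2                      # prior variance of the true winning fraction
  B <- V / (V + A)                     # fraction of shrinkage toward the prior mean
  post_mean <- prior_mean + (1 - B) * (y1 - prior_mean)   # E(p | y1)
  post_var  <- V * (1 - B)                                # Var(p | y1)
  pred_sd   <- sqrt(post_var + V)                         # sd of y2 given y1
  pnorm(y1, mean = post_mean, sd = pred_sd)               # P(y2 < y1 | y1)
}

prob <- prob_decline(0.585)
print(prob)
```

Because the posterior mean is pulled toward the prior mean, a team with an above-average first season has a predictive probability above one half of doing worse the next season, which is the regression to the mean effect described above.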

3.2 The data

The Efron and Morris baseball dataset is one of the most famous “small datasets” in statistics and is commonly used to illustrate the benefits of multilevel modeling. Gelman (2023) discusses the benefits of using this type of stylized data example in teaching statistics.

Carl Morris and Brad Efron were classmates at the undergraduate level at Cal Tech and reunited as graduate students at Stanford. They wanted to write a joint paper on Stein estimation with the intent of showing the value of this estimation procedure on real data. Both Morris and Efron were baseball fans and so baseball was a convenient source of data for this study. One could obtain measures of performance early in the season, and then could observe how these players performed for the season’s remainder.

To test Stein’s procedure, they wanted to start with an extreme performer, that is, a hitter who was much better than the rest. So they chose the Hall of Famer Roberto Clemente who was 18 for 45 for a 0.400 batting average in the first few weeks into the 1970 season. Then they found 17 other players with exactly 45 at-bats. The Stein estimator required approximate normality and that the observations had equal sampling variances which explains why they chose players who had the same number of at-bats.

Data was collected from reading microfilms of the New York Times sports pages. (The author recalls getting excited about the publication of batting and pitching statistics for all MLB players that were published in the Sunday newspaper.)

3.3 The Scientific American paper

Efron and Morris (1975, 1977) provide a clear demonstration of the benefits of the partial pooling character of the James–Stein estimator using this baseball dataset. The goal is to predict the “rest of the season” batting averages for these 18 players given their batting averages in the first 45 at-bats.

The James–Stein estimate has the general form

$$\hat{\theta}_i = \bar{y} + c\,(y_i - \bar{y}),$$

where the ith batting average y_i is shrunk towards the overall average ȳ and the shrinking fraction c is given by

$$c = 1 - \frac{(k - 3)\,\sigma^2}{\sum_i (y_i - \bar{y})^2},$$

where k is the number of means and σ² is the common sampling variance of the y_i.
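
The estimator is simple to compute. The R sketch below implements the two formulas above; the function name `james_stein` is hypothetical, the data are simulated rather than the Efron and Morris data, and the sampling variance is approximated by a binomial variance at the overall average (Efron and Morris instead worked with transformed averages).

```r
# James-Stein estimate: shrink each y_i toward the overall mean by the fraction c.
james_stein <- function(y, sigma2) {
  k <- length(y)
  ybar <- mean(y)
  S <- sum((y - ybar)^2)
  c_shrink <- 1 - (k - 3) * sigma2 / S     # shrinking fraction c from the text
  list(shrink = c_shrink, estimate = ybar + c_shrink * (y - ybar))
}

# Simulated illustration (not the 1970 data): 18 players, 45 at-bats each.
set.seed(1)
k <- 18
n <- 45
true_p <- rnorm(k, mean = 0.265, sd = 0.02)   # hypothetical true abilities
y <- rbinom(k, size = n, prob = true_p) / n   # observed early-season averages
sigma2 <- mean(y) * (1 - mean(y)) / n         # approximate common sampling variance

js <- james_stein(y, sigma2)
print(round(cbind(observed = y, james_stein = js$estimate), 3))
```

When the observed averages vary little more than sampling noise would explain, c is small and the estimates sit close to the grand mean; this simple form does not truncate c at zero.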

Figure 1 from Efron and Morris (1977) illustrates the dramatic shrinkage of the observed averages towards the grand mean. For these data, the shrinkage of the James–Stein estimates is 80 per cent.

Figure 1: Shrinkage graph from Efron and Morris (1977).

Efron and Morris demonstrate that these estimates are superior to the raw batting averages in predicting the “rest of the season” batting averages using a sum of squared errors criterion. For this example, the James–Stein estimate has a total squared error of 0.022 compared with a total squared error of 0.077 for the observed averages. Note that the James–Stein estimator is an attractive, quick way to implement Bayesian shrinkage without the computational cost of implementing a full Bayesian multilevel model.

Efron and Morris make several insightful comments about this procedure.

The James–Stein estimate is a type of adaptive estimate. If the data support the assumption that the underlying abilities are equal, the method shrinks the observed averages strongly towards the mean. On the other hand, if the underlying abilities are very different, the size of the shrinkage will be limited.
In the case when there are unequal sample sizes (here, different number of at-bats) for the groups, the shrinkage will be greatest for the groups with small sample sizes. This is illustrated in the Ty Cobb example that follows.
These estimates are helpful for other inferences, say determining the ordering of the underlying means.
The estimates are similar to Bayesian estimates that shrink the observed average towards a prior mean. One attractive feature of this approach is that one does not need to input knowledge of the prior or the belief that the means are normally distributed.

Morris revisited this baseball averages dataset in Morris (1983). He illustrated the use of empirical Bayes (EB) confidence intervals to predict the final season batting averages of 18 baseball players given their averages in the first 45 at-bats. The EB intervals displayed in Morris (1983) are reproduced in Figure 2. The circled points represent the early season and final batting averages for the 18 players. The solid lines represent the classical confidence bands and the dashed lines represent the EB confidence bands based on the multilevel model. Morris notes that the EB intervals are much shorter than the classical intervals. In addition, all 18 final averages are in the EB intervals while 17 out of the 18 are in the classical intervals. An extreme individual batting average such as 0.400 produces a wide classical interval that doesn’t account for the knowledge that all other observed batting averages are smaller than 0.400.

Figure 2: Empirical Bayes confidence interval display from Morris (1983).

3.4 Modern perspective

Using modern terminology, the Efron and Morris dataset can be fit using a two-stage Bayesian multilevel model. The number of hits of the ith batter y_i is binomial with sample size 45 and hit probability p_i, i = 1, …, 18. One can represent the hit probabilities by use of the logistic model

$$\log \frac{p_i}{1 - p_i} = \mu + \gamma_i,$$

where μ is an average logit probability and the random effects γ₁, …, γ₁₈ are distributed N(0, σ), with σ denoting the standard deviation. The second stage hyperparameters μ, σ are assigned weakly informative priors.

Using the brms package interface (Bürkner 2017) to the Stan programming language, one can fit this model by use of the brm() function where one indicates that the observations are binomial data and the random effects model is represented by use of the (1 | player) syntax in the model description.

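    library(brms)

    # `baseball` is assumed to be a data frame with one row per player and
    # columns `hits` (hits in the first 45 at-bats) and `player` (identifier).
    fit_z <-
      brm(data = baseball,
          family = binomial,
          hits | trials(45) ~ 1 + (1 | player),
          prior = c(prior(normal(0, 1.5), class = Intercept),
                    prior(normal(0, 1.5), class = sd)))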

Using this MCMC fitting, one can obtain a simulated sample of the joint posterior of the parameters p₁, …, p₁₈, μ, and σ. If one makes reasonable assumptions for the number of at-bats in the remainder of the season, it is straightforward to obtain prediction intervals for the batting averages in the remainder of the season from the posterior predictive distribution. In an exercise, the author found 95 % prediction intervals for the final season batting averages and found that they are similar in location to the empirical Bayes confidence intervals found in Morris (1983).
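
The prediction step can be sketched in base R once posterior draws of the hit probabilities are available. In the sketch below the matrix `p_draws` is a simulated stand-in for the MCMC output (it is not produced by the model above), and the 400 remaining at-bats per player is an assumed value; both are illustrative only.

```r
set.seed(2024)
n_draws <- 4000
n_players <- 18

# Stand-in for posterior draws of p_1, ..., p_18 (one column per player).
# In practice these would be taken from the fitted multilevel model.
p_draws <- matrix(rbeta(n_draws * n_players, 70, 195), nrow = n_draws)

n_rest <- 400   # assumed number of at-bats in the remainder of the season

# Posterior predictive draws: for each posterior draw of p_i, simulate the
# remaining-season hits and convert them to a batting average.
hits_rest <- matrix(rbinom(length(p_draws), size = n_rest, prob = p_draws),
                    nrow = n_draws)
avg_rest <- hits_rest / n_rest

# 95 % prediction interval for each player's rest-of-season average.
pred_int <- t(apply(avg_rest, 2, quantile, probs = c(0.025, 0.975)))
print(round(pred_int, 3))
```

The intervals reflect both the posterior uncertainty in each p_i and the binomial sampling variation in the remaining at-bats.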

4 Has there ever been a true 0.400 hitter?

Morris (1983) provides a general framework for an empirical Bayes viewpoint on hierarchical regression models, including results on interval estimates that cover the case where the sampling variances across groups are not equal.

To illustrate these methods, Morris looked at the batting averages of Ty Cobb, one of the greatest hitters in MLB. Figure 3 displays a scatterplot of Ty Cobb’s batting averages for all of his seasons from 1905 through 1928. The figure adds a quadratic smoothing curve that is a reasonable fit to these data.

Figure 3: Scatterplot of Ty Cobb’s batting averages.

Figure 3 shows that Cobb had an average exceeding 0.400 in the 1911, 1912 and 1922 seasons. But these averages don’t directly display Cobb’s talent. For example, Cobb’s 0.419 average in the 1911 season is influenced both by his hitting talent and the sampling variation due to other factors such as the fielding, pitching and ballpark characteristics. Morris raises the interesting question: “Did Cobb’s batting ability exceed 0.400 sometime during his career?”

Let y_j denote the number of hits of Cobb in n_j at-bats in the jth season. Assume that y_j is binomial with sample size n_j and probability p_j. We can think of p_j as representing Cobb’s true ability of getting a hit in that particular season.

Morris assumes the means of the p_j, the η_j, follow the quadratic model

$$\eta_j = \beta_0 + \beta_1 j + \beta_2 j^2.$$

Morris applied normal approximations and empirical Bayes estimates of the second-stage parameters to get posterior estimates of the p_j. Currently it is straightforward to fit a fully Bayesian hierarchical model where the second-stage parameters are assigned weakly informative distributions. Figure 4 shows the observed, regression and multilevel Bayesian estimates of the hitting probabilities.
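
The regression component of this model can be illustrated with a weighted least squares fit of the quadratic curve to seasonal averages. The R sketch below uses simulated seasons, not Cobb’s actual record; the number of seasons, the at-bat counts, and the coefficients used to generate the data are all hypothetical.

```r
set.seed(11)
n_seasons <- 24                       # hypothetical number of seasons
j <- seq_len(n_seasons)
at_bats <- rep(500, n_seasons)        # hypothetical at-bats per season

# Hypothetical quadratic curve for the mean ability, on the probability scale.
eta_true <- 0.34 + 0.008 * j - 0.0004 * j^2
hits <- rbinom(n_seasons, size = at_bats, prob = eta_true)
avg <- hits / at_bats                 # observed seasonal batting averages

# Weighted least squares estimate of eta_j = beta_0 + beta_1 j + beta_2 j^2,
# weighting each season by its number of at-bats.
fit_quad <- lm(avg ~ j + I(j^2), weights = at_bats)
eta_hat <- fitted(fit_quad)
print(round(coef(fit_quad), 4))
```

A multilevel estimate of each p_j would then fall between the observed average and the fitted curve value, with the degree of shrinkage depending on the season’s at-bats.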

Figure 4: Observed, regression and multilevel Bayesian estimates for Ty Cobb example.

Morris asked if Cobb’s true batting average p_j exceeds 0.400 for any season. One can address this question by examining the posterior density of the maximum batting probability p = max{p₁, …, p₂₃}. From the MCMC fitting, one computes that the probability that this maximum exceeds 0.400 is 0.761, which agrees closely with Morris’ result. The takeaway is that it is likely that Cobb was a true 0.400 hitter.
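
Given a matrix of posterior draws, this probability is a one-line summary. In the R sketch below `p_draws` is a simulated stand-in for the MCMC output, with hypothetical posterior centers and spread, so the resulting number is illustrative and is not the 0.761 reported above.

```r
set.seed(7)
n_draws <- 4000
n_seasons <- 23

# Stand-in posterior draws: one row per MCMC draw, one column per season.
center <- seq(0.33, 0.39, length.out = n_seasons)   # hypothetical posterior means
p_draws <- matrix(rnorm(n_draws * n_seasons,
                        mean = rep(center, each = n_draws), sd = 0.015),
                  nrow = n_draws)

# For each draw, take the largest seasonal hitting probability, then estimate
# P(max_j p_j > 0.400) by the fraction of draws exceeding 0.400.
p_max <- apply(p_draws, 1, max)
prob_400 <- mean(p_max > 0.400)
print(prob_400)
```

Working with the maximum draw by draw accounts for the joint posterior uncertainty across seasons, which the season-by-season point estimates alone do not.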

5 Other sports contributions

Although these particular papers with sports examples may be the most cited, Morris was actively working on statistical problems for a variety of sports throughout his career. Here are some brief descriptions of several of these problems.

A tennis scoring system. Morris developed a box score to track the statistics of serves and of receiving, long points, etc. in a tennis game. He had plans to make this system available for players via an electronic scorekeeper worn on players’ wrists that retains point-by-point data to be analyzed.
Chance of comebacks in an NBA game. On December 21, 2009, the Sacramento Kings came back to win against the Chicago Bulls after being behind by 35 points. In an HSAC blog post, Morris describes how he computes the chance of this event to be approximately 1 out of 24,000.
Williams/Clijsters tennis match. Serena Williams lost a semifinal US Open tennis match against Kim Clijsters partially due to a controversial foot-fault call. In a Wall Street Journal article, Carl Morris estimates that this one particular episode only contributed to 8 % of Williams’ loss.
Runs per game formula in baseball. Morris had his own formula, Runs per Game (RPG), for determining the runs value of different batting lineups. Using this measure, Morris claimed in an ESPN article that the 2002 Barry Bonds was the best offensive hitter of all time. A team of nine Bonds would generate 22.4 runs per game. Morris (2015) used this measure to show that Pete Rose’s career hitting performance is not as strong as many other historical players who are not currently in the Baseball Hall of Fame.
Are Game 7’s more common in World Series? An Inside Science article describes Morris’ observations about the unusual occurrence of seven-game series in the World Series between 1952 and 1977.
Predicting Overtime with the Pythagorean Formula. Rosenfeld et al. (2010) extend the famous Bill James Pythagorean formula to model the results of close games in professional baseball, basketball and football.
Improving Major League Baseball Park Factor Estimates. It is well known that baseball performance measures are very sensitive to ballpark effects. Acharya et al. (2008), written by Morris and a group of students from HSAC, develop an improved estimate of park effects that corrects the biases found in the common ESPN park effects measures.
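
For readers unfamiliar with the Pythagorean formula mentioned in the list above, its classic form estimates a team’s winning fraction from runs scored and runs allowed. The R sketch below shows only that basic form with the traditional exponent of 2; it is not the extension developed by Rosenfeld et al. (2010), and the function name and the run totals are hypothetical.

```r
# Classic Pythagorean expectation: scored^x / (scored^x + allowed^x).
pythagorean_wpct <- function(scored, allowed, exponent = 2) {
  scored^exponent / (scored^exponent + allowed^exponent)
}

# Hypothetical team that scores 800 runs and allows 700 over a season.
wpct <- pythagorean_wpct(800, 700)
print(round(wpct, 3))
```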

6 Harvard Sports Analysis Collective

The famous statistician Frederick Mosteller had a passion for sports – Mosteller (1952) contains a famous World Series study and a review of some of his sports work is given in Mosteller (1997). Morris participated in a small sports seminar at Harvard organized by Mosteller. Following Mosteller’s lead, Morris founded the undergraduate sports club, the Harvard Sports Analysis Collective (HSAC) in 2006. This group, still active in 2023, has explored statistical issues for a number of sports. As stated in a post from the HSAC website, this group wouldn’t have existed without Morris’ dedication, mentorship, and support. Some of HSAC’s current and past leaders have moved on to analytics careers with professional sports teams and the club is always recruiting new members.

As described in Morris (2014), HSAC students have worked on issues including team rankings, player evaluations, salary predictions, injuries, draft analyses, and commentaries on published sports analyses. The students support each other with data sets and by sharing ideas in weekly meetings.

One can learn about Carl Morris’ impact on HSAC from testimonials by students who participated in the club. Daniel Adler, HSAC co-president 2009–2010, writes: My career and life would not be the same without Carl Morris. Though my statistical acumen is surpassed by legitimate statisticians and data scientists, Carl helped equip me with knowledge to understand important statistical concepts and how they apply to sports. I sharpened my statistical and rhetorical skills during HSAC meetings thanks to Carl. He was a true mentor and friend. Carl Morris’s HSAC tree includes many of the pioneers in sports research, including analysts to assistant GM’s. (Adler is currently Assistant General Manager for the Minnesota Twins baseball team.) Carl spent countless hours in various house common rooms and adding his informed perspective to heated sports debates.

Ben Blatt, HSAC member writes: Despite his qualifications and abilities he was patient and generous enough to lend his expertise to us undergraduate students in the Harvard Sports Analysis Collective dreaming up theories on sports that we did not even have the know-how to test for. Many of us had not even taken a full year of freshman statistics classes when he was willing to give feedback and push in the right direction. Personally, many great paths I have been able to travel down in my life would not have been possible without Carl. Since college I have been fortunate enough to work both for an NFL team and also as a reporter for major newspapers about sports statistics. Carl was sharp, both in data and humor. He was not the loudest member of HSAC sitting around Winthrop’s Owen Room but he was the one most able to shift discussion with a single pointed observation.

Kevin Rader, HSAC member writes: Carl’s kindness and curiosity was infectious. He took a band of students and others from the Harvard Stat dept. to see a premier showing of Moneyball by Boston Common. I loved experiencing the inner-working of a field he helped pioneer with him. The first conversation I had with Carl was a deep philosophical discussion about parametric versus nonparametric analysis within survival analysis. He noticed I struggled a bit finding the words, and so eased my stress by turning the conversation towards baseball. He loved chatting sports analytics with Harvard students, and attended HSAC meetings even as an Emeritus faculty adviser.

7 General comments about the interplay of sports and statistics

In the Morris (2014) interview, Morris makes some interesting comments about the usefulness of sports in exploring statistical concepts.

In teaching statistics, it is important to have good data sets, those that people are familiar with and for which the relevant questions are apparent. For those who are especially interested in sports, sports data can help a lot. Actually, any application a student appreciates and that shows how statistical thinking works is a strong motivator for learning. It really facilitates learning if we show students how statistics helps them understand topics they care about.
Morris used sports data so that he could think about both the application and the theory. In that way, not only does theory enhance the application, but knowledge of the application suggests the appropriate theory, and raises better questions. Working on applications challenges one to not sweep real issues under the rug. With authentic data, we pay more attention to the assumptions we make, whether they are valid, and how we can develop better methods and models that reflect the realities of the data and the science we’re investigating.
Sports data challenges us to work with Big Data.
Statisticians who understand a sport well can use that to learn about how people interpret and use statistics and probability in sports. Such opportunities arise when we hear, read, or watch discussions about sports analyses and get the thoughts of team leaders, members of the media, and of fans. This gives us a lens into the thinking processes of bright, motivated people who don’t share our inferential training. We get to see the smart things they say and their statistical flaws. Such people often raise interesting questions that can be analyzed in principled ways and that sometimes give us new ideas to analyze.

Corresponding author: Jim Albert, Department of Mathematics and Statistics, Bowling Green State University, Bowling Green OH 43403, USA, E-mail: albert@bgsu.edu

Research ethics: Not applicable.
Author contributions: Jim Albert accepted responsibility for the entire content of this manuscript and approved its submission.
Competing interests: Jim Albert states no conflict of interest.
Research funding: Not applicable.
Data availability: Not applicable.

References

Acharya, R.A., Ahmed, A.J., D’Amour, A.N., Lu, H., Morris, C.N., Oglevee, B.D., Swift, R.N., and Peterson, A.W. (2008). Improving major league baseball park factor estimates. J. Quant. Anal. Sports 4, https://doi.org/10.2202/1559-0410.1108.

Albert, J. and Bennett, J. (2003). Curve ball, 2nd ed. Springer, New York, NY.

Baumer, B., Matthews, G., and Nguyen, Q. (2023). Big ideas in sports analytics and statistical tools for their investigation. WIREs Comput. Stat., https://doi.org/10.1002/wics.1612.

Bürkner, P.-C. (2017). brms: an R package for Bayesian multilevel models using Stan. J. Stat. Software 80: 1–28. https://doi.org/10.18637/jss.v080.i01.

Efron, B. and Morris, C. (1975). Data analysis using Stein’s estimator and its generalizations. J. Am. Stat. Assoc. 70: 311–319. https://doi.org/10.1080/01621459.1975.10479864.

Efron, B. and Morris, C. (1977). Stein’s paradox in statistics. Sci. Am. 236: 119–127. https://doi.org/10.1038/scientificamerican0577-119.

Gelman, A. (2023). When did the use of stylized data examples become standard practice in statistical research and teaching? Available at: https://statmodeling.stat.columbia.edu/2023/05/04/when-did-the-use-of-stylized-data-examples-become-standard-practice-in-statistical-research-and-teaching/.

Klaassen, F. and Magnus, J. (2014). Analyzing wimbledon. Oxford Academic Press, New York, NY. https://doi.org/10.1093/acprof:oso/9780199355952.001.0001.

Kovalchik, S. (2016). Klaassen & Magnus’s 22 myths of tennis – myth 3. Stats on the T, Available at: http://on-the-t.com/2016/03/05/klaassen-magnus-hypothesis-3/.

Morris, C. (1977). The most important points in tennis. Optim. Strat. Sport 131–140.

Morris, C.N. (1983). Parametric empirical Bayes inference: theory and applications. J. Am. Stat. Assoc. 78: 47–55. https://doi.org/10.1080/01621459.1983.10477920.

Morris, C.N. (1998). Multiplicity, Bayes and hierarchical models. Stat. in Sports 277–291.

Morris, C.N. (2014). Interview with Carl Morris. Chance 27: 17–24. https://doi.org/10.1080/09332480.2014.965627.

Morris, C. (2015). Comparing sports performances using counts and rates. Presented at the 2015 New England Symposium on Statistics in Sports.

Mosteller, F. (1952). The world series competition. J. Am. Stat. Assoc. 47: 355–380. https://doi.org/10.1080/01621459.1952.10501178.

Mosteller, F. (1997). Lessons from sports statistics. Am. Statistician 51: 305–310. https://doi.org/10.2307/2685896.

Rickey, B. (1954). Goodby to some old baseball ideas. Life 2: 78–89.

Rosenfeld, J.W., Fisher, J.I., Adler, D., and Morris, C. (2010). Predicting Overtime with the pythagorean formula. J. Quant. Anal. Sports 6: 1. https://doi.org/10.2202/1559-0410.1244.

Received: 2023-10-13

Accepted: 2023-10-23

Published Online: 2024-01-08

Published in Print: 2024-03-25
