
Pulling Back the Curtain on the Federal Class Action

Jonah B. Gelbach and Deborah R. Hensler
Published/Copyright: April 7, 2025

Abstract

Practitioners, judges, and scholars have long debated numerous aspects of class litigation policy, with many disagreements involving competing empirical assertions. Astonishingly, given the long-running debate about the virtues and vices of class actions, we lack basic information about the federal class action caseload that would allow us to assess these claims, largely because the federal judiciary does not report even the most basic facts about class action litigation. Despite the critical role of class certification in the design of Rule 23’s scheme for class litigation, for example, we do not know how many cases have had class certification motions filed – not over time, not by the nature of suit, and not even in any particular year. What evidence we do have about class actions is fragmented in one way or another, e.g., involving only securities cases (which, unlike other cases, are governed in part by the PSLRA), or involving only cases with proposed settlements, as discussed in Fitzpatrick (2024). The lack of basic information about the federal class action universe facilitates a policy reform debate that enables contending parties to endlessly argue about the virtues and vices of the class action procedure without needing to account for how the procedure works in practice. The debate takes place in public while the class action process itself proceeds behind a curtain of ignorance. The goal of this paper is to start pulling back that curtain by reporting some basic information about federal class action filings and key events in the class action litigation process. Using docket entry text for cases filed between January 1, 2005, and December 31, 2014, we report some basic statistics regarding (1) how many putative class action cases there are, (2) how many have motions for class certification filed, (3) how many have an order on class certification, and (4) the timing of these events. 
We focus on certification because it is the key event in the life of class litigation: without it, the outcome of the litigation cannot bind the proposed class. Not surprisingly then, claims about certification rates have been central to policy debate about class actions. But except in limited circumstances, these claims have not previously been subject to rigorous analysis. Our approach is based on using a fine-tuned large language model to classify docket entries, as other recent work has done. We believe this will be a fruitful approach for future work in the area.


Corresponding author: Jonah B. Gelbach, Herman F. Selvin Professor of Law, UC Berkeley School of Law, Berkeley, CA, USA; and Non-Resident Fellow, The Rhode Center, Stanford Law School, Stanford, CA, USA, E-mail:
We are grateful to Li Huang, Nicolas Torres-Echeverry, and Rey Song for excellent research, as well as to the Oscar M. Ruebhausen Fund at the Yale Law School, which provided funding to acquire the data used here, and Bill Eskridge for his help in acquiring the data. This paper was prepared as part of the Conference in Honor of Deborah Hensler held at Stanford Law School on September 20, 2024.

Appendix: Data Pipeline

A Initial Selection of Putative Class Action Cases

To identify cases with putative class action activity, we selected a random sample of 600 cases that had any docket entry that included text we thought would be associated with a class action (e.g., the presence of the phrases “MOTION TO CERTIFY CLASS”, “MOTION FOR CLASS CERTIFICATION”, “CLASS ACTION COMPLAINT”, or “CLASS COMPLAINT”). For a somewhat different project, we had two research assistants hand-code the docket reports for these 600 cases. As a result of their coding, they identified 381 cases out of the 600 as having class-related activity. The remaining cases had one of the phrases listed above in a docket entry for other reasons (e.g., a boilerplate scheduling order that mentions how a court handles class action process). We investigated which phrases had induced a match on our initial search string, and we designed a more complex set of search phrases that would eliminate the false positives in our hand-coded set of cases without also eliminating any of the true positives. This process resulted in selecting all cases with at least one docket entry that included any of the following strings (terms in brackets are allowed to be present or not, and “GRANT|DENY” means either “GRANT” or “DENY” is allowed to be present):

  1. MOTION TO CERTIFY [A ]CLASS

  2. MOTION FOR CLASS CERTIFICATION

  3. CERTIFICATION OF [A ]CLASS

  4. PUTATIVE CLASS

  5. PROPOSED CLASS

  6. SUBCLASS

  7. CLASS ACTION

  8. CLASS CERTIFICATION MOTION

  9. ORDER TO [GRANT |DENY ]CLASS CERTIFICATION

  10. CLASS CLAIMS

  11. CLASS ALLEGATIONS

We identified a total of 58,065 cases that had a docket entry matching at least one of these phrases.
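The matching rule above can be sketched in Python. This is a reconstruction for illustration, not the code we actually ran: the pattern list mirrors items 1–11, with bracketed terms rendered as optional regex groups and “GRANT|DENY” as an alternation.

```python
import re

# Reconstruction of the docket-entry phrase matcher described above.
# Bracketed terms in the phrase list are optional; "GRANT|DENY" is an
# alternation inside an optional group.
CLASS_PATTERNS = [
    r"MOTION TO CERTIFY (?:A )?CLASS",
    r"MOTION FOR CLASS CERTIFICATION",
    r"CERTIFICATION OF (?:A )?CLASS",
    r"PUTATIVE CLASS",
    r"PROPOSED CLASS",
    r"SUBCLASS",
    r"CLASS ACTION",
    r"CLASS CERTIFICATION MOTION",
    r"ORDER TO (?:GRANT |DENY )?CLASS CERTIFICATION",
    r"CLASS CLAIMS",
    r"CLASS ALLEGATIONS",
]
CLASS_RE = re.compile("|".join(f"(?:{p})" for p in CLASS_PATTERNS))

def is_putative_class_entry(entry_text: str) -> bool:
    """True if a docket entry matches any of the search phrases above."""
    return CLASS_RE.search(entry_text.upper()) is not None
```

Applying such a matcher to every docket entry in every case, and keeping any case with at least one match, yields the initial selection described in this section.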

Our original plan was to write Python code that would identify whether each docket entry involved a motion or order related to class certification. Because docket entries are generally unstructured, we were unable to create a sufficiently reliable approach to this task, and our project stalled until we switched to using artificial intelligence methods.

B Human-Coding of Docket Entries in Cases with Class-Action Related Activity, to Be Used for Machine-Based Classification

Because we had been focused on different questions when working with the research assistants mentioned above, we did not ask them to code each docket entry for the presence of a motion or order related to class certification. In 2024, we therefore hand-coded all docket entries from the 381 cases that the RAs identified as involving class-related activity for the presence of motions or orders. We also coded an additional 100 cases, creating a total of 481 cases with docket entries hand-coded for motions and orders.

C Fine-Tuning with OpenAI

In the summer of 2024, we worked with a third research assistant to investigate the use of artificial intelligence methods to jumpstart our project. We considered a variety of approaches. Ultimately, the approach that best balanced ease of use and expense was to fine-tune a version of OpenAI’s GPT-4o-mini LLM, and then use that fine-tuned model to classify docket entries for the presence of motions and orders of interest. We carried out the fine-tuning using a sample of 80 % of the hand-coded docket entries as a training set, using the remaining 20 % as a test set to assess the performance of the fine-tuned model. We experimented with a variety of prompts, updating the prompt each time after observing the performance on the test set and seeing ways in which we could improve the prompt for that set. Here are the two prompts we ultimately used:

Motion-Classification Prompt

As a highly skilled law clerk, your task involves analyzing legal documents to determine if they contain requests for class action certification. Your main objectives are:

  1. **Filter Out Irrelevant Documents**: Immediately dismiss any document that begins with “CLERK’S NOTES”, “MINUTE”, or “MINUTE ENTRY”. Respond with 0 in the JSON output.

  2. **Identify Class Certification Requests**: Search for specific key phrases that signal that a party has filed for class certification. These phrases include:

    1. ‘MOTION TO CERTIFY CLASS’

    2. ‘CLASS CERTIFICATION’

    3. ‘CLASS MOTION’

Furthermore, look for mentions of the filing party using terms like ‘FILED BY PLAINTIFF’ or ‘BY’ to confirm the context of class action.

  1. **Exclude Non-Request Mentions**:

    1. Documents that use phrases like ‘IN SUPPORT OF’ or ‘OPPOSITION TO’ often indicate a response to, rather than the filing of, a class certification motion. Exclude these from being marked as requests.

**Output Requirements**:

Provide your analysis in JSON format. Use the key ‘motionToCertifyClass’. Assign the value ‘1’ to indicate the presence of a class certification request, and ‘0’ for its absence, based on the content of the document.

Order-Classification Prompt

As a highly skilled law clerk, you are responsible for analyzing legal documents to determine the presence and nature of judges’ orders concerning motion to certify class action. Here are your tasks:

  1. Check if the order is a proposed order. Look for key phrases like ‘SUBMISSION OF PROPOSED ORDER’. If it is, respond with ‘0’.

  2. Check if the document includes an order adjudicating a motion to certify class. Search for key phrases such as ‘MOTION TO CERTIFY CLASS’, ‘CLASS CERTIFICATION’, ‘CLASS ACTION’, ‘CLASS ACTION SETTLEMENT’. If no such order is present, respond with ‘0’.

  3. Differentiate motions to certify a class from other motions. Ascertain the judge’s decision on the class certification. This decision is typically indicated by the terms ‘GRANTED’, ‘DENIED’, ‘GRANTING IN PART AND DENYING IN PART’, or ‘DENYING IN PART AND GRANTING IN PART’.

Your response should be in JSON format with the key ‘orderOnCertifyClass’. The associated value should be one of the following:

  1. ‘0’ if there is no order concerning class certification, or if the order is a PROPOSED ORDER.

  2. ‘MTC Granted’ if the motion to certify class is granted.

  3. ‘MTC Denied’ if the motion to certify class is denied.

  4. ‘MTC GIP/DIP’ if the motion to certify class is partially granted. Search for ‘GRANTING IN PART AND DENYING IN PART’ and ‘DENYING IN PART AND GRANTING IN PART’.

  5. ‘MTC Denied as moot’ if the motion to certify class is denied because it was rendered moot.

  6. ‘MTC Withdrawn’ if the plaintiff has withdrawn the motion to certify class.
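The fine-tuning setup described in Section C can be illustrated with a short sketch that splits hand-coded entries 80/20 and writes them in OpenAI’s chat-format JSONL, the format the fine-tuning API expects. The data structure here – a list of (entry_text, label) pairs, with the label being the JSON string the model should emit – is hypothetical, chosen for illustration; our actual pipeline differed in its details.

```python
import json
import random

def build_finetune_files(examples, prompt, train_path, test_path,
                         train_frac=0.8, seed=0):
    """Split hand-coded docket entries into training and test sets and
    write OpenAI chat-format JSONL files for fine-tuning and evaluation.

    `examples` is a list of (entry_text, label) pairs, where label is the
    JSON string the model should emit, e.g. '{"motionToCertifyClass": 1}'.
    Returns the (train, test) example counts.
    """
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    for path, subset in [(train_path, shuffled[:cut]),
                         (test_path, shuffled[cut:])]:
        with open(path, "w") as f:
            for entry_text, label in subset:
                record = {"messages": [
                    {"role": "system", "content": prompt},
                    {"role": "user", "content": entry_text},
                    {"role": "assistant", "content": label},
                ]}
                f.write(json.dumps(record) + "\n")
    return cut, len(shuffled) - cut
```

Each JSONL line pairs the classification prompt (as the system message) with one docket entry and its hand-coded label, so the fine-tuned model learns to map entry text to the JSON output the prompts require.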

D Validation

To assess the performance of the final prompts, we hand-coded an additional set of 506 docket entries from 486 cases. To create this validation set, we selected a random sample of all docket entries whose text contained any of the following phrases:

  1. CLASS CERTIFICATION

  2. CERTIFY CLASS

  3. CERTIFY A CLASS

  4. CERTIFY THE CLASS

We picked these phrases because they were prevalent among the docket entries in our earlier hand-coded data that contained motions for or orders on class certification. Thus, these docket entries were much more likely than a randomly drawn entry to contain either a motion for or an order on class certification, increasing their relevance for assessing the performance of the fine-tuned GPT model. We note as well that a docket entry could match one of these phrases without matching any of the phrases used in our initial matching list (see Section A above).

To validate our fine-tuned GPT models, we applied the models to the validation set’s docket entries, yielding classification values for the motions and orders. We then calculated performance metrics, which we report in Table 6.

The table’s first column reports the number of docket entries that actually represented a motion for class certification, an order on class certification, an order granting class certification with respect to any aspect of the litigation, and an order denying class certification in full. The 506 docket entries that made up our validation data had 58 motions for class certification and 35 orders on class certification, of which 14 were some type of grant, 20 were a full denial, and 1 could not be categorized from the docket entry text.

The metrics indicate that our approach worked reasonably well. We have very few false positives, i.e., instances where the fine-tuned GPT models label an entry as a motion or order when it really is not. We do have a nontrivial number of false negatives, i.e., instances where the fine-tuned GPT models fail to label an entry as a motion or order when it actually is one.

The last three columns of Table 6 report results for three metrics. The first is precision: the share of entries that the GPT model labeled as motions that actually were motions; because we had no false positives, the precision was 1. The second metric is recall: the share of all actual positives (e.g., motions for class certification) that the GPT model coded as positives. The model found 45 of the 58 actual motions in the validation sample, for a recall rate of 0.78 (and thus a false-negative rate of 0.22, characterized as roughly 1 in 4 in the text). The third metric, F1, combines recall and precision. It is commonly used because high values of both precision and recall are desirable, yet there can be a tradeoff between them.[30] Our F1 score for motions was 0.87.

Table 6:

Validation-set metrics of fine-tuned GPT-4o-mini models.

Task                                      Actual   Precision   Recall   F1
Motion present                            58       1.00        0.78     0.87
Order present:
  Any                                     35a      0.93        0.77     0.84
  Any grant                               14       0.92        0.79     0.85
  Full denial (includes denials as moot)  20       0.88        0.75     0.81

a There was one order that we could not categorize as to whether it involved a grant or denial.

For orders, the performance was roughly similar on all three metrics, regardless of whether we considered grants alone, denials alone, or the full set of orders irrespective of adjudication result. Precision was a bit greater than 0.9, recall was between 0.75 and 0.8 (corresponding to false-negative rates of between 0.2 and 0.25), and the F1 score was between 0.8 and 0.85.
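The metrics in Table 6 follow the standard definitions, which can be written out as a short sketch; the counts below reproduce the motion row from the figures reported above (45 true positives, 13 false negatives, no false positives).

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard classification metrics, as reported in Table 6:
    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Motion row of Table 6: 45 of 58 actual motions found, no false positives.
p, r, f1 = precision_recall_f1(tp=45, fp=0, fn=13)
# round(p, 2) == 1.0, round(r, 2) == 0.78, round(f1, 2) == 0.87
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two component metrics, which is why our F1 scores sit closer to recall than to precision.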

E Augmenting the Initial Data

We found that some docket entries that included the phrases used to select our validation data set came from cases that were not included in the initial set of putative class action cases described in Appendix Section A. As noted supra, our initial match routine led to a set of 58,065 cases. Of these, 27,126 cases had at least one docket entry that matched one or more of the phrases that we used to create the validation data set described in Appendix Section D. An additional 2,310 cases had docket entries with text that matched the phrases used to create the validation set, but were not in the initial set of putative class action cases.

We augmented our set of cases with putative class action activity by adding docket entries from these 2,310 supplemental cases, so that the total number of cases was then 60,375. Of these, 3,445 cases had a metadata-based filing date that indicated they were filed before 2005. Our assumption was that these cases really were filed before 2005, and were inadvertently included in the data feed Thomson Reuters originally provided us. We dropped all these cases, which led to a set of 56,930 cases. We dropped an additional 58 cases that had a first docketed entry that occurred before 2005, leaving a set of 56,872 cases.

F Deletions for FLSA Cases

Claims brought under the Fair Labor Standards Act of 1938 via 29 U.S.C. § 216(b) are often described as “collective actions,” and docket entry text in such cases might cause GPT to classify entries as involving motions for class certification or orders to certify. These FLSA actions require collective action group members to opt in, in stark contrast to both the Rule 23(b)(3) opt-out and the (b)(1)–(2) mandatory-class frameworks. Consequently we regard them as very different from the putative class actions that are the object of our study, so we exclude cases with any docket entry that contains the string “216(B)” or “216B”. We note that this approach also excludes hybrid FLSA/Rule 23 actions.

We dropped 1,728 cases because they had the string “216(B)” or “216B” in at least one docket entry. Of these, about a third – 580 – have a docket entry that both has one of the 216b-related strings and is labeled by our GPT model as a motion to certify a class, suggesting that these are likely hybrid actions involving both a FLSA collective action and a Rule 23 class action. Another 599 cases with the 216b-string had a GPT-coded motion for class certification but no single docket entry that both matched our 216b search and was labeled by our GPT model as a motion for class certification. Some of these cases might have been hybrids, but without more intensive assessment we cannot tell. Finally, there were 549 (1728-580-599) cases with a match to the 216b-related string but no GPT-labeled motion for class certification.

After deleting the 1,728 cases just described, we were left with 55,144 cases for our analysis set.
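The FLSA exclusion rule just described amounts to a simple string filter over each case’s docket entries. A minimal sketch, assuming a hypothetical data layout in which a case identifier maps to a list of docket-entry strings:

```python
def drop_flsa_cases(cases: dict) -> dict:
    """Remove cases with any docket entry containing '216(B)' or '216B'
    (the FLSA collective-action provision, 29 U.S.C. § 216(b)).

    `cases` maps a case identifier to a list of docket-entry strings;
    this structure is illustrative, not the paper's actual data format.
    """
    def mentions_216b(entries):
        return any("216(B)" in e.upper() or "216B" in e.upper()
                   for e in entries)
    return {cid: entries for cid, entries in cases.items()
            if not mentions_216b(entries)}
```

A single matching entry anywhere in a case’s docket is enough to drop the whole case, which is deliberately conservative given that hybrid FLSA/Rule 23 actions are excluded along with pure collective actions.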

G Quality-Control Issues

1 Cases where GPT Indicates Order is Present but Motion is Not

Consider Cejas v. Blanas (EDCA, No. 2:05-cv-01799), which was a prisoner’s civil rights action brought under 42 U.S.C. § 1983. There is no motion to certify a class docketed in this case, and our GPT model properly did not code any entry as such. However, the text of docket entry number 42 is as follows:

FINDINGS AND RECOMMENDATIONS SIGNED BY JUDGE GREGORY G. HOLLOWS ON 10/2/07 RECOMMENDING THAT PLTF’S REQUEST TO HAVE THIS ACTION CERTIFIED AS A CLASS ACTION, CONTAINED IN THE AMENDED COMPLAINT FILED 8/23/07, BE DENIED. CASE REFERRED TO JUDGE KARLTON. WITHIN 20 DAYS AFTER BEING SERVED WITH THESE FINDINGS, PLTF MAY FILE WRITTEN OBJECTIONS WITH THE COURT. (KASTILAHN, A) (ENTERED: 10/03/2007)

Our GPT model coded this entry, docketed by Magistrate Judge Hollows, as a non-MTC and a non-OOC, as it should have: it did not represent a party-docketed motion for class certification, and it was only a recommendation, rather than an order filed by a District Judge, or by a Magistrate with the authority to issue a binding order. However, two months later, the District Judge in the case issued an order docketed in the following entry:

ORDER ADOPTING FINDINGS AND RECOMMENDATIONS IN FULL 42 SIGNED BY JUDGE LAWRENCE K. KARLTON ON 12/6/07: THE PLAINTIFF’S REQUEST TO HAVE THIS ACTION BE CERTIFIED AS A CLASS ACTION IS DENIED. (KAMINSKI, H) (ENTERED: 12/06/2007)

Thus, Judge Karlton construed the plaintiff’s amended complaint as including a request for class certification, which the judge then denied. Our GPT model properly coded this docket entry as representing an order on class certification.

Thus, the case of Cejas v. Blanas is one where all of the following were true:

  1. The case involved class-related litigation activity, because at least one complaint sought class relief.

  2. There was never a motion for class certification filed.

  3. There was an order addressing the issue of class certification.

2 Cases where the First Motion for Class Certification has a Docket Date that Comes before the Case Filing Date

Consider the case Stepp v. Haley (SDAL, No. 1:05-cv-00303). The docket report’s metadata indicates that this case was filed on June 24, 2005. However, docket entry 7, which our GPT model flagged as a motion for class certification, has the date June 14, 2005 – 10 days earlier than the listed case filing date. The text of the docket entry was: “REQUEST/MOTION FOR CLASS ACTION CERTIFICATION BY MICHAEL STEPP. REFERRED TO JUDGE CASSADY/PSU(SRR), (ENTERED: 06/15/2005)”. Thus, our model did the desired thing by labeling this entry as involving a motion to certify a class, and it is clear from the date in the text of the entry that it was docketed before the listed filing date for the case. The first docketed entry in the case is a complaint, whose own text indicates it was entered into the docket on May 20, 2005, more than a month before the metadata indicates the case was filed.

A bit more than 10% of cases had such inconsistencies in our data: there were 5,881 in which the first entry’s docketed date was before the filing date. Nor is this an artifact of, say, the metadata being off by just a day or even a few: 5,332 cases had a first-entry docket date that was more than 9 days before the metadata filing date, and 2,548 cases had a first-entry docket date that was more than 90 days before that. We retained all these cases, and throughout this paper, we use the date of the first-docketed entry rather than the metadata filing date, where these differ.
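The consistency check underlying these counts reduces to a date comparison. A minimal sketch (the function name is ours, for illustration):

```python
from datetime import date

def date_inconsistency(metadata_filed: date, first_entry: date) -> int:
    """Number of days by which a docketed entry precedes the metadata
    filing date (0 if it does not precede it), per the checks above.
    Cases with a positive gap are retained; throughout the paper, the
    first-docketed entry's date is used in place of the metadata filing
    date where the two differ."""
    return max((metadata_filed - first_entry).days, 0)

# Stepp v. Haley: metadata filing date 2005-06-24, motion docketed
# 2005-06-14, a 10-day gap.
```

Thresholding this gap at 9 and 90 days yields the 5,332- and 2,548-case counts reported above.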

H Nature of Suit Categories

The following table lists the PACER nature of suit codes that we used to construct the case-type categories used in this paper.

Our case type name                  PACER nature of suit code
Antitrust                           410
Securities                          850
Personal injury & property damage   310, 355, 260, 365, 380, 385
Consumer-contract                   195, 370, 371
Contract-other                      110, 120, 190
Civil rights                        420, 440, 443, 530, 550, 555
Labor                               720, 790, 791
Patent                              830
Environmental                       893
Other                               All other PACER nature of suit codes

Received: 2025-01-17
Accepted: 2025-01-22
Published Online: 2025-04-07
Published in Print: 2024-10-28

© 2025 Walter de Gruyter GmbH, Berlin/Boston
