Abstract
Fixed effects estimation, with linear controls for stratum membership, is often used to estimate treatment effects when assignment propensities differ across strata. When treatment effects are heterogeneous across strata, however, this estimator does not target the average treatment effect. Indeed, the implied estimand can range anywhere from the lowest to the highest stratum-level average effect. To facilitate the interpretation of results using this approach, I establish that if stratum-level average effects are monotonic in the shares assigned to treatment, then the fixed effects estimand lies between the average treatment effect for the treated and the average treatment effect for the controls.
1 Introduction
Consider a setting in which study units belong to a collection of strata. Share
The need to condition in this way is common in both experimental and observational studies. In experimental work, it arises if researchers employ block randomization with different probabilities within blocks or if they employ multiple treatments with correlated probabilities [2]. It can also arise if they are interested in spillover or network effects, where the probability of exposure to spillovers can vary across units even though the direct treatment is randomly assigned [3]. In observational work, it arises, for instance, if individuals self-select into treatment on the basis of observable characteristics [4].
In such settings – if assignment propensities are known – there are multiple procedures for generating unbiased estimates of average treatment effects. Effects can be estimated within each stratum and then averaged [5, section 6.1]. Unbiased estimates can also be generated using matching [6], treatment interactions [7], propensity weighting [3], or doubly robust approaches [8].
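The first of these procedures, estimating effects within each stratum and then averaging, can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `blocked_ate`, the array names, and the toy data are all assumptions introduced here, and strata are weighted by their share of units.

```python
import numpy as np

def blocked_ate(y, d, stratum):
    """Average stratum-level differences in means, weighting by stratum size.

    y: outcomes, d: 0/1 treatment indicator, stratum: stratum labels.
    All names are illustrative and not taken from the article.
    """
    y, d, stratum = map(np.asarray, (y, d, stratum))
    n = len(y)
    estimate = 0.0
    for s in np.unique(stratum):
        in_s = stratum == s
        treated = in_s & (d == 1)
        control = in_s & (d == 0)
        # Each stratum needs at least one treated and one control unit.
        if not treated.any() or not control.any():
            raise ValueError(f"stratum {s!r} lacks treated or control units")
        diff = y[treated].mean() - y[control].mean()
        estimate += in_s.sum() / n * diff
    return estimate

# Hypothetical toy data: two equal-sized strata with different shares treated.
stratum = np.array([0, 0, 0, 0, 1, 1, 1, 1])
d = np.array([1, 0, 0, 0, 1, 1, 1, 0])
y = np.array([3.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 4.0])
toy_estimate = blocked_ate(y, d, stratum)
print(toy_estimate)
```

In the toy data, the stratum-level differences in means are 2 and 1, so the size-weighted average is 1.5.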
In practice, however, a common strategy is to use ordinary least squares (OLS) to estimate
where
If there are heterogeneous effects, however, estimates from this procedure are prone to bias [11]. Less well understood is when these biases arise and how important they are likely to be, with contributions by Słoczyński [9], discussed below, a notable exception.
In this article, I address this interpretive challenge. I identify conditions under which the fixed effects estimand – the quantity implicitly targeted by least squares estimation of equation (1) – is “close” to causal quantities of interest.
In addition, I provide a proposition that establishes that if the share of units assigned to treatment in each stratum is monotonic in stratum average treatment effects, then the fixed effects estimand is bounded by the expected average treatment effect for the controls and the expected average treatment effect for the treated.
The utility of this result depends on the plausibility of monotonicity between assignments and treatment effects.
Monotonic relations are guaranteed if there are just two strata. They may also arise, however, if both treatment effects and assignment propensities reflect some systematic feature of units. For instance, under Roy selection [12], units are more likely to opt into treatment if they expect benefits. Indeed, experimental design might deliberately select assignment probabilities to reflect expected benefits [13]. More subtle logics might also imply monotonicity. For instance, relatively popular children – with more network connections – might be more likely to be indirectly exposed to an antibullying treatment that has been randomly assigned to children, yet less likely to benefit from it [14]. In experiments to study in-group cooperation that use random pairing between individuals, individuals from larger groups have a larger propensity to be matched with in-group partners, but larger groups might also display different levels of in-group cooperation on average [15].
Absent monotonicity, the fixed effects estimator may target an estimand very far from standard estimands of interest.
2 Setup
Let
Employing the potential outcomes framework [1], let
The outcome for a given unit is a random variable given by
I consider the following (sample) estimands:
where
Here,
Now consider an estimate of treatment effects resulting from using OLS to regress the outcome on treatment and a set of indicator variables for each of the strata. In this case, fixed effects estimation returns a weighted average of the estimates of stratum-level treatment effects:
Derivations for this expression are provided in Theorem 5 in Ding [16] and equation (2) in Goldsmith-Pinkham et al. [2], both using the Frisch–Waugh–Lovell theorem. In addition, I provide a direct proof in the supplementary materials (S1).
Observe that the weights in equation (7) reflect the variance in treatment assignment within each stratum, not the share treated, and may be increasing or decreasing in the share treated.
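The weighting can be checked numerically. The sketch below, under assumed names and simulated data that are not part of the article, compares the coefficient on treatment from OLS with stratum indicators against a weighted average of stratum-level differences in means, with each stratum weighted by its size times p(1 - p), where p is the sample share treated in that stratum.

```python
import numpy as np

def fe_ols_coefficient(y, d, stratum):
    """Coefficient on d from OLS of y on d and one indicator per stratum."""
    labels = np.unique(stratum)
    dummies = (stratum[:, None] == labels[None, :]).astype(float)
    X = np.column_stack([d.astype(float), dummies])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0]

def variance_weighted_average(y, d, stratum):
    """Stratum differences in means, weighted by n_s * p_s * (1 - p_s)."""
    num = den = 0.0
    for s in np.unique(stratum):
        in_s = stratum == s
        p = d[in_s].mean()  # sample share treated in stratum s
        diff = y[in_s & (d == 1)].mean() - y[in_s & (d == 0)].mean()
        w = in_s.sum() * p * (1 - p)
        num += w * diff
        den += w
    return num / den

def assign(n, k, rng):
    """Completely randomize k of n units to treatment."""
    d = np.zeros(n, dtype=int)
    d[rng.choice(n, size=k, replace=False)] = 1
    return d

# Hypothetical design: three strata with different shares treated and effects.
rng = np.random.default_rng(0)
sizes, n_treated, effects = [40, 40, 40], [4, 20, 36], [0.0, 1.0, 3.0]
stratum = np.repeat(np.arange(3), sizes)
d = np.concatenate([assign(n, k, rng) for n, k in zip(sizes, n_treated)])
y = rng.normal(size=stratum.size) + np.array(effects)[stratum] * d

fe = fe_ols_coefficient(y, d, stratum)
vw = variance_weighted_average(y, d, stratum)
print(fe, vw)
```

The two numbers agree up to floating point error. In this design the first and third strata receive the same weight, although their shares treated are 0.1 and 0.9.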
The estimator is unbiased for the following estimand (see equation (9) in [11] for the two-stratum case):
Here, the second equality follows from the assumption that
We can see from this that since least squares weights can take any value between 0 and 1 for any stratum, depending only on the values taken by the collection
Thus, as a general matter, there is no reason to expect that the least squares estimand is close to
Example 1
For a dramatic illustration, consider a case with three equal-sized strata (
This case has strong symmetry in treatment and control. Half the units are in treatment, and half are in control. The variation in propensities is the same in both groups. And
The example can also be used to illustrate a more subtle point: biases can arise even if all units have identical assignment propensities if the shares assigned to treatment are, nevertheless, heterogeneous. Consider a variation of this example induced by a “randomized saturation design” [17], in which there is a prior randomization to determine whether share
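The logic of such an example can be reproduced with a short calculation. The numbers below are hypothetical, chosen only to mimic the symmetry described above (three equal-sized strata, half of all units treated), and are not necessarily those used in Example 1. The function `estimands` and its argument names are likewise assumptions.

```python
import numpy as np

def estimands(shares, p, tau):
    """Population estimands as weighted averages of stratum effects tau.

    shares: stratum shares of units, p: assignment propensities,
    tau: stratum-level average effects. Names are illustrative.
    """
    shares, p, tau = (np.asarray(a, dtype=float) for a in (shares, p, tau))

    def wavg(w):
        return float(np.sum(w * tau) / np.sum(w))

    return {
        "ATE": wavg(shares),
        "ATT": wavg(shares * p),
        "ATC": wavg(shares * (1 - p)),
        "FE": wavg(shares * p * (1 - p)),
    }

# Hypothetical values: effects are NOT monotonic in the propensities.
ex = estimands(shares=[1 / 3] * 3, p=[0.1, 0.5, 0.9], tau=[0.0, 1.0, 0.0])
print({name: round(value, 3) for name, value in ex.items()})
```

With these values, the average effect is 1/3 overall, among the treated, and among the controls, yet the fixed effects estimand is 0.25/0.43, or about 0.58, because the middle stratum has the largest assignment variance.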
3 Results
Inspection of equations (4), (5) and (7) suggests three cases in which
Proposition 1 establishes that if the shares of units assigned to treatment are monotonic in within-stratum treatment effects, then
Proposition 1
If for all
Proof
Consider the case in which
We have
Equivalently (see Supplementary materials (S2)):
Note that the quantity in parentheses in equation (9) can be positive or negative. More specifically, defining
Exploiting monotonicity, let
which we know to be true because
The proof for
A number of considerations are of interest with regard to this result.
First, monotonicity is not a necessary condition for
Second, while monotonicity ensures that
Third, an analogous statement holds for sample statistics. Defining
Finally, there are fruitful connections here with findings in the study by Słoczyński [9]. Słoczyński [9] identified
See Supplementary materials (S3) for intermediate steps.
This weight admits a substantive interpretation. Quantity
Linearity is a stronger assumption than monotonicity, however, and if only monotonicity can be defended, then the weighted quantities in Słoczyński [9] lose their connection to causal estimands. Proposition 1 can still be used in that case.
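Proposition 1 can be probed numerically. The sketch below is a sanity check under stated assumptions, not a proof: it draws random stratum shares, propensities strictly between 0 and 1, and stratum effects that are monotonic (increasing or decreasing) in the propensities, and confirms that the fixed effects estimand falls between the effect for the treated and the effect for the controls in every draw. All function and variable names are hypothetical.

```python
import numpy as np

def weighted_effect(weights, tau):
    """Weighted average of stratum-level effects."""
    return float(np.sum(weights * tau) / np.sum(weights))

def within_bounds(shares, p, tau, tol=1e-9):
    """True if the FE estimand lies between the ATT and the ATC."""
    att = weighted_effect(shares * p, tau)
    atc = weighted_effect(shares * (1 - p), tau)
    fe = weighted_effect(shares * p * (1 - p), tau)
    return min(att, atc) - tol <= fe <= max(att, atc) + tol

rng = np.random.default_rng(1)
results = []
for _ in range(1000):
    k = int(rng.integers(2, 8))                  # number of strata
    shares = rng.dirichlet(np.ones(k))           # stratum shares of units
    p = np.sort(rng.uniform(0.05, 0.95, size=k)) # increasing propensities
    tau = np.sort(rng.normal(size=k))            # effects increasing in p
    if rng.random() < 0.5:
        tau = tau[::-1]                          # or decreasing in p
    results.append(within_bounds(shares, p, tau))

all_hold = all(results)
print(all_hold)
```

Shuffling `tau` rather than sorting it breaks monotonicity, and the bounds can then fail, as in the non-monotonic three-stratum configuration with propensities (0.1, 0.5, 0.9) and effects (0, 1, 0).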
4 Conclusion
Researchers commonly use covariate adjustment to account for known variation in treatment assignment propensities. This situation can arise in both observational and experimental studies.
A common analysis strategy in such cases is to regress outcomes on treatment using a set of controls entered additively. A flexible version of this approach, which I focus on here, is one in which researchers use fixed effects specifications to seek to capture variation in assignment propensities.
This approach is unfortunately not guaranteed to produce unbiased estimates of the average treatment effect. Moreover, it is not well understood how estimates generated in this manner diverge from
For this reason, this approach should, in general, be avoided. And fortunately, there are multiple ways to generate estimates of average treatment effects in this setting. Most simply, equation (3) can be used to estimate within-stratum effects; a weighted average of these will be unbiased for
Despite the availability of these alternatives, using fixed effects to address assignment heterogeneity remains common, as documented recently in Gibbons et al. [10]. If users are unable to access data and re-estimate effects correctly, rules of thumb become useful to help interpret reported findings. A number are provided here. First, for “rare” treatments, the least squares estimand lies close to the average treatment effect for the treated; for “common” treatments, it lies close to the average treatment effect for the controls. Second, if propensity variance is similar across strata, then the OLS estimand lies close to the ATE, even if actual propensities diverge. Third, when a monotonicity condition is satisfied, the fixed effects estimand lies between the average treatment effect for the treated and the average treatment effect for the controls.
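The first two rules of thumb can be illustrated with population-level calculations. The stratum shares, propensities, and effects below are hypothetical, and the helper `estimands` is an assumption introduced for this sketch.

```python
import numpy as np

def estimands(shares, p, tau):
    """ATE, ATT, ATC and FE estimands as weighted averages of tau."""
    shares, p, tau = (np.asarray(a, dtype=float) for a in (shares, p, tau))

    def wavg(w):
        return float(np.sum(w * tau) / np.sum(w))

    return {
        "ATE": wavg(shares),
        "ATT": wavg(shares * p),
        "ATC": wavg(shares * (1 - p)),
        "FE": wavg(shares * p * (1 - p)),
    }

shares = [0.5, 0.3, 0.2]   # hypothetical stratum shares
tau = [1.0, 2.0, 4.0]      # hypothetical stratum-level effects

# Rule 1a: "rare" treatment, so p(1 - p) is roughly p and FE is near the ATT.
rare = estimands(shares, [0.01, 0.02, 0.05], tau)
# Rule 1b: "common" treatment, so p(1 - p) is roughly 1 - p and FE is near the ATC.
common = estimands(shares, [0.99, 0.98, 0.95], tau)
# Rule 2: propensities differ (0.2 vs 0.8) but p(1 - p) is equal, so FE equals the ATE.
equal_var = estimands(shares, [0.2, 0.8, 0.2], tau)

for label, res in [("rare", rare), ("common", common), ("equal variance", equal_var)]:
    print(label, {name: round(value, 3) for name, value in res.items()})
```

In the rare-treatment case, the fixed effects estimand is about 2.69 against an effect for the treated of about 2.71 and an effect for the controls of about 1.88; the common-treatment case mirrors this.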
Acknowledgement
My thanks to Winston Lin and to Craig McIntosh, Marion Dumas, Andy Gelman, Joshua Angrist, Kosuke Imai, Laura Paler, Neelan Sircar, and Guido Imbens for generous comments on an earlier version of this manuscript.
Funding information: The author states no funding involved.
Author contribution: The author confirms the sole responsibility for the conception of the study, presented results and manuscript preparation.
Conflict of interest: The author states no conflict of interest.
Data availability statement: No data are used in this research.
References
[1] Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.
[2] Goldsmith-Pinkham P, Hull P, Kolesár M. Contamination bias in linear regressions. Cambridge, MA: National Bureau of Economic Research; 2022. doi:10.3386/w30108
[3] Aronow P, Samii C. Estimating average causal effects under general interference, with application to a social network experiment. Ann Appl Stat. 2017;11(4):1912–47. doi:10.1214/16-AOAS1005
[4] Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41–55. doi:10.1093/biomet/70.1.41
[5] Duflo E, Glennerster R, Kremer M. Chapter 61 Using randomization in development economics research: A toolkit. In: Schultz TP, Strauss JA, editors. Handbook of development economics. vol. 4 of Handbook of Development Economics. Amsterdam: Elsevier; 2007. p. 3895–962. doi:10.1016/S1573-4471(07)04061-2
[6] Ho DE, Imai K, King G, Stuart EA. Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Polit Anal. 2007;15(3):199–236. doi:10.1093/pan/mpl013
[7] Lin W. Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique. Ann Appl Stat. 2013;7(1):295–318. doi:10.1214/12-AOAS583
[8] Bang H, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics. 2005;61(4):962–73. doi:10.1111/j.1541-0420.2005.00377.x
[9] Słoczyński T. Interpreting OLS estimands when treatment effects are heterogeneous: Smaller groups get larger weights. Rev Econ Stat. 2022;104(3):501–9. doi:10.1162/rest_a_00953
[10] Gibbons CE, Suárez Serrato JC, Urbancic MB. Broken or fixed effects? J Econ Methods. 2019;8(1):20170002. doi:10.1515/jem-2017-0002
[11] Angrist JD. Estimating the labor market impact of voluntary military service using social security data on military applicants. Econometrica. 1998 March;66(2):249–88. doi:10.2307/2998558
[12] Heckman JJ, Taber C. Roy model. In: Durlauf SN, Blume LE, editors. Microeconometrics. London: Palgrave Macmillan; 2009. p. 221–8. doi:10.1057/9780230280816_27
[13] Chassang S, Padró i Miquel G, Snowberg E. Selective trials: A principal-agent approach to randomized controlled experiments. Amer Econ Rev. 2012;102(4):1279–309. doi:10.1257/aer.102.4.1279
[14] Paluck EL, Shepherd H. The salience of social referents: a field experiment on collective norms and harassment behavior in a school social network. J Personality Soc Psychol. 2012;103(6):899. doi:10.1037/a0030015
[15] Habyarimana J, Humphreys M, Posner DN, Weinstein JM. Why does ethnic diversity undermine public goods provision? Amer Polit Sci Rev. 2007;101(4):709–25. doi:10.1017/S0003055407070499
[16] Ding P. The Frisch-Waugh-Lovell theorem for standard errors. Stat Probabil Lett. 2021;168:108945. doi:10.1016/j.spl.2020.108945
[17] Baird S, Bohren JA, McIntosh C, Özler B. Optimal design of experiments in the presence of interference. Rev Econ Stat. 2018;100(5):844–60. doi:10.1162/rest_a_00716
[18] Blair G, Cooper J, Coppock A, Humphreys M, Sonnet L. estimatr: Fast Estimators for Design-Based Inference; 2024. R package version 1.0.2. https://github.com/DeclareDesign/estimatr.
[19] Blair G, Coppock A, Humphreys M. The trouble with controlling for blocks; 2018. Accessed: 2024-10-14. https://declaredesign.org/blog/posts/biased-fixed-effects.html.
[20] Bernstein DS. Matrix mathematics: theory, facts, and formulas. Princeton: Princeton University Press; 2009. doi:10.1515/9781400833344
© 2025 the author(s), published by De Gruyter
This work is licensed under the Creative Commons Attribution 4.0 International License.