Article Open Access

Entropy Balancing for Continuous Treatments

  • Stefan Tübbicke
Published/Copyright: December 15, 2021

Abstract

Interest in evaluating the effects of continuous treatments has been on the rise recently. To facilitate the estimation of causal effects in this setting, the present paper introduces entropy balancing for continuous treatments (EBCT) – an intuitive and user-friendly automated covariate balancing scheme – by extending the original entropy balancing methodology of Hainmueller, J. 2012. “Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies.” Political Analysis 20 (1): 25–46. In order to estimate balancing weights, the proposed approach solves a globally convex constrained optimization problem, allowing for computationally efficient software implementation. EBCT weights reliably eradicate Pearson correlations between covariates (and their transformations) and the continuous treatment variable. As uncorrelatedness may not be sufficient to guarantee consistent estimates of dose–response functions, EBCT also allows the researcher to render higher moments of the treatment variable uncorrelated with covariates to mitigate this issue. Empirical Monte-Carlo simulations suggest that treatment effect estimates using EBCT display favorable properties in terms of bias and root mean squared error, especially when balance on higher moments of the treatment variable is sought. These properties make EBCT an attractive method for the evaluation of continuous treatments. Software implementation is available for Stata and R.

JEL Classification: C14; C21; C87

1 Introduction

Methods for balancing covariate distributions have become essential tools to flexibly control for confounding due to observed covariates. While binary treatments continue to be the most common case encountered in practice, situations in which all units receive some treatment with different intensity or dose are also pervasive in economics and other disciplines. Hence, the evaluation of such continuous treatments has gained more attention recently. Examples include the evaluation of job training programs with varying duration (Choe, Flores-Lagunes, and Lee 2015; Flores et al. 2012; Kluve et al. 2012; Li and Fraser 2015) or quality (Galdo and Chong 2012) and subsidies of different magnitude to firms or entire regions (Becker, Egger, and von Ehrlich 2012; Bia and Mattei 2012; Mitze, Paloyo, and Alecke 2015).

Since the seminal papers by Hirano and Imbens (2004), Imai and van Dyk (2004) and Robins, Hernán, and Brumback (2000), the literature has seen an enormous growth in available methods based on the generalized propensity score (GPS, Imbens 2000). Available methods use outcome modeling (Callaway and Huang 2020; Hirano and Imbens 2004; Zhao, van Dyk, and Imai 2020), stratification (Graham, McCoy, and Stephens 2015; Imai and van Dyk 2004), matching (Wu et al. 2020), weighting (Ai, Linton, and Zhang 2021; Fong, Hazlett, and Imai 2018; Huber et al. 2020; Naimi et al. 2014; Yiu and Su 2018), machine learning (Colangelo and Lee 2020; Kennedy et al. 2017; Kreif et al. 2015; Su, Ura, and Zhang 2019; Zhu, Coffman, and Ghosh 2015) or a combination thereof. Unfortunately, the GPS is notoriously difficult to estimate and many methods based on the GPS may require an iterative estimation procedure until satisfactory balance is achieved. Against this background, this paper makes a number of contributions to the literature.

First, it extends the entropy balancing approach by Hainmueller (2012) for the estimation of balancing weights from the binary treatment framework to the context of continuous treatments. The proposed approach, called entropy balancing for continuous treatments (EBCT), provides a practitioner-friendly automated covariate balancing scheme that obviates the need to estimate the GPS. It solves a globally convex optimization problem and obtains balancing weights by minimizing the deviation from (uniform) base weights subject to zero-correlation and normalization constraints, allowing for highly efficient software implementation.[1] In its basic form, EBCT simply renders the treatment variable uncorrelated with covariates. While providing an intuitive approach to covariate balancing, this may be insufficient to remove bias due to observed covariates when estimating the effects of a continuous treatment. To curb this potential bias, it may be worthwhile to extend the uncorrelatedness conditions to higher moments of the treatment variable (Yiu and Su 2018). Hence, EBCT allows the researcher to obtain balancing weights that re-weight the sample such that higher orders of the treatment variable are also uncorrelated with covariates.

Second, the empirical application regarding the effects of smoking on medical expenditures based on the National Medical Expenditure Survey data highlights that, indeed, there may remain strong non-linear associations between covariates and the treatment variable despite them being uncorrelated in the re-weighted dataset. On the one hand, this underlines the importance of flexible balance checking even when using automated covariate balancing techniques for continuous treatments. On the other hand, it provides the motivation for inspecting how the performance of DRF estimators based on EBCT is affected when higher moments of the treatment variable are also rendered uncorrelated with covariates.

Lastly, the paper provides credible simulation evidence on the performance of effect estimates based on EBCT and comparison methods using parametric and non-parametric regression techniques. To make the data-generating process as realistic as possible, it is designed as an empirical Monte-Carlo experiment (Huber, Lechner, and Wunsch 2013) using the same data as in the application. Thus, the simulation features possibly highly complex functional relationships between covariates, the treatment variable and the outcome. This makes it possible to provide credible evidence on the performance of the different estimation procedures. The simulation results suggest that treatment effect estimates based on EBCT indeed display favorable finite sample performance in terms of bias and root mean squared error, especially when balance on higher powers of the treatment variable is sought. Moreover, non-parametric estimation techniques seem to provide some insurance against bias due to insufficient covariate balance.

This paper is closely related to concurrent research by Vegetabile et al. (2021). Although they also investigate the performance of entropy balancing in the context of continuous treatments, the contribution of this paper is different in a number of important ways. First, while Vegetabile et al. (2021) allow for the possibility to render higher moments of the treatment variable uncorrelated with covariates, they do not present evidence on how this affects the performance of resulting effect estimates. Second, this paper goes beyond commonly-employed balancing statistics based on simple uncorrelatedness which may be insufficient to detect meaningful covariate imbalances in higher orders of the treatment variable. Lastly, this paper provides arguably more realistic evidence on estimators’ performance by the use of empirical Monte-Carlo simulations. Despite these differences, their contributions will be referred to throughout the paper where necessary.

The remainder of this paper is organized as follows. Section 2 introduces the reader to causal effects of continuous treatments in the potential outcomes framework, necessary identifying assumptions as well as some selected estimation procedures based on the GPS. Section 3 outlines the details of the proposed EBCT method and provides some guidance on specification issues. Section 4 takes EBCT and GPS-based comparison methods to the data by performing empirical Monte-Carlo simulations to obtain evidence on performance of estimation procedures and by estimating the effects of smoking on medical expenditures. Section 5 concludes.

2 Causal Effects of Continuous Treatments

Before discussing identification and estimation methods, it is useful to review causal effects in the context of continuous treatments in terms of the potential outcomes framework, mainly attributed to Roy (1951) and Rubin (1974). Following the notation of Imbens (2000) and Hirano and Imbens (2004), let us assume that we observe an i.i.d. sample of N individuals i with a vector of pre-treatment covariates $X_i \in \mathbb{R}^K$, where K is the number of covariates. Furthermore, we have information on a post-treatment outcome $Y_i$ and some treatment received with a certain intensity measured by $T_i$ with possible values in $\mathcal{T}$. The potential outcomes are given by $Y_i(t)$ – also often called the unit-dose response – denoting the outcome that would have been observed had the unit received treatment with intensity t. Aggregating these unit-level responses leads to the dose–response function (DRF) $E[Y_i(t)]$. Along with its derivative $dE[Y_i(t)]/dt$, the DRF represents the key relationship to be estimated in practice. If treatment intensities were randomly assigned, comparisons of average outcomes between individuals with different treatment intensities would directly give consistent estimates of these quantities. Unfortunately, this is mostly not the case, not even in experimental settings. Hence, in the absence of some other exogenous variation in treatment assignment through a natural experiment, the following three identifying assumptions need to be invoked in order to obtain consistent estimates in observational studies.[2]

2.1 Identifying Assumptions

First, conditional on observed pre-treatment covariates X, potential outcomes must be independent of the treatment intensity received, i.e.

(1) $Y_i(t) \perp\!\!\!\perp T_i \mid X_i \quad \forall\, t \in \mathcal{T}.$

This assumption is called the conditional independence assumption (CIA, Lechner 2001), also known as the selection-on-observables assumption (Heckman and Robb 1985), and requires that the researcher observes all covariates X that simultaneously determine the selection into different treatment intensities as well as the outcome of interest. The CIA is potentially a very strong assumption. Hence, it needs to be discussed on a case-by-case basis for the application at hand and requires substantive knowledge about the selection mechanism at play. The second assumption requires there to be common support, i.e. the conditional density of treatment needs to be positive over $\mathcal{T}$:

(2) $f_{T\mid X}(T = t \mid X_i) > 0 \quad \forall\, t \in \mathcal{T}.$

In comparison to the binary treatment case, this condition is potentially much stronger and may be violated, especially in regions of low density in terms of the treatment variable. If this is the case, the sample needs to be trimmed and the DRF is estimated on the subset of observations in order to avoid extrapolation (Crump et al. 2009; Lechner and Strittmatter 2019). Lastly, one needs to assume the so-called stable-unit treatment value assumption (SUTVA, see Rubin 1980), requiring that each individual’s outcome only depends on their own level of treatment intensity. Essentially, this rules out general equilibrium and spill-over effects of treatment (see Imbens and Wooldridge 2009; Manski 2013, for examples).

2.2 Re-weighting Methods Based on the Generalized Propensity Score

Comparing outcomes of individuals with exactly the same set of X but different T to estimate the DRF quickly becomes infeasible with growing dimension of X. To avoid this curse of dimensionality, Hirano and Imbens (2004) show that the conditional independence assumption also holds by conditioning on the generalized propensity score (GPS) $R_i = f_{T\mid X}(T_i \mid X_i)$, i.e. the conditional density of the treatment intensity evaluated at $T_i$ and $X_i$. This is because, conditional on the GPS, the covariate distribution is expected to be independent of the treatment intensity received, i.e.

(3) $X_i \perp\!\!\!\perp T_i \mid R_i.$

Weighting methods – which will be the main focus of this paper – make use of this fact and re-weight observations based on their GPS. Some of the most prominent methods of this kind will be used for comparison regarding the proposed methodology. The first approach originates from inverse probability weighting (IPW, see Horvitz and Thompson 1952) and is provided by Robins, Hernán, and Brumback (2000) who generalize IPW with weights defined as

(4) $w_i = \dfrac{f_T(T_i)}{f_{T\mid X}(T_i \mid X_i)}.$

Similar to Hirano and Imbens (2004), Robins, Hernán, and Brumback (2000) estimate the (un-)conditional densities $f_T(T_i)$ and $f_{T\mid X}(T_i \mid X_i)$ based on OLS regressions and the assumption that the treatment intensity follows a normal distribution. However, such distributional assumptions are likely to be violated in practice, which may result in serious bias.[3] Other procedures, such as the machine-learning based approach by Su, Ura, and Zhang (2019), are more robust but also less intuitive and user-friendly.
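To make this weighting scheme concrete, the following sketch computes the stabilized IPW weights of Eq. (4) under the normality assumption just described, fitting the conditional model by OLS. This is a minimal illustration; the function and variable names are the author's own (this snippet is not taken from any published implementation), and the simulated data merely stand in for a real application.

```python
import numpy as np

def ipw_weights_normal(T, X):
    """Stabilized IPW weights w_i = f_T(T_i) / f_{T|X}(T_i|X_i) assuming a
    homoskedastic normal conditional treatment model fit by OLS.
    Illustrative sketch only."""
    n = len(T)
    Xc = np.column_stack([np.ones(n), X])            # add intercept
    # conditional model: T | X ~ N(X'beta, sigma^2)
    beta, *_ = np.linalg.lstsq(Xc, T, rcond=None)
    resid = T - Xc @ beta
    sigma2 = resid.var()
    f_cond = np.exp(-resid**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
    # marginal model: T ~ N(mean(T), var(T))
    tau2 = T.var()
    f_marg = np.exp(-(T - T.mean())**2 / (2 * tau2)) / np.sqrt(2 * np.pi * tau2)
    return f_marg / f_cond

# simulated confounded treatment, purely for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
T = X @ np.array([0.5, -0.3]) + rng.normal(size=500)
w = ipw_weights_normal(T, X)
```

Under correct specification, the weighted correlation between the treatment and each covariate should shrink toward zero in the re-weighted sample; as the paper stresses, this is exactly what fails when the normality assumption is violated.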

The other comparison methods combine the estimation of the GPS with algorithmic optimization in order to minimize the Pearson correlation between covariates and the treatment variable. The methods used are the parametric and the non-parametric covariate balancing generalized propensity score (CBGPS, Fong, Hazlett, and Imai 2018) as well as generalized boosted models (GBM, Zhu, Coffman, and Ghosh 2015). As will become clear in the remainder of the paper, these automated balancing approaches based on the GPS mostly improve upon covariate balance and robustness relative to standard IPW. However, these re-weighting procedures may not achieve satisfactory balance, leaving estimates susceptible to bias due to residual imbalance.

2.3 Estimation of the Dose–Response Function

Once balancing weights are obtained, the DRF can be estimated using parametric or non-parametric regression techniques based on the re-weighted sample. In general, for this to yield consistent estimates, the aforementioned identifying assumptions need to be true and balancing weights have to render the covariate distribution independent of the treatment intensity. If any of these conditions is not fulfilled, resulting estimates will be inconsistent. Moreover, there is a risk of mis-specification bias due to wrong functional form assumptions for the DRF. To avoid this, flexible parametric specifications using (fractional) polynomials (Sauerbrei and Royston 1999) or (basis) splines (de Boor 1978) in the treatment variable or completely non-parametric methods such as local linear regressions (Fan 1992) may be preferred. To provide some guidance to the reader, global polynomial regressions are compared to local linear regressions in the simulations in Section 4.
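As a concrete example of the parametric route, the sketch below fits a global cubic polynomial DRF by weighted least squares on a re-weighted sample. All names are illustrative assumptions of this sketch, not part of the paper's software; with uniform weights it reduces to ordinary polynomial regression.

```python
import numpy as np

def polynomial_drf(T, Y, w, degree=3):
    """Estimate E[Y(t)] by weighted least squares on a global polynomial
    in the treatment. Returns a callable t -> fitted dose-response.
    Illustrative sketch only."""
    P = np.vander(T, degree + 1)                 # columns [T^d, ..., T, 1]
    sw = np.sqrt(np.asarray(w, dtype=float))
    beta, *_ = np.linalg.lstsq(P * sw[:, None], Y * sw, rcond=None)
    return lambda t: np.vander(np.atleast_1d(np.asarray(t, dtype=float)),
                               degree + 1) @ beta

# toy data with a known linear DRF E[Y(t)] = 1 + 2t, for illustration
rng = np.random.default_rng(1)
T = rng.uniform(0.0, 2.0, 400)
Y = 1.0 + 2.0 * T + rng.normal(scale=0.1, size=400)
drf = polynomial_drf(T, Y, np.full(400, 1.0 / 400))
```

In practice, the weights passed in would be the balancing weights from Section 3, and a local linear smoother could replace the global polynomial to guard against functional form mis-specification.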

3 Extending the Entropy Balancing Scheme

Before deriving entropy balancing weights for continuous treatments, let us briefly review the original approach by Hainmueller (2012) for binary treatments. In the binary treatment setting, some units received a certain treatment and others did not. Most often, the goal is to estimate the average treatment effect on the treated, i.e. one wishes to impute the average outcomes treated individuals would have shown, had they not received the treatment. When estimating this counterfactual based on the selection-on-observables assumption, the entropy balancing approach by Hainmueller (2012) adjusts for covariate differences between treated and untreated units by re-weighting the $N_0$ untreated units to exactly match moments from the treatment group of size $N_1$.

For notational convenience, assume that $T_i$ is non-negative and that treated individuals have $T_i > 0$. Furthermore, let $\tilde{X}_i = f(X_i) - N_1^{-1} \sum_{i: T_i > 0} f(X_i)$, where f(⋅) denotes that the covariate vector $X_i$ may be extended via the inclusion of higher-order or interaction terms of its elements. That is, $\tilde{X}_i$ is a possibly expanded and de-meaned version of the covariate vector, where means are solely calculated over treated units. Based on these definitions, one can write the optimization problem of the entropy balancing scheme as

(5) $\min_{w} H(w) = \sum_{i: T_i = 0} h(w_i) \quad \text{s.t.} \quad \sum_{i: T_i = 0} w_i \tilde{X}_i = 0, \quad \sum_{i: T_i = 0} w_i = 1, \quad w_i > 0 \;\; \forall\, i: T_i = 0,$

where H(w) is a loss function and $w_i$ are the balancing weights. To implement the approach, Hainmueller (2012) uses the Kullback (1959) entropy metric $h(w_i) = w_i \ln(w_i/q_i)$, where $q_i$ are some base weights chosen by the analyst. Balancing weights that satisfy (5) exactly match specified covariate moments among the treated by re-weighting control units. If $f(X_i) = X_i$, then only covariate means will be balanced. If $f(X_i)$ also includes quadratic terms of covariates and interaction terms, balancing weights will additionally match their variances and correlations across groups if a solution to the optimization problem exists. Generally, if $f(X_i)$ and thus $\tilde{X}_i$ is chosen to be relatively flexible, treatment assignment can be regarded as approximately independent of covariates in the re-weighted sample such that treatment effects can be consistently estimated without further assumptions on the data-generating process beyond the previously discussed identifying assumptions.

3.1 Entropy Balancing for Continuous Treatments

To extend the entropy balancing approach to the case of continuous treatments, a few changes have to be made. First, in contrast to the binary case, the estimation of balancing weights $w_i$ is performed for the set of treated units, i.e. units i with $T_i > 0$. This is because we wish to impute average outcomes of treated units had they received certain doses of the treatment.[4] Second, the choice of moment conditions needs to be altered. Following recent developments in the automated balancing literature, entropy balancing for continuous treatments (EBCT) focuses on a set of uncorrelatedness conditions based on Pearson correlations.

Analogous to above, define $\tilde{X}_i$ to be the de-meaned version of the (possibly expanded) covariate vector. Furthermore, let $\tilde{T}_i^p$ be the de-meaned pth-order term of the treatment intensity. Lastly, define the column vector $g(p, \tilde{X}_i, \tilde{T}_i) = \left(\tilde{X}_i, \tilde{T}_i, \ldots, \tilde{T}_i^p, \tilde{X}_i\tilde{T}_i, \ldots, \tilde{X}_i\tilde{T}_i^p\right)$. For a given p, the EBCT method aims to solve the following constrained minimization problem:

(6) $\min_{w} H(w) = \sum_{i: T_i > 0} h(w_i) \quad \text{s.t.} \quad \sum_{i: T_i > 0} w_i\, g(p, \tilde{X}_i, \tilde{T}_i) = 0, \quad \sum_{i: T_i > 0} w_i = 1, \quad w_i > 0 \;\; \forall\, i: T_i > 0.$

EBCT minimizes the loss function H(w) subject to the balancing constraints in terms of $g(p, \tilde{X}_i, \tilde{T}_i)$ and the normalizing constraints that weights have to sum up to one and be strictly positive. If p = 1 is chosen, resulting weights retain unconditional means of covariates (and their higher-order or cross-moments if included) as well as of the treatment intensity, and they purge the treatment variable from its correlation with covariates. However, simple uncorrelatedness may not be sufficient to achieve acceptable covariate balance and hence, obtain consistent estimates of the DRF in the next step (Yiu and Su 2018). This is because even if $\tilde{X}_i$ is chosen to be relatively flexible, uncorrelatedness between $\tilde{T}_i$ and $\tilde{X}_i$ does not in general imply independence, i.e. the covariate distribution is not necessarily the same across the entire treatment intensity distribution. This is an issue also touched upon by Vegetabile et al. (2021), who also investigate the performance of EBCT. However, in contrast to this paper, Vegetabile et al. (2021) do not further examine the influence of rendering higher moments of $T_i$ uncorrelated with $\tilde{X}_i$ on the performance of EBCT.[5]

In order to detect potential residual confounding, it is crucial to flexibly check for remaining dependencies by estimating pseudo-DRFs using covariates (and their higher moments or interactions) as pseudo-outcomes for an additional balancing check. If covariates are truly balanced, these pseudo-DRFs should be completely flat and their derivatives should be zero. If this is not the case, p should be increased and balance re-assessed. For example, choosing p = 2 means that, in addition to the properties above, the marginal mean of $T_i^2$ is retained and $\tilde{X}_i$ is rendered uncorrelated with $T_i^2$. It should be clear that there is most likely a bias-variance trade-off here: by increasing p, one approximates independence between $T_i$ and $X_i$ more closely and thus, one is likely to reduce bias, but this may come at the cost of increased variance of estimates.

3.2 Implementing EBCT

To implement the proposed EBCT approach, this paper also uses the Kullback (1959) entropy metric $h(w_i) = w_i \ln(w_i/q_i)$. If no base weights $q_i$ are specified, uniform weights $q_i = 1/N_1$ are used for all i. This implies that EBCT chooses balancing weights such that they differ as little as possible from the baseline weights in terms of the entropy metric while achieving the zero-correlation conditions in the re-weighted sample. Notice that the loss function attains a minimum at $w_i = q_i$ for all i and is undefined for non-positive weights. The latter property makes it possible to drop the positivity constraint on weights, reducing the optimization problem to one with only equality constraints. Using the Lagrange method, the constrained optimization can be re-written as an unconstrained optimization as

(7) $\min_{w, \lambda, \gamma} L(w, \lambda, \gamma) = \sum_{i=1}^{N_1} w_i \ln(w_i/q_i) - \lambda\left(\sum_{i=1}^{N_1} w_i - 1\right) - \gamma' \sum_{i=1}^{N_1} w_i\, g\!\left(p, \tilde{X}_i, \tilde{T}_i\right),$

where λ and γ are Lagrange multipliers on the constraints. As $\partial^2 H/\partial w_i^2 > 0$ for all $w_i > 0$ and because the constraints are linear in $w_i$, the optimization problem (7) has a global minimum if the constraints are consistent (Boyd and Vandenberghe 2004, Chapter 5). In order to reduce the dimensionality of the optimization problem, the implied structure of balancing weights is obtained by re-arranging the first-order condition $\partial L/\partial w_i = 0$ and plugging the result into the condition $\partial L/\partial \lambda = 0$. This yields the weighting function in terms of the Lagrange multipliers γ, base weights $q_i$ and the data $g(p, \tilde{X}_i, \tilde{T}_i)$ as

(8) $w_i = \dfrac{q_i \exp\!\left(\gamma' g\!\left(p, \tilde{X}_i, \tilde{T}_i\right)\right)}{\sum_{i=1}^{N_1} q_i \exp\!\left(\gamma' g\!\left(p, \tilde{X}_i, \tilde{T}_i\right)\right)},$

where λ has been cancelled out. Hence, when p = 1, weights implied by EBCT are a log-linear function of a linear index containing covariates (and their transformations), the treatment intensity and their cross-products. When p > 1, the linear index also contains the higher-order terms of $\tilde{T}_i$ and their interactions with $\tilde{X}_i$. Substituting this expression into the Lagrange function yields the dual $L^d$ as

(9) $L^d(\gamma) = \ln \sum_{i=1}^{N_1} q_i \exp\!\left(\gamma' g\!\left(p, \tilde{X}_i, \tilde{T}_i\right)\right).$

Differentiating $L^d$ with respect to γ yields the first-order conditions

(10) $\dfrac{\sum_{i=1}^{N_1} \exp\!\left(\gamma^{*\prime} g\!\left(p, \tilde{X}_i, \tilde{T}_i\right)\right) g\!\left(p, \tilde{X}_i, \tilde{T}_i\right)}{\sum_{i=1}^{N_1} \exp\!\left(\gamma^{*\prime} g\!\left(p, \tilde{X}_i, \tilde{T}_i\right)\right)} = 0,$

where γ* refers to the multiplier values at the optimum.[6] As the conditions in Eq. (10) are non-linear in those multipliers, they have to be solved for numerically. This is done using a quasi-Newton optimization approach. Due to the convexity of the optimization problem, the algorithm tends to converge relatively quickly even in large datasets. Once values for γ* are obtained, balancing weights are backed out using (8) for subsequent analysis. As noted by Hainmueller (2012), optimization can be performed iteratively to limit the influence of units with potentially extreme weights. To do so, the researcher estimates EBCT weights and truncates excessive weights beyond some threshold. For the binary treatment case, Imbens (2004) suggests a value of 5%. For the estimation of causal effects in the continuous treatment case, the threshold should probably be substantially smaller, e.g. 1–2%, in order to avoid an overly large influence of single observations on the estimated shape of the DRF. Having been truncated to some threshold, the weights are used as base weights for a second run of the algorithm. The resulting weights should then display a smaller maximum weight.
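The dual approach described above can be sketched compactly: build the moment vector $g(p, \tilde{X}_i, \tilde{T}_i)$, minimize the dual (9) over γ with a quasi-Newton method, and back out weights via (8). The sketch below does this with SciPy's BFGS routine; it is a minimal illustration under uniform base weights, not the official Stata/R implementation, and all names are the author's own.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def ebct_weights(T, X, p=2, base=None):
    """Sketch of EBCT: minimize the dual (9) over gamma and substitute
    the solution into the weighting function (8)."""
    n = len(T)
    q = np.full(n, 1.0 / n) if base is None else np.asarray(base) / np.sum(base)
    Xd = X - X.mean(axis=0)                               # de-meaned covariates
    Tp = np.column_stack([T**k for k in range(1, p + 1)])
    Tp = Tp - Tp.mean(axis=0)                             # de-meaned powers of T
    inter = np.hstack([Xd * Tp[:, [k]] for k in range(p)])
    G = np.hstack([Xd, Tp, inter])                        # rows g(p, X~_i, T~_i)

    def dual_and_grad(gamma):
        z = np.log(q) + G @ gamma
        val = logsumexp(z)                                # stable L^d(gamma)
        w = np.exp(z - val)                               # weights from (8)
        return val, G.T @ w                               # gradient = weighted moments

    res = minimize(dual_and_grad, np.zeros(G.shape[1]), jac=True, method="BFGS")
    z = np.log(q) + G @ res.x
    return np.exp(z - logsumexp(z))

# simulated confounded treatment, purely for illustration
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
T = 3.0 + 0.8 * X[:, 0] + rng.normal(size=300)
w = ebct_weights(T, X, p=2)
```

At the optimum the gradient of the dual equals the vector of weighted balance violations, so a converged solution satisfies the zero-correlation constraints of (6) up to numerical tolerance.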

4 Re-Analyzing the 1987 National Medical Expenditure Data

In this section, EBCT and comparison weighting methods are applied to the analysis of a subset of the 1987 National Medical Expenditure Survey (NMES) data, originally analyzed by Johnson et al. (2003), in order to inspect the relative performance of the different weighting methods. EBCT weights will be estimated for p = 1, 2 and 3. Comparison methods comprise the previously described approaches: Inverse Probability Weighting (IPW, Robins, Hernán, and Brumback 2000) – which estimates the GPS using OLS regression similar to Hirano and Imbens (2004) – as well as published balance-optimizing re-weighting algorithms that are readily available in R. The latter consist of the parametric and the non-parametric Covariate Balancing Generalized Propensity Score (CBGPS and npCBGPS, Fong, Hazlett, and Imai 2018) and the machine-learning based Generalized Boosted Modeling (GBM, Zhu, Coffman, and Ghosh 2015).[7] Methods that do not estimate balancing weights, such as Kennedy et al. (2017) or Wu et al. (2020), are not used in order to be able to use a common balancing criterion in the analysis. A more holistic analysis of the finite-sample properties of estimators is left for future research.

4.1 The Data

The 1987 NMES data includes information on respondents' smoking behavior, their medical expenditures and some background characteristics. For the analysis, N = 9,408 current or previous smokers are used.[8] For them, the number of pack years smoked (= smoking duration in years × cigarette packs smoked per day) is generated as a measure of cumulated exposure to smoking. Available background characteristics are the continuous (starting) age, a male indicator and categorical variables on race, seatbelt usage, education, marital status, census region of residence and poverty status. The treatment variable is taken in logarithmic terms to reduce the skewness of the distribution and make the normality assumption of IPW and CBGPS more plausible. A histogram of the resulting distribution can be found in Figure 1. In general, the data are characterized by relatively strong correlations between the treatment variable and covariates, especially the (starting) age. As second- and third-order terms of (starting) age are strongly related to the treatment intensity, they are also included in the specification for the estimation of balancing weights.

Figure 1: Histogram of the treatment intensity.

This graph shows a histogram of the treatment intensity t = ln(pack years).

4.2 Estimating Weights and Inspecting Correlations

First, balancing weights are estimated. Computation times differ substantially, from below 1 s for EBCT with p = 1 up to 17 min for npCBGPS.[9] Having estimated the weights, the next step is to check whether the (automated) balancing algorithms did what they were designed to do: purge the treatment variable from its correlation with covariates. Table 1 gives an overview of (mean absolute) Pearson correlations before and after weighting for all control variables and their transformations used.
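The balance metric reported in Table 1 is the weighted Pearson correlation between each covariate and the treatment in the re-weighted sample. A minimal helper computing it might look as follows; the function name is illustrative, not taken from the paper's software.

```python
import numpy as np

def weighted_pearson(x, t, w):
    """Weighted Pearson correlation between a covariate x and the
    treatment t under balancing weights w. Illustrative sketch."""
    w = np.asarray(w, dtype=float) / np.sum(w)
    xm = x - np.sum(w * x)                        # weighted de-meaning
    tm = t - np.sum(w * t)
    return np.sum(w * xm * tm) / np.sqrt(np.sum(w * xm**2) * np.sum(w * tm**2))
```

Applied with the estimated weights of each method in turn, this reproduces the kind of before/after comparison shown in Table 1.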

Table 1:

Balancing quality – (weighted) Pearson correlations.

| Covariate | Unweighted | IPW | CBGPS | npCBGPS | GBM | EBCT (p = 1) | EBCT (p = 2) | EBCT (p = 3) |
|---|---|---|---|---|---|---|---|---|
| Starting age: linear | −0.14 | 0.20 | −0.07 | −0.06 | −0.08 | 0.00 | 0.00 | 0.00 |
| Starting age: squared | −0.13 | 0.19 | −0.04 | −0.05 | −0.07 | 0.00 | 0.00 | 0.00 |
| Starting age: cubed | −0.11 | 0.16 | −0.02 | −0.04 | −0.06 | 0.00 | 0.00 | 0.00 |
| Age: linear | 0.47 | −0.34 | 0.06 | 0.07 | 0.01 | 0.00 | 0.00 | 0.00 |
| Age: squared | 0.43 | −0.32 | 0.06 | 0.07 | 0.01 | 0.00 | 0.00 | 0.00 |
| Age: cubed | 0.38 | −0.29 | 0.05 | 0.08 | 0.01 | 0.00 | 0.00 | 0.00 |
| Sex: male | 0.14 | 0.23 | 0.04 | 0.20 | 0.14 | 0.00 | 0.00 | 0.00 |
| Race: black | −0.14 | 0.19 | −0.01 | −0.06 | −0.01 | 0.00 | 0.00 | 0.00 |
| Race: other | 0.19 | −0.24 | 0.02 | −0.10 | 0.03 | 0.00 | 0.00 | 0.00 |
| Seatbelt usage: sometimes | −0.01 | 0.20 | −0.03 | −0.03 | 0.05 | 0.00 | 0.00 | 0.00 |
| Seatbelt usage: often | −0.04 | −0.18 | 0.05 | 0.02 | −0.11 | 0.00 | 0.00 | 0.00 |
| Education: high school | 0.01 | −0.04 | 0.02 | 0.10 | 0.06 | 0.00 | 0.00 | 0.00 |
| Education: some college | −0.04 | −0.02 | −0.03 | −0.01 | 0.04 | 0.00 | 0.00 | 0.00 |
| Education: college degree | 0.11 | −0.03 | 0.00 | −0.10 | −0.01 | 0.00 | 0.00 | 0.00 |
| Education: other | 0.11 | −0.18 | −0.01 | 0.02 | −0.02 | 0.00 | 0.00 | 0.00 |
| Marital status: widowed | 0.04 | 0.07 | −0.02 | 0.13 | 0.05 | 0.00 | 0.00 | 0.00 |
| Marital status: separated | −0.02 | 0.11 | −0.02 | −0.03 | 0.04 | 0.00 | 0.00 | 0.00 |
| Marital status: never married | −0.26 | 0.17 | 0.07 | −0.09 | −0.01 | 0.00 | 0.00 | 0.00 |
| Census region: mid-west | 0.02 | 0.16 | −0.03 | 0.07 | 0.05 | 0.00 | 0.00 | 0.00 |
| Census region: south | −0.02 | −0.20 | 0.05 | −0.07 | −0.08 | 0.00 | 0.00 | 0.00 |
| Census region: west | −0.01 | 0.15 | −0.02 | −0.02 | 0.01 | 0.00 | 0.00 | 0.00 |
| Poverty status: poor | −0.02 | −0.03 | −0.02 | 0.10 | −0.01 | 0.00 | 0.00 | 0.00 |
| Poverty status: low income | −0.01 | −0.04 | −0.01 | −0.09 | 0.00 | 0.00 | 0.00 | 0.00 |
| Poverty status: middle income | 0.00 | −0.15 | −0.03 | 0.10 | −0.01 | 0.00 | 0.00 | 0.00 |
| Poverty status: high income | 0.03 | 0.18 | −0.02 | −0.02 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean absolute correlation | 0.12 | 0.16 | 0.03 | 0.08 | 0.04 | 0.00 | 0.00 | 0.00 |
| Maximum weight in % | – | 7.1 | 12.9 | 6.6 | 0.7 | 0.4 | 0.8 | 1.4 |
  1. The table shows (mean absolute) Pearson correlations between the treatment variable t = ln(pack years) and covariates in the raw sample as well as in the re-weighted samples. Re-weighting approaches employed are inverse probability weighting estimated via OLS (IPW, see Robins, Hernán, and Brumback 2000), (non-) parametric covariate balancing generalized propensity scores (np-/CBGPS, see Fong, Hazlett, and Imai 2018), generalized boosted modeling (GBM, see Zhu, Coffman, and Ghosh 2015) as well as the novel entropy balancing for continuous treatments (EBCT). The parameter p shows up to which power the treatment variable has been rendered uncorrelated with covariates.

Before weighting there is a substantial absolute correlation between ln(pack years) and the polynomials in age (38–47%). Correlations with the other covariates are smaller in magnitude but often substantive nonetheless. Compared to the unweighted sample, IPW increases the mean absolute correlation between covariates and the smoking intensity from 12 to 16%, leading to higher correlations in magnitude for several variables. Clearly, this is a very unfavorable balancing outcome. The parametric CBGPS reduces correlations for almost all variables or leads to slight increases in correlations for variables that were initially almost completely uncorrelated with the treatment. Its non-parametric counterpart is about as successful in reducing absolute correlations for covariates that were initially heavily related to the treatment. However, it leads to larger increases in imbalance for variables with lower initial correlations. For example, the correlation between the male indicator and the treatment increases from 14 to 20% after weighting using the npCBGPS. GBM is almost as effective in alleviating correlations between covariates and the treatment as the parametric CBGPS. Only the correlation between the male indicator and the treatment of 14% remains above the 0.1 threshold suggested by Zhu, Coffman, and Ghosh (2015). EBCT eradicates all correlations of the smoking intensity with covariates in this application, independent of which polynomial order p is used to obtain the balancing weights. Comparing maximum weight shares, one can see that EBCT with p = 1 yields the lowest value with 0.4%, compared to 0.7% (GBM), 0.8% (EBCT, p = 2), 1.4% (EBCT, p = 3), 6.6% (npCBGPS), 7.1% (IPW) and almost 13% (CBGPS). Hence, especially the parametric CBGPS method achieves better balance only by allowing for relatively extreme weights in this setting.

4.3 Balance Checks via Pseudo-DRFs

As mentioned earlier, uncorrelatedness does not necessarily imply the equality of covariate distributions across different levels of the treatment variable. Hence, it is crucial to check for potentially remaining imbalances before estimating the DRF to avoid bias in the resulting estimates. To inspect balance further, the following analysis estimates pseudo-DRFs using linear regressions of each covariate on a fourth-order polynomial in the treatment variable. This is similar in spirit to the procedure suggested by Smith and Todd (2005) in the binary treatment setting. Figure 2 shows box-plots of the regression R² for all covariates used in the estimation of balancing weights as well as all sensible two-way interactions. If covariates are truly balanced, the pseudo-DRFs should be completely flat and thus, R² should be close to zero after weighting.
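This balance check is easy to implement: regress each covariate, used as a pseudo-outcome, on a fourth-order polynomial in the treatment and record the (weighted) R². The sketch below illustrates the idea with simulated data; the function name and toy variables are the author's own assumptions, not part of the paper's software.

```python
import numpy as np

def pseudo_drf_r2(x, T, w, degree=4):
    """Weighted R^2 from regressing a covariate (pseudo-outcome) on a
    fourth-order polynomial in the treatment, mirroring the balance
    check behind Figure 2. Illustrative sketch."""
    w = np.asarray(w, dtype=float) / np.sum(w)
    P = np.vander(T, degree + 1)                  # columns [T^4, ..., T, 1]
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(P * sw[:, None], x * sw, rcond=None)
    resid = x - P @ beta
    ss_res = np.sum(w * resid**2)
    ss_tot = np.sum(w * (x - np.sum(w * x))**2)
    return 1.0 - ss_res / ss_tot

# two toy pseudo-outcomes: one balanced, one with a lingering
# non-linear association that simple correlations would miss
rng = np.random.default_rng(3)
T = rng.uniform(0.1, 2.0, 500)
balanced = rng.normal(size=500)
imbalanced = T**2 + 0.05 * rng.normal(size=500)
u = np.ones(500)
```

A near-zero R² indicates a flat pseudo-DRF (good balance), while a large R², as for the quadratic pseudo-outcome above, flags exactly the kind of non-linear residual imbalance discussed in the text.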

Figure 2:
Balancing quality – regression R².
This graph shows boxplots of R² from a regression of each covariate and all sensible two-way interactions on a fourth-order polynomial of the treatment intensity t = ln(pack years) before and after weighting. Re-weighting approaches employed are inverse probability weighting estimated via OLS (IPW, see Robins, Hernán, and Brumback 2000), (non-)parametric covariate balancing generalized propensity scores (np-/CBGPS, see Fong, Hazlett, and Imai 2018), generalized boosted modeling (GBM, see Zhu, Coffman, and Ghosh 2015) as well as the novel entropy balancing for continuous treatments (EBCT). The parameter p shows up to which power the treatment variable has been rendered uncorrelated with covariates.

As Figure 2 reveals, this is the case for the majority of variables. However, all methods that focus on simple uncorrelatedness as their balancing target display unwanted residual imbalances due to lingering non-linear relationships between covariates and the treatment intensity.[10] This is despite the relatively flexible specification used for the estimation of balancing weights and shows the importance of checking for covariate balance beyond simple correlations before analyzing outcomes. Increasing the polynomial degree p for EBCT drastically improves balance; for p = 3, only negligible regression R² values remain after balancing.

4.4 Bias and Mean Squared Error

This section provides evidence on the expected performance of effect estimates in terms of (absolute) bias and root mean squared error using the different balancing approaches considered. Similar to the approach by Frölich, Huber, and Wiesenfarth (2017), empirical Monte-Carlo simulations are performed based on real data.[11] For this analysis, the observed medical expenditures of the N = 9,408 current or previous smokers are replaced by matched outcomes from the group of the N = 9,804 non-smokers also available in the dataset. More specifically, the smokers and non-smokers are matched using non-parametric nearest-neighbor matching with replacement based on the Mahalanobis distance metric. The list of variables to match on includes the same set of variables as described above, except for starting age. The great advantage of this approach is that, due to its non-parametric nature, the resulting dataset features realistic (and possibly complex) functional relationships between covariates and the matched outcome. As the matched medical expenditures stem from non-smokers, they are unaffected by the smoking intensity, such that the CIA holds and estimates should display completely flat DRFs. This allows inspecting the performance of estimators in a realistic setting. For the simulations, R = 200 samples of size N = 9,408 are drawn with replacement from the original data. For each of the samples, balancing weights are re-estimated. Outcome regressions are implemented via weighted least squares regression using a cubic polynomial in ln(pack years) as well as through weighted local linear kernel regressions based on the Epanechnikov kernel (Fan 1992). For the kernel regression, the bandwidth is chosen in a data-driven manner via cross-validation as implemented by the locpol package (Cabrera 2018).
From these estimates, the dose–response functions E[Y i (t)] are obtained as predicted values for an evenly-spaced grid of values of the treatment variable between its first and 99th percentile. Table 2 shows estimates of absolute bias and root mean squared error (RMSE), averaged over all grid values.
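The evaluation grid and the aggregation of bias and RMSE over grid values can be sketched as follows. `evaluation_grid` and `bias_rmse` are hypothetical helper names, and the percentile rule is a simple nearest-rank approximation:

```python
import math

def evaluation_grid(t, n_points=50):
    """Evenly spaced grid between the 1st and 99th percentile of treatment t
    (simple nearest-rank percentile rule for illustration)."""
    s = sorted(t)
    lo = s[round(0.01 * (len(s) - 1))]
    hi = s[round(0.99 * (len(s) - 1))]
    step = (hi - lo) / (n_points - 1)
    return [lo + i * step for i in range(n_points)]

def bias_rmse(estimates, truth):
    """Average absolute bias and RMSE over grid points, given a list of
    per-replication DRF estimates (each a list over the grid values)."""
    R = len(estimates)
    G = len(truth)
    bias = [sum(est[g] for est in estimates) / R - truth[g] for g in range(G)]
    mse = [sum((est[g] - truth[g]) ** 2 for est in estimates) / R for g in range(G)]
    abs_bias = sum(abs(b) for b in bias) / G
    rmse = sum(math.sqrt(m) for m in mse) / G
    return abs_bias, rmse
```

In the paper's design the "truth" is a flat DRF, since the matched outcomes are unaffected by smoking intensity, so any estimated slope is pure bias.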

Table 2:

Empirical Monte-Carlo simulation: average absolute bias and root mean squared error.

                           Unweighted    IPW  CBGPS  npCBGPS    GBM  EBCT p=1  EBCT p=2  EBCT p=3
Global polynomial regression
  Absolute bias                 270.1  406.9  272.0    172.3  223.4     226.3      85.8     112.5
  Root mean squared error       279.3  581.2  883.0    389.9  273.8     267.2     143.8     199.8
Local linear regression
  Absolute bias                 259.3  244.2  210.2    161.2  169.4     185.1      93.7     118.6
  Root mean squared error       267.0  389.3  658.0    376.8  225.6     241.2     152.7     206.5
  1. The table shows results from the empirical Monte-Carlo simulation. Displayed are the estimated absolute bias as well as root mean squared error based on 200 replications. For the simulation, observed outcomes of smokers are replaced by matched outcomes from non-smokers. Estimates of the DRF are obtained via weighted linear regression using a third-order polynomial in the treatment variable as well as via weighted local linear kernel regression based on the Epanechnikov kernel with bandwidths chosen via cross-validation. Estimates are averaged over the evenly-spaced grid-values in terms of the treatment variable. Re-weighting approaches employed are inverse probability weighting estimated via OLS (IPW, see Robins, Hernán, and Brumback 2000), (non-) parametric covariate balancing generalized propensity scores (np-/CBGPS, see Fong, Hazlett, and Imai 2018), generalized boosted modeling (GBM, see Zhu, Coffman, and Ghosh 2015) as well as the novel entropy balancing for continuous treatments (EBCT). The parameter p shows up to which power the treatment variable has been rendered uncorrelated with covariates.

The simulation results show that IPW and the parametric CBGPS barely lead to any gains in terms of bias and yield an increase in RMSE relative to the unweighted scenario. When using the polynomial regression, they even increase bias and RMSE compared to the unweighted regression. The non-parametric CBGPS, on the other hand, reduces bias substantially but leads to an increase in RMSE, implying relatively large variance of estimates. GBM and EBCT with p = 1 perform similarly but still display non-negligible bias. Only when EBCT also renders higher moments of the treatment variable uncorrelated with covariates do bias and RMSE drop substantially. Moreover, the results show that bias and RMSE are reduced by switching from parametric to non-parametric estimation of the DRF in most cases. When using EBCT with p > 1, DRF estimates using local linear regressions display slightly larger bias and RMSE than those using a global polynomial regression. Thus, it appears that non-parametric estimation of the DRF is able to mitigate the effects of residual confounding on resulting estimates, at least to some degree.

To inspect the performance of the estimators further, Figure 3 displays estimates of the bias as a function of the treatment variable. To indicate variability of estimates, 95% normal confidence bands are shown based on the Monte-Carlo error (Koehler, Brown, and Haneuse 2009). For the sake of brevity, only the results for non-parametric DRF estimates are shown.
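The normal confidence bands based on the Monte-Carlo error (Koehler, Brown, and Haneuse 2009) follow from the standard error of the mean across replications. A minimal sketch at a single grid value, assuming a 95% normal band around the estimated bias:

```python
import math

def mc_confidence_band(estimates_at_t, truth_at_t):
    """Bias at one grid value with a 95% normal band based on the
    Monte-Carlo standard error of the mean across replications."""
    R = len(estimates_at_t)
    mean = sum(estimates_at_t) / R
    bias = mean - truth_at_t
    var = sum((e - mean) ** 2 for e in estimates_at_t) / (R - 1)
    mc_se = math.sqrt(var / R)          # Monte-Carlo error of the mean estimate
    return bias - 1.96 * mc_se, bias + 1.96 * mc_se
```

A bias estimate is then "insignificant" in the sense used below when this band covers zero.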

Figure 3:
Empirical Monte-Carlo simulation: bias in detail.
The graph shows further results from the empirical Monte-Carlo simulation based on 200 replications. Displayed are the estimated bias as well as 95% normal confidence intervals based on the Monte-Carlo error as a function of the treatment variable t = ln(pack years). For the simulation, observed outcomes of smokers are replaced by matched outcomes from non-smokers. Estimates shown are obtained via weighted local linear kernel regression based on the Epanechnikov kernel with bandwidths chosen via cross-validation. Re-weighting approaches employed are inverse probability weighting estimated via OLS (IPW, see Robins, Hernán, and Brumback 2000), (non-)parametric covariate balancing generalized propensity scores (np-/CBGPS, see Fong, Hazlett, and Imai 2018), generalized boosted modeling (GBM, see Zhu, Coffman, and Ghosh 2015) as well as the novel entropy balancing for continuous treatments (EBCT). The parameter p shows up to which power the treatment variable has been rendered uncorrelated with covariates.

Figure 3 shows that in the unweighted dataset, naive DRF estimates are significantly biased downward in the lower tail and upward in the upper tail of the treatment intensity distribution. While IPW and (np-)CBGPS remove some of that bias, they display excessive variance in at least some parts of the distribution of the treatment variable. GBM and EBCT with p = 1 have much lower variance but still display a significant upward bias in the upper tail. Only when higher moments of T_i are also rendered uncorrelated using EBCT does the bias become insignificant. Comparing the width of confidence intervals for EBCT with p = 2 and p = 3, one can see an increase in variance, mainly in the tails, when increasing the degree of the polynomial in T_i to be balanced.

4.5 Estimating the Effects of Smoking

This section estimates the actual DRF of the smoking intensity with respect to observed medical expenditures. Due to the relatively poor performance of IPW and (np-)CBGPS in the simulations, estimates based on those weighting procedures are not presented in the main text; all results can be found in Figures A.1 and A.2 in the Appendix. To obtain standard errors, the bootstrap (Efron and Tibshirani 1986; MacKinnon 2006) is used, as Vegetabile et al. (2021) find that it performs well in combination with EBCT. Standard errors are estimated using 200 bootstrap replications, with the balancing weights (and the kernel bandwidth) re-estimated in each replication.[12] Resulting estimates of the DRFs along with 95% confidence bands based on the normal approximation are displayed in Figure 4.
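The bootstrap procedure described above resamples units with replacement and re-runs the full estimation pipeline in each replication. A schematic stdlib-only sketch, where `fit_drf` is a hypothetical callable standing in for that pipeline (weight estimation, bandwidth choice, and DRF fit at one grid value):

```python
import math
import random

def bootstrap_se(data, fit_drf, R=200, seed=1):
    """Bootstrap standard error of a scalar DRF estimate; fit_drf is assumed
    to re-estimate balancing weights (and any bandwidth) on each resample."""
    n = len(data)
    rng = random.Random(seed)
    draws = []
    for _ in range(R):
        sample = [data[rng.randrange(n)] for _ in range(n)]  # resample with replacement
        draws.append(fit_drf(sample))
    mean = sum(draws) / R
    return math.sqrt(sum((d - mean) ** 2 for d in draws) / (R - 1))
```

Repeating this at every grid value yields the pointwise normal confidence bands shown in Figure 4.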

Figure 4:
Effects of smoking on medical expenditures.
This graph shows the estimated dose–response function (DRF) between log(pack years) and medical expenditures based on either weighted linear regressions using a cubic specification or weighted local linear kernel regressions based on the Epanechnikov kernel with bandwidths chosen via cross-validation. Results are only shown for generalized boosted modeling (GBM, see Zhu, Coffman, and Ghosh 2015) as well as the novel entropy balancing for continuous treatments (EBCT). The full set of results can be found in Figures A.1 and A.2. The shaded grey areas represent 95% normal confidence intervals of the DRF based on bootstrapped standard errors obtained using R = 200 replications.

Overall, differences in point estimates of the DRF between global polynomial and local linear regressions are small, although confidence bands are somewhat wider for the non-parametric estimates. All estimates show an increase in medical expenditures at high smoking intensities. While GBM and EBCT with p = 1 show a slight decline in medical expenditures at low treatment intensities, estimates based on EBCT with p > 1 no longer display such a pattern. Hence, also rendering higher orders of the treatment intensity uncorrelated with covariates yields the most plausible estimates of the DRF, as smoking can be expected to have non-negative effects on medical expenditures.

5 Conclusions

This paper introduces EBCT, a user-friendly and flexible re-weighting procedure that can be used to estimate causal effects in the context of continuous treatments. Owing to the globally convex optimization problem solved by EBCT, it reliably eradicates correlations between covariates and the treatment intensity. As uncorrelatedness does not guarantee that the covariate distribution is independent of the treatment variable after re-weighting, EBCT can also render higher moments of the treatment variable uncorrelated with covariates to mitigate this issue. Indeed, the empirical Monte-Carlo simulation shows that this feature may be crucial to insure against potential biases due to residual confounding and to reduce the mean squared error of dose–response estimates. To ease application of the proposed method, software implementations are readily available for Stata and R. To help researchers make informed choices for the application at hand, future research should provide a broader analysis of the finite-sample performance of available methods for the estimation of DRFs under different data situations.


Corresponding author: Stefan Tübbicke, Institute for Employment Research (IAB), Regensburger Str. 104, Nuremberg 90478, Germany, E-mail:

Acknowledgments

The author would like to thank Marco Caliendo, Guido Imbens, Martin Lange, Cosima Obst and Sylvi Rzepka, participants of the 2019 annual conference of the European Economic Association, the 2020 CFE-CMStatistics conference, the editor and an anonymous reviewer for helpful comments.

Appendix
Figure A.1:
Effects of smoking on medical expenditures – non-parametric estimates.
This graph shows the estimated dose–response function (DRF) between log(pack years) and medical expenditures based on local linear kernel regressions with an Epanechnikov kernel. The kernel bandwidth has been chosen via cross-validation. Re-weighting approaches employed are inverse probability weighting estimated via OLS (IPW, see Robins, Hernán, and Brumback 2000), (non-)parametric covariate balancing generalized propensity scores (np-/CBGPS, see Fong, Hazlett, and Imai 2018), generalized boosted modeling (GBM, see Zhu, Coffman, and Ghosh 2015) as well as the novel entropy balancing for continuous treatments (EBCT). The shaded grey areas represent 95% normal confidence intervals of the DRF based on bootstrapped standard errors obtained using R = 200 replications.

Figure A.2:
Effects of smoking on medical expenditures – parametric estimates.
This graph shows the estimated dose–response function (DRF) between log(pack years) and medical expenditures based on weighted least squares regressions using a cubic specification. Re-weighting approaches employed are inverse probability weighting estimated via OLS (IPW, see Robins, Hernán, and Brumback 2000), (non-)parametric covariate balancing generalized propensity scores (np-/CBGPS, see Fong, Hazlett, and Imai 2018), generalized boosted modeling (GBM, see Zhu, Coffman, and Ghosh 2015) as well as the novel entropy balancing for continuous treatments (EBCT). The shaded grey areas represent 95% normal confidence intervals of the DRF based on bootstrapped standard errors obtained using R = 200 replications.

References

Ai, C., O. Linton, and Z. Zhang. 2021. “Estimation and Inference for the Counterfactual Distribution and Quantile Functions in Continuous Treatment Models.” Journal of Econometrics 12 (3): 779–816. https://doi.org/10.1016/j.jeconom.2020.12.009.

Becker, S. O., P. H. Egger, and M. von Ehrlich. 2012. “Too Much of a Good Thing? On the Growth Effects of the EU’s Regional Policy.” European Economic Review 56 (4): 648–68. https://doi.org/10.1016/j.euroecorev.2012.03.001.

Bia, M., and A. Mattei. 2012. “Assessing the Effect of the Amount of Financial Aids to Piedmont Firms Using the Generalized Propensity Score.” Statistical Methods and Applications 21 (4): 485–516. https://doi.org/10.1007/s10260-012-0193-4.

de Boor, C. 1978. A Practical Guide to Splines, vol. 27. New York: Springer-Verlag. https://doi.org/10.1007/978-1-4612-6333-3.

Boyd, S., and L. Vandenberghe. 2004. Convex Optimization. Cambridge, UK: Cambridge University Press. https://doi.org/10.1017/CBO9780511804441.

Cabrera, J. L. O. 2018. locpol: Kernel Local Polynomial Regression. R package version 0.7-0.

Callaway, B., and W. Huang. 2020. “Distributional Effects of a Continuous Treatment with an Application on Intergenerational Mobility.” Oxford Bulletin of Economics & Statistics 82 (4): 808–42. https://doi.org/10.1111/obes.12355.

Choe, C., A. Flores-Lagunes, and S.-J. Lee. 2015. “Do Dropouts with Longer Training Exposure Benefit from Training Programs? Korean Evidence Employing Methods for Continuous Treatments.” Empirical Economics 48 (2): 849–81. https://doi.org/10.1007/s00181-014-0805-y.

Colangelo, K., and Y.-Y. Lee. 2020. Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments. arXiv Papers 2004.03036. arXiv.org.

Crump, R., V. J. Hotz, G. W. Imbens, and O. A. Mitnik. 2009. “Dealing with Limited Overlap in Estimation of Average Treatment Effects.” Biometrika 96 (1): 187–99. https://doi.org/10.1093/biomet/asn055.

Efron, B., and R. Tibshirani. 1986. “Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy.” Statistical Science 1 (1): 54–75. https://doi.org/10.1214/ss/1177013815.

Fan, J. 1992. “Design-Adaptive Nonparametric Regression.” Journal of the American Statistical Association 87: 998–1004. https://doi.org/10.1080/01621459.1992.10476255.

Flores, C. A., A. Flores-Lagunes, A. Gonzalez, and T. C. Neumann. 2012. “Estimating the Effects of Length of Exposure to Instruction in a Training Program: The Case of Job Corps.” The Review of Economics and Statistics 94 (1): 153–71. https://doi.org/10.1162/rest_a_00177.

Fong, C., C. Hazlett, and K. Imai. 2018. “Covariate Balancing Propensity Score for a Continuous Treatment: Application to the Efficacy of Political Advertisements.” Annals of Applied Statistics 12 (1): 156–77. https://doi.org/10.1214/17-aoas1101.

Frölich, M., M. Huber, and M. Wiesenfarth. 2017. “The Finite Sample Performance of Semi- and Non-parametric Estimators for Treatment Effects and Policy Evaluation.” Computational Statistics & Data Analysis 115 (C): 91–102. https://doi.org/10.1016/j.csda.2017.05.007.

Galdo, J., and A. Chong. 2012. “Does the Quality of Public-Sponsored Training Programs Matter? Evidence from Bidding Processes Data.” Labour Economics 19 (6): 970–86. https://doi.org/10.1016/j.labeco.2012.08.001.

Graham, D. J., E. J. McCoy, and D. A. Stephens. 2015. Doubly Robust Dose-Response Estimation for Continuous Treatments via Generalized Propensity Score Augmented Outcome Regression. arXiv Papers 1506.04991. arXiv.org.

Greifer, N. 2020. WeightIt: Weighting for Covariate Balance in Observational Studies. R package version 0.9.0.

Hainmueller, J. 2012. “Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies.” Political Analysis 20 (1): 25–46. https://doi.org/10.1093/pan/mpr025.

Heckman, J. J., and R. Robb. 1985. “Alternative Methods for Evaluating the Impact of Interventions: An Overview.” Journal of Econometrics 30 (1): 239–67. https://doi.org/10.1016/0304-4076(85)90139-3.

Hirano, K., and G. W. Imbens. 2004. “The Propensity Score with Continuous Treatments.” In Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives: An Essential Journey with Donald Rubin’s Statistical Family, edited by Gelman, A. and Meng, X.-L., pp. 73–84. Chichester: John Wiley & Sons. https://doi.org/10.1002/0470090456.ch7.

Horvitz, D. G., and D. J. Thompson. 1952. “A Generalization of Sampling without Replacement from a Finite Universe.” Journal of the American Statistical Association 47 (260): 663–85. https://doi.org/10.1080/01621459.1952.10483446.

Huber, M., M. Lechner, and C. Wunsch. 2013. “The Performance of Estimators Based on the Propensity Score.” Journal of Econometrics 175 (1): 1–21. https://doi.org/10.1016/j.jeconom.2012.11.006.

Huber, M., Y.-C. Hsu, Y.-Y. Lee, and L. Lettry. 2020. “Direct and Indirect Effects of Continuous Treatments Based on Generalized Propensity Score Weighting.” Journal of Applied Econometrics 35 (7): 814–40. https://doi.org/10.1002/jae.2765.

Imai, K., and D. A. van Dyk. 2004. “Causal Inference with General Treatment Regimes: Generalizing the Propensity Score.” Journal of the American Statistical Association 99 (467): 854–66. https://doi.org/10.1198/016214504000001187.

Imbens, G. 2004. “Nonparametric Estimation of Average Treatment Effects under Exogeneity: A Review.” The Review of Economics and Statistics 86 (1): 4–29. https://doi.org/10.1162/003465304323023651.

Imbens, G. W. 2000. “The Role of the Propensity Score in Estimating Dose-Response Functions.” Biometrika 87 (3): 706–10. https://doi.org/10.1093/biomet/87.3.706.

Imbens, G. W., and J. M. Wooldridge. 2009. “Recent Developments in the Econometrics of Program Evaluation.” Journal of Economic Literature 47 (1): 5–86. https://doi.org/10.1257/jel.47.1.5.

Johnson, E., F. Dominici, M. Griswold, and S. L. Zeger. 2003. “Disease Cases and Their Medical Costs Attributable to Smoking: An Analysis of the National Medical Expenditure Survey.” Journal of Econometrics 112 (1): 135–51. https://doi.org/10.1016/s0304-4076(02)00157-4.

Kennedy, E. H., Z. Ma, M. D. McHugh, and D. S. Small. 2017. “Non-Parametric Methods for Doubly Robust Estimation of Continuous Treatment Effects.” Journal of the Royal Statistical Society: Series B 79 (4): 1229–45. https://doi.org/10.1111/rssb.12212.

Kluve, J., H. Schneider, A. Uhlendorff, and Z. Zhao. 2012. “Evaluating Continuous Training Programs Using the Generalized Propensity Score.” Journal of the Royal Statistical Society: Series A 175 (2): 587–617. https://doi.org/10.1111/j.1467-985x.2011.01000.x.

Koehler, E., E. Brown, and S. J.-P. Haneuse. 2009. “On the Assessment of Monte Carlo Error in Simulation-Based Statistical Analyses.” The American Statistician 63 (2): 155–62. https://doi.org/10.1198/tast.2009.0030.

Kreif, N., R. Grieve, I. Díaz, and D. Harrison. 2015. “Evaluation of the Effect of a Continuous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain Injury.” Health Economics 24 (9): 1213–28. https://doi.org/10.1002/hec.3189.

Kullback, S. 1959. Information Theory and Statistics. Chichester: John Wiley & Sons.

Lechner, M. 2001. “Identification and Estimation of Causal Effects of Multiple Treatments under the Conditional Independence Assumption.” In Econometric Evaluation of Labour Market Policies, edited by Lechner, M. and Pfeiffer, F., pp. 43–58. Heidelberg: Physica-Verlag HD. https://doi.org/10.1007/978-3-642-57615-7_3.

Lechner, M., and A. Strittmatter. 2019. “Practical Procedures to Deal with Common Support Problems in Matching Estimation.” Econometric Reviews 38 (2): 193–207. https://doi.org/10.1080/07474938.2017.1318509.

Li, J., and M. W. Fraser. 2015. “Evaluating Dosage Effects in a Social-Emotional Skills Training Program for Children: An Application of Generalized Propensity Scores.” Journal of Social Service Research 41 (3): 345–64. https://doi.org/10.1080/01488376.2014.994797.

MacKinnon, J. G. 2006. “Bootstrap Methods in Econometrics.” The Economic Record 82: 2–18. https://doi.org/10.1111/j.1475-4932.2006.00328.x.

Manski, C. F. 2013. “Identification of Treatment Response with Social Interactions.” The Econometrics Journal 16 (1): 1–23. https://doi.org/10.1111/j.1368-423x.2012.00368.x.

Mitze, T., A. R. Paloyo, and B. Alecke. 2015. “Is There a Purchase Limit on Regional Growth? A Quasi-Experimental Evaluation of Investment Grants Using Matching Techniques.” International Regional Science Review 38 (4): 388–412. https://doi.org/10.1177/0160017613505200.

Naimi, A., E. Moodie, N. Auger, and J. Kaufman. 2014. “Constructing Inverse Probability Weights for Continuous Exposures – A Comparison of Methods.” Epidemiology 25 (1): 292–9. https://doi.org/10.1097/EDE.0000000000000053.

Robins, J., M. Hernán, and B. Brumback. 2000. “Marginal Structural Models and Causal Inference in Epidemiology.” Epidemiology 11 (5): 550–60. https://doi.org/10.1097/00001648-200009000-00011.

Roy, A. D. 1951. “Some Thoughts on the Distribution of Earnings.” Oxford Economic Papers 3 (2): 135–46. https://doi.org/10.1093/oxfordjournals.oep.a041827.

Rubin, D. 1974. “Estimating Causal Effects of Treatments in Randomised and Nonrandomised Studies.” Journal of Educational Psychology 66 (5): 688–701. https://doi.org/10.1037/h0037350.

Rubin, D. 1980. “Comment on Basu, D. – Randomization Analysis of Experimental Data: The Fisher Randomization Test.” Journal of the American Statistical Association 75 (371): 591–3. https://doi.org/10.2307/2287653.

Sauerbrei, W., and P. Royston. 1999. “Building Multivariable Prognostic and Diagnostic Models: Transformation of the Predictors by Using Fractional Polynomials.” Journal of the Royal Statistical Society: Series A 162 (1): 71–94. https://doi.org/10.1111/1467-985x.00122.

Smith, J., and P. Todd. 2005. “Rejoinder.” Journal of Econometrics 125: 365–75. https://doi.org/10.1016/j.jeconom.2004.04.013.

Su, L., T. Ura, and Y. Zhang. 2019. “Non-Separable Models with High-Dimensional Data.” Journal of Econometrics 212 (2): 646–77. https://doi.org/10.1016/j.jeconom.2019.06.004.

Tübbicke, S. 2020. Entropy Balancing for Continuous Treatments. arXiv Papers 2001.06281. arXiv.org.

Vegetabile, B. G., B. A. Griffin, D. L. Coffman, M. Cefalu, M. W. Robbins, and D. F. McCaffrey. 2021. “Nonparametric Estimation of Population Average Dose-Response Curves Using Entropy Balancing Weights for Continuous Exposures.” Health Services & Outcomes Research Methodology 21 (1): 69–110. https://doi.org/10.1007/s10742-020-00236-2.

Wu, X., F. Mealli, M.-A. Kioumourtzoglou, F. Dominici, and D. Braun. 2020. Matching on Generalized Propensity Scores with Continuous Exposures. arXiv Papers 1812.06575. arXiv.org.

Yiu, S., and L. Su. 2018. “Covariate Association Eliminating Weights: A Unified Weighting Framework for Causal Effect Estimation.” Biometrika 105 (3): 709–22. https://doi.org/10.1093/biomet/asy015.

Zhao, S., D. A. van Dyk, and K. Imai. 2020. “Propensity Score-Based Methods for Causal Inference in Observational Studies with Non-binary Treatments.” Statistical Methods in Medical Research 29 (3): 709–27. https://doi.org/10.1177/0962280219888745.

Zhu, Y., D. L. Coffman, and D. Ghosh. 2015. “A Boosting Algorithm for Estimating Generalized Propensity Scores with Continuous Treatments.” Journal of Causal Inference 3 (1): 25–40. https://doi.org/10.1515/jci-2014-0022.

Received: 2021-01-11
Revised: 2021-11-24
Accepted: 2021-11-25
Published Online: 2021-12-15

© 2021 Stefan Tübbicke, published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
