Abstract
Propensity score (PS) matching is widely used for studying treatment effects in observational studies. This article proposes the method of matching weights (MWs) as an analog to one-to-one pair matching without replacement on the PS with a caliper. Compared with pair matching, the proposed method offers more efficient estimation, more accurate variance calculation, better balance, and simpler asymptotic analysis. A statistical test for the misspecification of the PS model is proposed for balance checking purposes. An augmented version of the MW estimator is developed that has the double robust property, that is, the estimator is consistent if either the outcome model or the PS model is correct. The proposed method is studied in simulations and illustrated through a real data example.
Appendix
Proposition 1 Suppose that the PS can only take finitely many values at […], […]. […] and […]. If we do a one-to-one exact matching on the PS without replacement and choose randomly when multiple matched pairs are available, then the matching estimator has the same asymptotic limit as the MW estimator as n → ∞. In addition, the effective sample size of the MW estimator is asymptotically equivalent to the expected number of matched pairs.
Proof Denote […] by […], and let […] ([…]) be the set of matched subjects from the treatment (control) group with PS […]. The matching estimator is
[…]
[…]
Similarly,
[…]
Therefore, as n → ∞,
[…]
which is the same asymptotic limit as the MW estimator […]. The effective sample size of the MW estimator is […] for the treatment group and […] for the control group. The number of matched pairs is […]. These quantities are asymptotically equivalent from the derivation above and […]. □
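The statement of Proposition 1 can be checked numerically. The sketch below is an illustration under stated assumptions, not the article's own code: it assumes the matching weight of a subject is min(e, 1 − e) divided by the probability of the arm actually received, that the MW estimator is the weighted difference in mean outcomes, and that the sum of weights in each arm is the relevant notion of effective sample size (the extracted text above does not confirm these formulas). The PS support, its probabilities, and the outcome model are hypothetical.

```python
import random

def matching_weight(z, e):
    """Assumed form: min(e, 1 - e) over the probability of the arm received."""
    return min(e, 1.0 - e) / (e if z == 1 else 1.0 - e)

def mw_estimate(data):
    """Weighted difference in mean outcomes; data holds (y, z, e) tuples."""
    num = [0.0, 0.0]  # index 0: control arm, index 1: treatment arm
    den = [0.0, 0.0]
    for y, z, e in data:
        w = matching_weight(z, e)
        num[z] += w * y
        den[z] += w
    return num[1] / den[1] - num[0] / den[0]

def exact_match_estimate(data, rng):
    """One-to-one exact matching on the PS without replacement,
    choosing at random when more candidates than partners exist."""
    by_level = {}
    for y, z, e in data:
        by_level.setdefault(e, ([], []))[z].append(y)
    diffs = []
    for controls, treated in by_level.values():
        m = min(len(controls), len(treated))
        pairs = zip(rng.sample(treated, m), rng.sample(controls, m))
        diffs.extend(t - c for t, c in pairs)
    return sum(diffs) / len(diffs), len(diffs)

rng = random.Random(0)
levels, probs = [0.2, 0.5, 0.8], [0.5, 0.3, 0.2]  # hypothetical discrete PS
n = 100_000
data = []
for e in rng.choices(levels, weights=probs, k=n):
    z = 1 if rng.random() < e else 0
    # The treatment effect 1 + 3e varies with the PS (hypothetical model).
    y = 1.0 + 2.0 * e + z * (1.0 + 3.0 * e) + rng.gauss(0.0, 1.0)
    data.append((y, z, e))

# Large-sample target under the assumptions above:
# E[min(e, 1 - e) * effect(e)] / E[min(e, 1 - e)].
mins = [min(e, 1.0 - e) for e in levels]
target = (sum(p * m * (1.0 + 3.0 * e) for p, m, e in zip(probs, mins, levels))
          / sum(p * m for p, m in zip(probs, mins)))

mw = mw_estimate(data)
pm, n_pairs = exact_match_estimate(data, rng)
sum_w_treated = sum(matching_weight(z, e) for _, z, e in data if z == 1)
sum_w_control = sum(matching_weight(z, e) for _, z, e in data if z == 0)

print(f"target={target:.3f}  MW={mw:.3f}  pair matching={pm:.3f}")
print(f"pairs={n_pairs}  sum of weights: treated={sum_w_treated:.0f}, "
      f"control={sum_w_control:.0f}")
```

In this setup the two estimates should be close to each other and to the weighted target, and the sum of weights in either arm should be close to the number of matched pairs, which is the pattern the proposition describes.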
Proof of Theorem 1: If […] and […] are correctly specified, then […].
[…]
Since […] and […],
[…]
Since […] and
[…]
Similarly, […]. Summarizing the results above and the expression of […], we have […] when […] and […] are correctly specified.
Next, we assume that the PS model is correctly specified, that is, the […]s are correct. We can rearrange the terms in eq. [6] of the article and write […] as
![[10]](/document/doi/10.1515/ijb-2012-0030/asset/graphic/ijb-2012-0030_eq10.png)
The first term is […], which converges to […]. The second term equals
[…]
Since […] and […], the second term converges to 0. Similarly, the third term in eq. [10] also converges to 0. Therefore, […] when the PS model is correctly specified. □
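The double robust property argued above can be illustrated with a small simulation. This is a sketch under assumptions: eq. [6] is not reproduced in this section, so the augmented estimator below uses one plausible form (a weighted model-based contrast plus weighted residual corrections in each arm) that is consistent with the three-term rearrangement described in the proof, and it should not be read as the article's exact formula. The data-generating model, the "wrong" working models, and all names are hypothetical. A constant treatment effect is used so that the target is the same number regardless of how subjects are weighted.

```python
import math
import random

def matching_weight(z, e):
    """Assumed form: min(e, 1 - e) over the probability of the arm received."""
    return min(e, 1.0 - e) / (e if z == 1 else 1.0 - e)

def augmented_mw(rows):
    """Assumed augmented estimator; rows holds (y, z, e, m1, m0) tuples,
    where m1 and m0 are outcome-model predictions under treatment and control."""
    sw = sw1 = sw0 = model = res1 = res0 = 0.0
    for y, z, e, m1, m0 in rows:
        w = matching_weight(z, e)
        sw += w
        model += w * (m1 - m0)
        if z == 1:
            sw1 += w
            res1 += w * (y - m1)
        else:
            sw0 += w
            res0 += w * (y - m0)
    return model / sw + res1 / sw1 - res0 / sw0

rng = random.Random(1)
tau, n = 2.0, 100_000  # constant treatment effect (hypothetical)
sample = []
for _ in range(n):
    x = rng.gauss(0.0, 1.0)
    e = 1.0 / (1.0 + math.exp(-x))  # true PS, logistic in x
    z = 1 if rng.random() < e else 0
    y = 1.0 + 1.5 * x + tau * z + rng.gauss(0.0, 1.0)
    sample.append((x, e, z, y))

def estimate(ps_correct, outcome_correct):
    """Plug in known correct models or deliberately wrong ones.
    Wrong PS: constant 0.5. Wrong outcome model: predicts 0 in both arms."""
    rows = []
    for x, e, z, y in sample:
        e_work = e if ps_correct else 0.5
        m0 = 1.0 + 1.5 * x if outcome_correct else 0.0
        m1 = m0 + tau if outcome_correct else 0.0
        rows.append((y, z, e_work, m1, m0))
    return augmented_mw(rows)

for ps_ok in (True, False):
    for out_ok in (True, False):
        print(f"PS correct={ps_ok!s:5}  outcome correct={out_ok!s:5}  "
              f"estimate={estimate(ps_ok, out_ok):.3f}")
```

With either working model correct, the estimate should sit near the true effect of 2.0. With both wrong, the estimator reduces here to the unadjusted difference in means and is visibly biased by confounding through x.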
Proof of Theorem 2: When the PS model is known, […]. The MW estimator approximately equals
![[11]](/document/doi/10.1515/ijb-2012-0030/asset/graphic/ijb-2012-0030_eq11.png)
where […]. We can define […] and […] similarly and view […] as the new data and […] as the new potential outcomes. It is obvious that if the SUTVA assumption and unconfoundedness assumption hold, then similar assumptions hold for the new data and new potential outcomes as well: […] and […].
[…]
Expression [11] suggests that […] can be viewed as an inverse probability weighting estimator, if we think of […] as the outcome variable. Therefore, the semiparametric theory in §13.5 of Tsiatis [44], originally developed for the inverse probability weighting method, can be applied to show that the class of influence functions of regular asymptotically linear estimators of […] is given by
[…]
where […] for any function […]. The efficient influence function in this class is uniquely given by
[…]
[…] as in eq. [6] of the article is the estimator corresponding to this efficient influence function. □
Acknowledgement
We greatly appreciate the helpful comments from the editor, associate editor and referees. This work was carried out while Liang Li was a faculty biostatistician in the Department of Quantitative Health Sciences at Cleveland Clinic.
References
1. Rosenbaum P, Rubin D. The central role of the propensity score in observational studies for causal effects. Biometrika 1983;70:41–55. doi:10.1093/biomet/70.1.41
2. D’Agostino RB. Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med 1998;17:2265–81. doi:10.1002/(SICI)1097-0258(19981015)17:19<2265::AID-SIM918>3.0.CO;2-B
3. Imbens GW. Nonparametric estimation of average treatment effects under exogeneity: a review. Rev Econ Stat 2004;86:4–29. doi:10.1162/003465304323023651
4. Ho DE, Imai K, King G, Stuart EA. Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Pol Anal 2007;15:199–236. doi:10.1093/pan/mpl013
5. Guo S, Fraser M. Propensity score analysis: statistical methods and applications. Thousand Oaks, CA, USA: Sage Publications, Inc., 2010.
6. Rubin DB. Which ifs have causal answers? J Am Stat Assoc 1986;81:961–2. doi:10.1080/01621459.1986.10478355
7. Imbens GW, Wooldridge JM. Recent developments in the econometrics of program evaluation. J Econ Lit 2009;47:5–86. doi:10.1257/jel.47.1.5
8. Stuart E. Matching methods for causal inference: a review and a look forward. Stat Sci 2010;25:1–21. doi:10.1214/09-STS313
9. Luo Z, Gardiner JC, Bradley CJ. Applying propensity score methods in medical research: pitfalls and prospects. Med Care Res Rev 2010;67:528–54. doi:10.1177/1077558710361486
10. Rosenbaum P, Rubin D. Reducing bias in observational studies using subclassification on the propensity score. J Am Stat Assoc 1984;79:516–24. doi:10.1080/01621459.1984.10478078
11. Lunceford JK, Davidian M. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Stat Med 2004;23:2937–60. doi:10.1002/sim.1903
12. Rosenbaum P, Rubin D. Constructing a control-group using multivariate matched sampling methods that incorporate the propensity score. Am Stat 1985;39:33–8.
13. Austin PC. A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Stat Med 2008;27:2037–49. doi:10.1002/sim.3150
14. Hirano K, Imbens GW. Estimation of causal effects using propensity score weighting: an application to data on right heart catheterization. Health Serv Outcomes Res Methodol 2001;2:259–78. doi:10.1023/A:1020371312283
15. Freedman DA, Berk RA. Weighting regressions by propensity scores. Eval Rev 2008;32:392–409. doi:10.1177/0193841X08317586
16. Austin PC. Some methods of propensity-score matching had superior performance to others: results of an empirical investigation and Monte Carlo simulations. Biometrical J 2009;51:171–84. doi:10.1002/bimj.200810488
17. Austin PC. Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat Med 2009;28:3083–107. doi:10.1002/sim.3697
18. Imai K, King G, Stuart EA. Misunderstandings among experimentalists and observationalists in causal inference. J R Stat Soc Ser A 2008;171:481–502. doi:10.1111/j.1467-985X.2007.00527.x
19. Austin PC. Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies. Pharm Stat 2010;10:150–61. doi:10.1002/pst.433
20. Hill J. Discussion of research using propensity-score matching: comments on “a critical appraisal of propensity-score matching in the medical literature between 1996 and 2003” by Peter Austin. Stat Med 2008;27:2055–61. doi:10.1002/sim.3245
21. Stuart E. Developing practical recommendations for the use of propensity scores: discussion of “a critical appraisal of propensity score matching in the medical literature between 1996 and 2003” by Peter Austin. Stat Med 2008;27:2062–5. doi:10.1002/sim.3207
22. Hansen BB. The essential role of balance tests in propensity-matched observational studies: comments on “a critical appraisal of propensity-score matching in the medical literature between 1996 and 2003” by Peter Austin. Stat Med 2008;27:2050–4. doi:10.1002/sim.3208
23. Abadie A, Imbens GW. On the failure of the bootstrap for matching estimators. Mimeo, Kennedy School of Government, Harvard University, 2005. doi:10.3386/t0325
24. Abadie A, Imbens GW. Large sample properties of matching estimators for average treatment effects. Econometrica 2006;74:235–67. doi:10.1111/j.1468-0262.2006.00655.x
25. Abadie A, Imbens GW. Matching on the estimated propensity score. NBER Working Paper Series, w15301, 2009. Available at SSRN: http://ssrn.com/abstract=1463894. doi:10.3386/w15301
26. Robins JM, Sued M, Lei-Gomez Q, Rotnitzky A. Comment: performance of double-robust estimators when “inverse probability” weights are highly variable. Stat Sci 2007;22:544–59. doi:10.1214/07-STS227D
27. Bang H, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics 2005;61:962–72. doi:10.1111/j.1541-0420.2005.00377.x
28. van der Laan MJ, Robins JM. Unified methods for censored longitudinal data and causality. New York: Springer, 2003. doi:10.1007/978-0-387-21700-0
29. Robins JM, Hernan MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology 2000;11:550–60. doi:10.1097/00001648-200009000-00011
30. Friedman JH, Silverman BW. Flexible parsimonious smoothing and additive modeling. Technometrics 1989;31:3–21. doi:10.1080/00401706.1989.10488470
31. Rubin DB. Estimating causal effects from large data sets using propensity scores. Ann Intern Med 1997;127:757–63. doi:10.7326/0003-4819-127-8_Part_2-199710151-00064
32. Li L, Wang WW, Chan ISF. Correlation coefficient inference on censored bioassay data. J Biopharm Stat 2005;15:501–12. doi:10.1081/BIP-200056552
33. Senn S. Testing for baseline balance in clinical trials. Stat Med 1994;13:1715–26. doi:10.1002/sim.4780131703
34. Hansen BB, Bowers J. Covariate balance in simple, stratified and clustered comparative studies. Stat Sci 2008;23:219–36.
35. Shaikh AM, Simonsen M, Vytlacil EJ, Yildiz N. A specification test for the propensity score using its distribution conditional on participation. J Econometrics 2009;1:33–46. doi:10.1016/j.jeconom.2009.01.014
36. Lee WS. Propensity score matching and variations on the balancing test. Empirical Econ 2013;44(1):47–80. doi:10.1007/s00181-011-0481-0
37. Austin PC. Type I error rates, coverage of confidence intervals, and variance estimation in propensity-score matched analyses. Int J Biostat 2009;5:1–21. doi:10.2202/1557-4679.1146
38. Robins JM, Wang N. Inference for imputation estimators. Biometrika 2000;87:113–24. doi:10.1093/biomet/87.1.113
39. Kang JDY, Schafer JL. Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data (with discussions and rejoinder). Stat Sci 2007;22:523–39.
40. Rosenbaum PR. Model-based direct adjustment. J Am Stat Assoc 1987;82:387–94. doi:10.1080/01621459.1987.10478441
41. Cao WH, Tsiatis AA, Davidian M. Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika 2009;96:723–34. doi:10.1093/biomet/asp033
42. Crump RK, Hotz VJ, Imbens GW, Mitnik OA. Dealing with limited overlap in estimation of average treatment effects. Biometrika 2009;96:187–99. doi:10.1093/biomet/asn055
43. Tan ZQ. Bounded, efficient and doubly robust estimation with inverse weighting. Biometrika 2010;97:661–82. doi:10.1093/biomet/asq035
44. Tsiatis AA. Semiparametric theory and missing data. New York: Springer, 2006.
©2013 by Walter de Gruyter Berlin / Boston