Review and comparison of treatment effect estimators using propensity and prognostic scores

Myoung-Jae Lee; Sanghyeok Lee

doi:10.1515/ijb-2021-0005

Published/Copyright: August 9, 2022

From the journal The International Journal of Biostatistics Volume 18 Issue 2

Abstract

To estimate the effect of a binary treatment, practitioners mostly use either propensity score matching (PSM) or inverse probability weighting (IPW). However, many newer treatment effect estimators based on the propensity score and the “prognostic score” are now available, and some of them are much better than PSM and IPW in several respects. In this paper, we review these recent treatment effect estimators to show how they are related to one another and why they are better than PSM and IPW. We compare 26 estimators in total through extensive simulation and empirical studies. Based on the results, we recommend recent treatment effect estimators using the “overlap weight” and “targeted MLE” using statistical/machine learning, as well as a simple regression imputation/adjustment estimator using linear prognostic score models.
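As a rough illustration of three of the estimator families named in the abstract (IPW, overlap weighting, and regression imputation with linear outcome models), the following sketch may help. It is not the authors' implementation and does not reproduce any of their 26 estimators or results. The simulated data, the single covariate, the constant treatment effect, and all function names (`simulate`, `fit_logistic`, `fit_line`, `weighted_contrast`, `estimate_effects`) are assumptions made for this example only.

```python
import math
import random


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, effect=2.0, seed=0):
    """Hypothetical data: x confounds treatment d and outcome y; the effect is constant."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    d = [1 if rng.random() < expit(0.5 * xi) else 0 for xi in x]
    y = [1.0 + xi + effect * di + rng.gauss(0.0, 1.0) for xi, di in zip(x, d)]
    return x, d, y


def fit_logistic(x, d, iters=25):
    """Newton-Raphson fit of P(d=1|x) = expit(a + b*x) for a scalar covariate."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, di in zip(x, d):
            p = expit(a + b * xi)
            w = p * (1.0 - p)
            g0 += di - p            # score for the intercept
            g1 += (di - p) * xi     # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def fit_line(x, y):
    """Ordinary least squares of y on (1, x); returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def weighted_contrast(y, d, w):
    """Difference of weighted outcome means, treated minus control."""
    t_num = sum(wi * yi for wi, yi, di in zip(w, y, d) if di == 1)
    t_den = sum(wi for wi, di in zip(w, d) if di == 1)
    c_num = sum(wi * yi for wi, yi, di in zip(w, y, d) if di == 0)
    c_den = sum(wi for wi, di in zip(w, d) if di == 0)
    return t_num / t_den - c_num / c_den


def estimate_effects(x, d, y):
    a, b = fit_logistic(x, d)
    ps = [expit(a + b * xi) for xi in x]  # estimated propensity scores
    # Normalized IPW: treated weighted by 1/ps, controls by 1/(1 - ps).
    ipw_w = [1.0 / p if di == 1 else 1.0 / (1.0 - p) for p, di in zip(ps, d)]
    # Overlap weight: treated weighted by 1 - ps, controls by ps.
    ow_w = [1.0 - p if di == 1 else p for p, di in zip(ps, d)]
    # Regression imputation: one linear outcome model per treatment arm,
    # then average the difference of predictions over the whole sample.
    a1, b1 = fit_line([xi for xi, di in zip(x, d) if di == 1],
                      [yi for yi, di in zip(y, d) if di == 1])
    a0, b0 = fit_line([xi for xi, di in zip(x, d) if di == 0],
                      [yi for yi, di in zip(y, d) if di == 0])
    ri = sum((a1 + b1 * xi) - (a0 + b0 * xi) for xi in x) / len(x)
    return {
        "ipw": weighted_contrast(y, d, ipw_w),
        "overlap": weighted_contrast(y, d, ow_w),
        "regression_imputation": ri,
    }


if __name__ == "__main__":
    x, d, y = simulate(4000, effect=2.0, seed=1)
    for name, value in estimate_effects(x, d, y).items():
        print(f"{name}: {value:.3f}")
```

Because the simulated effect is constant, the overlap-weighted effect coincides with the average treatment effect here; with heterogeneous effects the two target different populations, so the estimates would not be directly comparable.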

Keywords: complete pairing; inverse probability weighting; matching; prognostic score; propensity score; regression imputation/adjustment

Corresponding author: Sanghyeok Lee, Department of Economics, American University in Cairo, New Cairo 11835, Egypt, E-mail: sanghyeok.lee@aucegypt.edu

Myoung-Jae Lee and Sanghyeok Lee have research interests in statistics, econometrics, and treatment effect analysis.

Acknowledgment

The authors are grateful to the Editor and the reviewers for their helpful comments.

Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: The research of Myoung-Jae Lee has been supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A2C1A01007786), and by a Korea University fund.
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2021-0005).

Received: 2020-09-14

Accepted: 2022-01-03

Published Online: 2022-08-09
