Abstract
Learning how distance to landmarks and events relates to phenomena of interest is a multi-faceted problem, as it may require accounting for several dimensions at once: the spatial position of landmarks, the timing of events, and the attributes of occurrences and locations. Here I show that tree-based methods are well suited to these questions because they allow exploring the relationship between proximity metrics and outcomes of interest in a non-parametric, data-driven manner. I illustrate the usefulness of tree-based methods vis-à-vis conventional regression methods by examining the association between (i) distance to border crossings along the US-Mexico border and support for immigration reform, and (ii) distance to mass shootings and support for gun control.
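To make the idea of a data-driven, non-parametric distance threshold concrete, here is a minimal sketch of the single split at the heart of a regression tree. It is not the article's implementation: the data are simulated, the 100 km step in support is a hypothetical value chosen for illustration, and the helper names (`sse`, `best_split`) are assumptions introduced here.

```python
import random


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(distances, outcomes, min_leaf=20):
    """Return (threshold, sse_after_split) for the best single cut, or None.

    Every midpoint between consecutive sorted distances is tried, and the
    cut that minimizes the summed within-group squared error is kept.
    """
    pairs = sorted(zip(distances, outcomes))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between identical distances
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, total)
    return best


# Simulated (hypothetical) data: support is higher within 100 km of a landmark.
random.seed(0)
distances = [random.uniform(0, 500) for _ in range(400)]
outcomes = [(0.7 if d < 100 else 0.4) + random.gauss(0, 0.05) for d in distances]

sse_before = sse(outcomes)
threshold, sse_after = best_split(distances, outcomes)
print(f"estimated distance threshold: {threshold:.1f} km")
print(f"squared error before/after split: {sse_before:.2f} / {sse_after:.2f}")
```

The cut point is learned from the data instead of being fixed in advance, which is the contrast with a conventional regression that imposes a functional form on distance. A full tree repeats this search within each resulting group, and ensembles such as random forests average many such trees.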
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/spp-2021-0031).
© 2022 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Frontmatter
- Articles
- Social Protection and Gender Inequality in Using Enabling Technology: An Analysis with the Framework of Sustainable Development Goals (SDGs)
- Bayesian Analysis of State Voter Registration Database Integrity
- Building a Framework for Mode Effect Estimation in United States Presidential Election Polls
- Inequality in Education: A Comparison of Australian Indigenous and Nonindigenous Populations
- Learning about Spatial and Temporal Proximity using Tree-Based Methods
- The Declining Pension Wealth of Employment for the Birth Cohorts 1935–1974 in Germany