
Data Mining in Social Sciences: A Decision Tree Application Using Social and Political Concepts

  • Efthalia Massou, Gerasimos Prodromitis and Stamos Papastamou
Published/Copyright: November 30, 2022

Abstract

In this paper, we investigated the utility of data mining for classifying individuals into predefined categories of a target variable based on their social and political attitudes. For this purpose, we used data collected for a social psychology study conducted in Greece in 1994. We established the theoretical background of our analysis through exploratory factor analysis. We then ran the CHAID decision tree algorithm to build a predictive model that classifies the study participants in terms of their attitude toward physical and symbolic violence. The algorithm produced a decision tree that was easy to interpret and revealed meaningful predictive patterns. It showed satisfactory predictive ability and offers a promising alternative for social psychology data analysis. To the best of our knowledge, there is no other evidence in the literature that decision tree algorithms can be used to identify latent variables.
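As a rough illustration of the kind of step CHAID performs at each node, the sketch below picks the categorical predictor most strongly associated with a target variable using a Pearson chi-square test. It is not the authors' code and does not use the study's data: the rows, the predictor names `ideology` and `religiosity`, and the helper functions are all hypothetical, and the sketch omits CHAID's category merging and Bonferroni adjustment.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z) = exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z) = erfc(sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_test(xs, ys):
    """Pearson chi-square test of independence between two categorical lists."""
    n = len(xs)
    row, col, cell = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for r, n_r in row.items():
        for c, n_c in col.items():
            expected = n_r * n_c / n
            stat += (cell[(r, c)] - expected) ** 2 / expected
    df = (len(row) - 1) * (len(col) - 1)
    return stat, df, chi2_sf(stat, df)


def best_split(rows, predictors, target):
    """Return (p_value, predictor) for the predictor with the smallest unadjusted p-value."""
    ys = [r[target] for r in rows]
    results = []
    for name in predictors:
        xs = [r[name] for r in rows]
        _, _, p_value = chi_square_test(xs, ys)
        results.append((p_value, name))
    return min(results)


# Toy data, invented purely for illustration (NOT taken from the study).
ideology = ["left"] * 4 + ["right"] * 4
religiosity = ["low", "high"] * 4
attitude = ["reject", "reject", "reject", "accept",
            "accept", "accept", "accept", "reject"]
rows = [
    {"ideology": i, "religiosity": r, "attitude": a}
    for i, r, a in zip(ideology, religiosity, attitude)
]

p_value, chosen = best_split(rows, ["ideology", "religiosity"], "attitude")
print(f"first split on: {chosen} (unadjusted p = {p_value:.3f})")
```

A full CHAID implementation would repeat this selection recursively within each resulting subgroup, stopping when no predictor reaches the chosen significance threshold.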


Corresponding author: Efthalia Massou, Primary Care Unit, Department of Public Health and Primary Care, School of Clinical Medicine, University of Cambridge, Forvie Site, Robinson Way, Cambridge CB2 0SR, UK; and Laboratory of Experimental and Social Psychology, Panteion University, Athens, Greece, E-mail:

  1. Research funding: Not applicable.

  2. Declarations of interest: None.

  3. Availability of data and material: Data are held by the Laboratory of Experimental and Social Psychology, Panteion University, Athens, Greece, and are available upon request to the corresponding author.

  4. Code availability: Upon request.


Received: 2022-04-01
Accepted: 2022-09-21
Published Online: 2022-11-30
Published in Print: 2022-11-25

© 2022 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 29.1.2026 from https://www.degruyterbrill.com/document/doi/10.1515/spp-2022-0004/html