Summary
This article examines gender stereotypes in selected entries referring to marital status in Kontekst.io, a computer-generated thesaurus of synonyms and semantically related words. Kontekst.io is the most popular local online thesaurus used in Croatia. We focus specifically on its Croatian section, which is derived from a large web corpus of Croatian comprising 1.4 billion words, using a natural language processing (NLP) model based on word embeddings. The analyzed entries are udana/udata (married, feminine), oženjen (married, masculine), razvedena (divorced, feminine), razveden (divorced, masculine), udovica (widow) and udovac (widower). We first categorize the synonyms and semantically related terms from these entries into semantic fields. We then critically analyze gender bias across these fields, using the framework of critical lexicography and critical discourse analysis. The results point to the presence of gender bias in the analyzed entries.
© 2025 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Frontmatter
- Frontmatter
- Michail Bulgakovs Romanklassiker Master i Margarita (1928–1940) erkundet und multimodal variiert: Bettina Eggers Album Moscou Endiablé. Sur les traces de Maître et Marguerite (2013)
- The Perception of National Identity within the Enclave of Viennese Czechs
- Cultural exchanges through translation between Slovenia and Montenegro in the context of the (pre- and post-)Yugoslav period (1918–2023)
- ‘Divorced ~ castrated’: Gender stereotypes in the Croatian section of Kontekst.io, a computer-generated thesaurus of synonyms and semantically related terms
- Tagungsbericht
- Slavic Linguistic Landscapes in Times of Global Challenges an der Universität Greifswald (27. – 28. Juni 2025)
- Buchbesprechung
- Putins Kriegsrhetorik