Truth be told: a corpus-based study of the cross-linguistic colexification of representational and (inter)subjective meanings

  • Barend Beekhuizen, Maya Blumenthal, Lee Jiang, Anna Pyrtchenkov and Jana Savevska
Published/Copyright: November 1, 2023

Abstract

The study of cross-linguistic variation in word meaning often focuses on representational and concrete meanings. We argue that other kinds of word meanings (e.g., abstract and (inter)subjective meanings) can be fruitfully studied in translation corpora, and present a quantitative procedure for doing so. We focus on the cross-linguistic patterns for lemmas pertaining to truth and reality (English true and real), as these abstract meanings have been found to frequently colexify with particular (inter)subjective meanings. Applying our method to a corpus of translated subtitles of TED talks, we show that (1) the abstract-representational meanings are colexified in patterned ways that are nonetheless more complex than previously observed (some languages do not split ‘true’-like from ‘real’-like terms; many languages display further splits of representational meanings); and (2) some non-representational meanings strongly colexify with representational meanings of ‘truth’ and ‘reality’, while others also often colexify with other fields.


Corresponding author: Barend Beekhuizen, Department of Language Studies, University of Toronto – Mississauga, Mississauga, L5L 1C6, Canada

Award Identifier / Grant number: RGPIN-2019-06917

Award Identifier / Grant number: Scholars in Residence program 2020

Acknowledgments

We would like to thank Lawrence Ora for contributing to the preparation of the data, Mah Noor Amir for her contributions to the initial stages of this project, as well as Songül Gündoğdu and Arsalan Kahnemuyipour for linguistic consultation. Needless to say, they are by no means responsible for the interpretation of the data presented in this paper.

  1. Research funding: This paper emerged from a Jackman Humanities Institute Scholars in Residence project, run in May 2020. The further development of the paper was sponsored by a Connaught New Researcher Award and an NSERC Discovery Grant (RGPIN-2019-06917) to Barend Beekhuizen.


Received: 2021-08-04
Accepted: 2023-10-03
Published Online: 2023-11-01
Published in Print: 2024-05-27

© 2023 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 11.11.2025 from https://www.degruyterbrill.com/document/doi/10.1515/cllt-2021-0058/html