
Language acquisition in vector space

  • Anders Søgaard

    Anders Søgaard is a professor of computer science at the University of Copenhagen, where he also runs the Center for Philosophy of Artificial Intelligence. He has published more than 300 papers in computer science and philosophy, as well as a handful of academic books. He has received an ERC Starting Grant, a Google Focused Research Award, and a Carlsberg Semper Ardens Advance.

Published/Copyright: August 4, 2025

Abstract

Language models are mathematical functions and, as such, induce vector spaces in which input is embedded. Comparing the point clouds of concept vectors across such language models and similar computer vision models, we see surprising similarities. This sheds new light on the Innateness Debate. Much linguistic structure can be induced from extra-linguistic data. Language models are generally thought to be too sample-inefficient to be good models of language acquisition, but what about language models initialized by computer vision models?
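The comparison of concept point clouds across models that the abstract describes can be sketched with orthogonal Procrustes alignment, a standard tool for testing whether two embedding spaces share geometry up to rotation. This is a minimal illustration on synthetic data — the toy arrays, function names, and thresholds below are assumptions for demonstration, not the paper's actual models or evaluation protocol.

```python
import numpy as np

def _normalize(M):
    # Center the cloud and scale to unit Frobenius norm so only
    # the relative geometry of the points matters.
    M = M - M.mean(axis=0)
    return M / np.linalg.norm(M)

def procrustes_distance(X, Y):
    # Find the orthogonal map R minimizing ||X R - Y||_F (orthogonal
    # Procrustes, solved in closed form via SVD) and report the
    # residual: a small residual means near-isomorphic point clouds.
    X, Y = _normalize(X), _normalize(Y)
    U, _, Vt = np.linalg.svd(X.T @ Y)
    R = U @ Vt
    return np.linalg.norm(X @ R - Y)

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 8))                  # "concept vectors", model 1
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # a random orthogonal map
B = A @ Q + 0.01 * rng.normal(size=(50, 8))   # rotated copy plus noise
C = rng.normal(size=(50, 8))                  # unrelated point cloud
print(procrustes_distance(A, B))  # small: same geometry up to rotation
print(procrustes_distance(A, C))  # large: different geometry
```

If language-model and vision-model concept clouds behave like A and B rather than A and C, their spaces encode the same relational structure despite being learned from different modalities.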


Anders Søgaard, University of Copenhagen, Copenhagen, Denmark



Published Online: 2025-08-04
Published in Print: 2025-04-28

© 2025 Walter de Gruyter GmbH, Berlin/Boston

https://www.degruyterbrill.com/document/doi/10.1515/ip-2025-2001/html