Construction grammar and procedural semantics for human-interpretable grounded language processing

Liesbet De Vos, Jens Nevens, Paul Van Eecke and Katrien Beuls
Published/Copyright: March 15, 2024

Abstract

Grounded language processing is a crucial component in many artificial intelligence systems, as it allows agents to communicate about their physical surroundings. State-of-the-art approaches typically employ deep learning techniques that perform end-to-end mappings between natural language expressions and representations grounded in the environment. Although these techniques achieve high levels of accuracy, they are often criticized for their lack of interpretability and their reliance on large amounts of training data. As an alternative, we propose a fully interpretable, data-efficient architecture for grounded language processing. The architecture is based on two main components. The first component comprises an inventory of human-interpretable concepts learned through task-based communicative interactions. These concepts connect the sensorimotor experiences of an agent to meaningful symbols that can be used for reasoning operations. The second component is a computational construction grammar that maps between natural language expressions and procedural semantic representations. These representations are grounded through their integration with the learned concepts. We validate the architecture using a variation on the CLEVR benchmark, achieving an accuracy of 96 %. Our experiments demonstrate that the integration of a computational construction grammar with an inventory of interpretable grounded concepts can effectively achieve human-interpretable grounded language processing in the CLEVR environment.
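To make the two components concrete, the following is a minimal sketch of how a procedural semantic representation could be executed against grounded concepts. It is not the authors' implementation. The feature channels (`rgb`, `corners`), the prototype values, the nearest-prototype matching rule, and the primitive names `filter` and `count` are all assumptions made for illustration, and the program for "How many red cubes are there?" is written by hand rather than produced by a construction grammar.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Obj:
    rgb: tuple      # continuous colour reading, 0..1 per channel (assumed feature)
    corners: float  # continuous shape feature (hypothetical)


# Hypothetical concept inventory: each concept is a prototype on one feature channel.
# In the architecture described above, such concepts would be learned, not hand-coded.
CONCEPTS = {
    "red": ("rgb", (0.8, 0.1, 0.1)),
    "blue": ("rgb", (0.1, 0.2, 0.8)),
    "cube": ("corners", (8.0,)),
    "sphere": ("corners", (0.0,)),
}


def _value(obj, channel):
    """Return the object's reading on a channel as a tuple."""
    v = getattr(obj, channel)
    return v if isinstance(v, tuple) else (v,)


def matches(obj, concept):
    """An object matches a concept if that concept's prototype is the
    nearest one among all concepts defined on the same channel."""
    channel, proto = CONCEPTS[concept]
    rivals = [p for c, p in CONCEPTS.values() if c == channel]
    v = _value(obj, channel)
    return min(rivals, key=lambda p: dist(v, p)) == proto


def filter_(objs, concept):
    return [o for o in objs if matches(o, concept)]


def count(objs):
    return len(objs)


# Primitive operations that a procedural semantic representation may call.
PRIMITIVES = {"filter": filter_, "count": count}


def execute(program, scene):
    """Run a linear program: each step consumes the previous step's result."""
    result = scene
    for name, *args in program:
        result = PRIMITIVES[name](result, *args)
    return result


scene = [
    Obj((0.75, 0.15, 0.10), 7.6),  # reddish, cube-like
    Obj((0.20, 0.20, 0.90), 8.1),  # bluish, cube-like
    Obj((0.90, 0.05, 0.10), 0.3),  # reddish, sphere-like
]

# Hand-written program for "How many red cubes are there?"
program = [("filter", "cube"), ("filter", "red"), ("count",)]
print(execute(program, scene))
```

Because every intermediate result is a plain list of objects and every concept is an inspectable prototype, each step of the answer can be traced, which is the sense of interpretability the sketch is meant to illustrate.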


Corresponding author: Liesbet De Vos, Faculté d’informatique, Université de Namur, Namur, Belgium
Paul Van Eecke and Katrien Beuls are joint last authors.

Award Identifier / Grant number: 1SB6219N

Award Identifier / Grant number: 75929

Funding source: European Commission

Award Identifier / Grant number: 951846

Funding source: Waalse Gewest

Award Identifier / Grant number: ARIAC by DigitalWallonia4.ai

Received: 2023-01-17
Accepted: 2023-07-07
Published Online: 2024-03-15

© 2024 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 8.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/lingvan-2022-0054/html