Abstract
The co-acquisition of item-based constructions and syntactic categories has been extensively studied in the psycholinguistic literature (Pine & Lieven 1997; Tomasello 2003). However, the insights gained from these studies have so far been translated only to a limited extent into computational methodologies for learning construction grammars (Chang 2008; Gerasymova & Spranger 2010; Gaspers et al. 2016; Van Eecke 2018). The research reported here aims to fill this gap by introducing a novel methodology for modelling the co-emergence of item-based constructions and syntactic categories based on semantically annotated corpora.
By generalising over similarities in the form and/or meaning of observed linguistic expressions, our methodology is able to learn compositional lexical and item-based constructions, along with a network of emergent syntactic categories (in the sense of Croft 2001) that models how the slots of the item-based constructions can be filled by the lexical constructions. We have implemented our methodology for Fluid Construction Grammar (Steels 2011; see https://www.fcg-net.org) and have applied it to a subset of the CLEVR benchmark dataset (Johnson 2017). The evaluation results show that the methodology allows for fast, incremental and effective learning (> 90% accuracy after 500 examples, 100% after 800 examples). The constructions and categorial network that result from the learning process are fully transparent and bidirectional, facilitating both language comprehension and production.
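As a rough illustration of the kind of generalisation described above, the following toy sketch derives one item-based construction, two lexical constructions and the categorial links between them from two observations that differ in a single word. It is not the authors' Fluid Construction Grammar implementation: the function names (`generalise`, `comprehend`, `produce`), the `?X` slot marker, the single-slot restriction and the string-encoded meaning predicates are all assumptions made for this example, and the utterances are only CLEVR-like.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple

Form = Tuple[str, ...]
Meaning = FrozenSet[str]
Observation = Tuple[Form, Meaning]
SLOT = "?X"  # hypothetical marker for the open slot of an item-based construction


def generalise(obs_a: Observation, obs_b: Observation):
    """Generalise over two observations whose forms differ in exactly one word.

    Returns (item_based, lexical, links), or None if no such generalisation exists.
    """
    (form_a, meaning_a), (form_b, meaning_b) = obs_a, obs_b
    if len(form_a) != len(form_b):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(form_a, form_b)) if x != y]
    if len(diffs) != 1:
        return None
    i = diffs[0]
    shared = meaning_a & meaning_b                    # meaning of the shared frame
    pattern = form_a[:i] + (SLOT,) + form_a[i + 1:]   # form with one open slot
    lexical: Dict[str, Meaning] = {
        form_a[i]: meaning_a - shared,                # what each filler contributes
        form_b[i]: meaning_b - shared,
    }
    # Toy categorial network: which lexical item may fill which slot.
    links: Set[Tuple[str, str]] = {(word, " ".join(pattern)) for word in lexical}
    return (pattern, shared), lexical, links


def comprehend(form: Form, item_based, lexical, links) -> Optional[Meaning]:
    """Map a form to a meaning using the learned constructions."""
    pattern, shared = item_based
    if len(form) != len(pattern):
        return None
    i = pattern.index(SLOT)
    filler = form[i]
    if form[:i] + (SLOT,) + form[i + 1:] != pattern:
        return None
    if (filler, " ".join(pattern)) not in links:
        return None
    return shared | lexical[filler]


def produce(meaning: Meaning, item_based, lexical, links) -> Optional[Form]:
    """Map a meaning back to a form using the same constructions."""
    pattern, shared = item_based
    for word, word_meaning in lexical.items():
        linked = (word, " ".join(pattern)) in links
        if linked and shared | word_meaning == meaning:
            return tuple(word if tok == SLOT else tok for tok in pattern)
    return None


obs_cube = (("what", "color", "is", "the", "cube"),
            frozenset({"query(color)", "unique", "filter(cube)"}))
obs_sphere = (("what", "color", "is", "the", "sphere"),
              frozenset({"query(color)", "unique", "filter(sphere)"}))

item_based, lexical, links = generalise(obs_cube, obs_sphere)
```

The same learned structures are used in both directions here: `comprehend` maps a form to a meaning and `produce` maps a meaning back to a form, which loosely mirrors the bidirectionality mentioned in the abstract. A filler that has no link to the slot (for instance an unseen word) is rejected instead of guessed.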
The research presented here is relevant from both a theoretical and a practical perspective. On the theoretical side, it provides a precise model of how a fully operational construction grammar consisting of lexical and item-based constructions can be bootstrapped from raw observations. Moreover, it offers insight into the compositional and non-compositional aspects of language, and into why constructions are an excellent means of capturing them. On the practical side, the techniques introduced here pave the way for learning computationally tractable, usage-based construction grammars that facilitate both language comprehension and production. Such systems are valuable for a wide range of application domains, including conversational agents, intelligent tutoring systems and question answering systems.
Original language | English |
---|---|
Publication status | Published - 16 Aug 2021 |
Event | 11th International Conference on Construction Grammar, Antwerp, Belgium. Duration: 18 Aug 2021 → 20 Aug 2021. https://www.uantwerpen.be/en/conferences/construction-grammars/ |
Conference
Conference | 11th International Conference on Construction Grammar |
---|---|
Abbreviated title | ICCG11 |
Country/Territory | Belgium |
City | Antwerp |
Period | 18/08/21 → 20/08/21 |
Internet address | https://www.uantwerpen.be/en/conferences/construction-grammars/ |
Keywords
- Learning Constructions
- Emergent Categories
- Computational Construction Grammar
- Fluid Construction Grammar
- Artificial Intelligence