Abstract
In this paper, we investigate how the prediction paradigm from machine learning and Natural Language Processing (NLP) can be put to use in computational historical linguistics. We propose word prediction as an intermediate task, where the forms of unseen words in some target language are predicted from the forms of the corresponding words in a source language. Word prediction allows us to develop algorithms for phylogenetic tree reconstruction, sound correspondence identification and cognate detection, in ways close to attested methods for linguistic reconstruction. We will discuss different factors, such as data representation and the choice of machine learning model, that have to be taken into account when applying prediction methods in historical linguistics. We present our own implementations and evaluate them on different tasks in historical linguistics.
Original language | English |
---|---|
Pages (from-to) | 295–336 |
Number of pages | 42 |
Journal | Journal of Language Modelling |
Volume | 8 |
Issue number | 2 |
DOIs | |
Publication status | Published - 4 Feb 2021 |
Event | Phylogenetic Methods in Historical Linguistics, Tübingen University, Tübingen, Germany; duration: 27 Mar 2017 → 30 Mar 2017 |
Keywords
- computational historical linguistics
- machine learning
- deep learning
Projects linked to 'Word prediction in computational historical linguistics'
-
VLAAI1: Grant: Flanders Artificial Intelligence Research Programme (Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen)
1/07/19 → 31/12/23
Project: Applied
-
FWOTM1012: Identifying drivers of language change using neural agent-based models.
1/11/20 → 31/10/22
Project: Fundamental
Prizes
-
FWO predoctoral fellowship fundamental research: Identifying drivers of language change using neural agent-based models
Dekker, Peter (Recipient), 8 Oct 2020
Prize: Fellowship awarded competitively