Abstract
In this contribution, we present the Historical Corpus of Dutch (HCD), a new multi-genre, diachronic corpus of Early and Late Modern Dutch (ca.1550-1850). It consists of a digitised collection of handwritten administrative texts (e.g. town council meeting reports), handwritten ego-documents (e.g. diaries and travelogues), and printed pamphlets (e.g. of a political or religious nature). The corpus is also balanced between northern and southern material, with data from the provinces of Holland and Zeeland for the North, and from Flanders and Brabant for the South. After having discussed its structure and composition, we will illustrate the value of the new corpus with a number of smaller case studies. Based on our experiences with the corpus, we will conclude by launching a plea for historical corpus building not to focus too much on the quantity of data (‘big data’), but rather shift attention to data quality.
Original language | English |
---|---|
Pages (from-to) | 114–132 |
Number of pages | 19 |
Journal | Taal en Tongval |
Volume | 75 |
Issue number | 1 |
DOIs | |
Publication status | Published - 2023 |
Keywords
- historical corpus building
- history of Dutch
- corpus linguistics
- northern and southern Dutch
- spelling of long a
- d- and w-forms
Fingerprint
Dive into the research topics of 'Historical Corpus of Dutch: A new multi-genre corpus of Early and Late Modern Dutch'. Together they form a unique fingerprint.
Projects
- 3 Finished
- FWOAL959: Setting the standard: Norms and usage in Early and Late Modern Dutch (1550-1850)
  1/01/20 → 31/12/23
  Project: Fundamental
- SRP67: SRP-Groeifinanciering: Historical sociolinguistics: towards a new history of Dutch in Flanders
  Vosters, R. & Vandenbussche, W.
  1/03/19 → 29/02/24
  Project: Fundamental
- FWOTM866: Pluricentricity in language history. Building blocks for an integrated history of Dutch (1550-1850)
  Vosters, R., Vandenbussche, W. & Van de Voorde, I.
  1/10/17 → 30/09/21
  Project: Fundamental
Datasets
- Word count data - Historical Corpus of Dutch (HCD)
  Van de Voorde, I. (Creator), Rutten, G. (Supervisor), Vosters, R. (Supervisor), van der Wal, M. (Supervisor) & Vandenbussche, W. (Supervisor), Zenodo, 2025
  DOI: https://doi.org/10.5281/zenodo.14942915
  Dataset