In this contribution, we present the Historical Corpus of Dutch (HCD), a new multi-genre, diachronic corpus of Early and Late Modern Dutch (ca. 1550–1850). It consists of a digitised collection of handwritten administrative texts (e.g. town council meeting reports), handwritten ego-documents (e.g. diaries and travelogues), and printed pamphlets (e.g. of a political or religious nature). The corpus is also balanced between northern and southern material, with data from the provinces of Holland and Zeeland for the North, and from Flanders and Brabant for the South. After discussing its structure and composition, we illustrate the value of the new corpus with a number of smaller case studies. Based on our experiences with the corpus, we conclude with a plea for historical corpus building to focus less on the quantity of data (‘big data’) and to shift attention to data quality.

Original language: English
Pages (from-to): 114–132
Number of pages: 19
Journal: Taal en Tongval
Journal issue number: 1
Status: Published, 2023


Title: Historical Corpus of Dutch: A new multi-genre corpus of Early and Late Modern Dutch
