imec-ETRO-VUB at W-NUT 2020 Shared Task-3: A Multilabel BERT-based system for predicting COVID-19 events

Research output: Chapter in Book/Report/Conference proceeding › Conference paper

Abstract

In this paper, we present our system for the W-NUT 2020 shared task on COVID-19 Event Extraction from Twitter. To mitigate the noisy nature of the Twitter stream, our system uses COVID-Twitter-BERT (CT-BERT), a language model pre-trained on a large corpus of COVID-19 related Twitter messages. Our system is trained on the COVID-19 Twitter Event Corpus and identifies relevant text spans that answer pre-defined questions (i.e., slot types) for five COVID-19 related events (i.e., TESTED POSITIVE, TESTED NEGATIVE, CAN-NOT-TEST, DEATH and CURE & PREVENTION). We experimented with different architectures; our best performing model places a multilabel classifier on top of the CT-BERT model and trains all the slot types of a single event jointly. Our experimental results indicate that our Multilabel-CT-BERT system outperforms the baseline methods by 7 percentage points in terms of micro-averaged F1 score. Our model ranked 4th on the shared task leaderboard.
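
As a rough illustration of the two ideas named in the abstract (one multilabel head that scores all slot types of an event together, and micro-averaged F1 as the metric), the following is a minimal sketch in plain Python. It is not the authors' implementation: the slot names, the toy feature vector standing in for a CT-BERT encoding, and the weights are all hypothetical and chosen only to make the example runnable.

```python
import math

# Hypothetical slot types for one event; the real inventory comes from the corpus.
SLOTS = ["who", "where", "when"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multilabel_predict(features, weights, biases, threshold=0.5):
    """Score every slot type from one shared feature vector.

    `features` stands in for an encoder output (an assumption here);
    each slot gets its own sigmoid, so several slots can fire at once.
    """
    preds = []
    for w, b in zip(weights, biases):
        logit = sum(f * wi for f, wi in zip(features, w)) + b
        preds.append(1 if sigmoid(logit) >= threshold else 0)
    return preds

def micro_f1(gold, pred):
    """Micro-averaged F1: pool TP/FP/FN over all slots and examples."""
    tp = fp = fn = 0
    for g_row, p_row in zip(gold, pred):
        for g, p in zip(g_row, p_row):
            if g == 1 and p == 1:
                tp += 1
            elif g == 0 and p == 1:
                fp += 1
            elif g == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy usage with made-up numbers.
toy_weights = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0]]
toy_biases = [0.0, 0.0, 0.0]
slot_flags = multilabel_predict([1.0, 0.0], toy_weights, toy_biases)
score = micro_f1([[1, 0, 1], [0, 1, 0]], [[1, 1, 0], [0, 1, 0]])
```

Because micro-averaging pools the counts before computing the ratio, frequent slot types weigh more heavily in the score than rare ones.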
Original language: English
Title of host publication: Conference on Empirical Methods in Natural Language Processing (and forerunners) (2020)
Publisher: Association for Computational Linguistics
Pages: 505-513
Number of pages: 9
Volume: Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Publication status: Published - 16 Nov 2020
Event: 2020 The 6th Workshop on Noisy User-generated Text (W-NUT) - Online
Duration: 19 Nov 2020 → …
http://noisy-text.github.io/2020/

