Using LLMs to Build a Database of Climate Extreme Impacts

Ni Li, Shorouq Zahra, Mariana Madruga de Brito, Clare Marie Flynn, Olof Görnerup, Koffi Worou, Murathan Kurfalı, Chanjuan Meng, Wim Thiery, Jakob Zscheischler, Gabriele Messori, Joakim Nivre

Research output: Chapter in Book/Report/Conference proceeding › Conference paper

Abstract

To better understand how extreme climate events impact society, we need to increase the availability of accurate and comprehensive information about these impacts. We propose a method for building large-scale databases of climate extreme impacts from online textual sources, using LLMs for information extraction in combination with more traditional NLP techniques to improve accuracy and consistency. We evaluate the method against a small benchmark database created by human experts and find that extraction accuracy varies for different types of information. We compare three different LLMs and find that, while the commercial GPT-4 model gives the best performance overall, the open-source models Mistral and Mixtral are competitive for some types of information.
Original language: English
Title of host publication: Association for Computational Linguistics
Pages: 93–110
Number of pages: 8
Volume: Proceedings of the 1st Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2024)
Publication status: Published - Aug 2024
