When social meaning meets NLP: How can NLP models inform sociolinguistic research and vice versa?

Dong Nguyen, Laura Rosseel

Research output: Unpublished contribution to conference › Unpublished abstract

Abstract

Research in Natural Language Processing (NLP) has been marked by substantial developments in the area of deep learning. These approaches automatically learn to represent words, sentences, and documents as dense, continuous representations (i.e. embeddings). So far, studies analyzing deep neural network models and their resulting representations in NLP have primarily focused on semantic and syntactic aspects of language. Social meaning, unfortunately, has hitherto been largely overlooked. Considering this type of meaning can enrich NLP models and offer new possibilities for sociolinguistic research. In this paper, we illustrate the potential of NLP for sociolinguistics and vice versa by focusing on the social meanings of spelling variation (Sebba 2007).
First, we reflect on current NLP developments and why social meaning is important to consider. Second, we draw on methods to analyze societal biases in NLP models (e.g., Caliskan et al. 2017; May et al. 2019) in order to investigate whether popular NLP models encode social meanings associated with different types of spelling variation. We consider both static word embedding models, e.g., the skipgram model (Mikolov et al. 2013), and popular pre-trained models, e.g., BERT (Devlin et al. 2019).
For example, does a skipgram model associate forms with g-dropping (e.g., doin) or lengthening (e.g., coooool) more strongly with a particular gender or social attribute? Or, what does a pre-trained model like BERT predict about an author based on a tweet with or without a specific type of spelling variation?
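The first question above can be made concrete with an association score in the style of the embedding association tests of Caliskan et al. (2017): the mean cosine similarity of a word to one attribute set minus its mean similarity to another. The following is a minimal sketch, not the authors' implementation. The function names, the attribute placeholders, and all vectors are hypothetical and invented purely for illustration; they say nothing about what a real skipgram model encodes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word_vec, attrs_a, attrs_b):
    """Mean cosine similarity to attribute set A minus mean similarity to set B."""
    mean_a = sum(cosine(word_vec, a) for a in attrs_a) / len(attrs_a)
    mean_b = sum(cosine(word_vec, b) for b in attrs_b) / len(attrs_b)
    return mean_a - mean_b


# Toy 3-dimensional vectors, made up for this sketch (ASSUMPTION: in practice
# these would be looked up in a trained skipgram model).
toy_embeddings = {
    "doing": [0.9, 0.1, 0.2],    # standard spelling
    "doin": [0.7, 0.6, 0.1],     # g-dropped spelling
    "attr_a1": [0.1, 1.0, 0.0],  # placeholder words for social attribute A
    "attr_a2": [0.2, 0.9, 0.1],
    "attr_b1": [1.0, 0.0, 0.1],  # placeholder words for social attribute B
    "attr_b2": [0.9, 0.1, 0.0],
}

attrs_a = [toy_embeddings["attr_a1"], toy_embeddings["attr_a2"]]
attrs_b = [toy_embeddings["attr_b1"], toy_embeddings["attr_b2"]]

score_standard = association(toy_embeddings["doing"], attrs_a, attrs_b)
score_variant = association(toy_embeddings["doin"], attrs_a, attrs_b)

# A positive gap means the variant spelling leans more toward attribute A
# than its standard counterpart does (in these toy vectors only).
gap = score_variant - score_standard
print(f"standard: {score_standard:.3f}  variant: {score_variant:.3f}  gap: {gap:.3f}")
```

Comparing each variant with its standard counterpart, rather than reading the raw score, helps separate the contribution of the spelling from that of the word itself.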
Original language: English
Publication status: Published - 2022
Event: Sociolinguistics Symposium 24 - Universiteit Gent, Gent, Belgium
Duration: 13 Jul 2022 to 16 Jul 2022
https://ss24ghent.be

Conference

Conference: Sociolinguistics Symposium 24
Country/Territory: Belgium
City: Gent
Period: 13/07/22 to 16/07/22

Keywords

  • spelling variation
  • social meaning of language variation
  • NLP
  • sociolinguistics
  • language attitudes
  • English variation
