Journal article

Lexical ambiguity in contextualized word embeddings: A case study of nominalizations

BLE-BLL

  • 2024
Published in:
  • Lingue e linguaggio. - Società Editrice Il Mulino. - 2024, no. 1/2024, p. 141-182
English
In this paper, we investigate the extent to which contextualized word embeddings can encode lexical ambiguity. Specifically, we focus on nominalizations in French, which constitute an interesting case for the study of ambiguity because of their frequent polysemy and their relationship with polyfunctional morphological processes. Given a random sample of occurrences of 90 nouns, we compute for each word the pairwise cosine similarity (SelfSim) among its token embeddings, extracted from the pre-trained model FlauBERT, and we test it as a predictor of the degree of ambiguity of nominalizations. For the evaluation, we use a manual annotation of lexical ambiguity and test different annotation strategies: defining word senses with different semantic classifications and granularities, and annotating lexemes in isolation or based on a sample of tokens. Our findings contribute to the understanding of (i) the lexical semantic component of contextual embeddings, enhancing their interpretability, and (ii) aspects of lexical ambiguity related to derivational semantics and to the contextual variation of meaning.
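
The SelfSim measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes that SelfSim for a word is the mean cosine similarity over all distinct pairs of its token embeddings; the record does not spell out the exact aggregation, so this is an assumption. The function names `cosine` and `self_sim` are hypothetical, and the toy two-dimensional vectors stand in for real FlauBERT embeddings, whose extraction is not shown here.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def self_sim(token_embeddings):
    """Mean pairwise cosine similarity over all distinct pairs of occurrences.

    Assumption: SelfSim is aggregated as a plain mean over unordered pairs.
    """
    pairs = list(combinations(token_embeddings, 2))
    if not pairs:
        raise ValueError("need at least two occurrences of the word")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy stand-ins for the token embeddings of two nouns (not real model output).
tight = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]   # occurrences point the same way
spread = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # occurrences vary by context

print(self_sim(tight))
print(self_sim(spread))
```

Under this reading, a word whose occurrences yield more varied embeddings receives a lower SelfSim, which is the quantity the study tests as a predictor of ambiguity.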
Faculty
Faculté des lettres et des sciences humaines
Department
Département de français
Language
  • English
Classification
Language, linguistics
License
CC BY-NC-ND
Open access status
green
Identifiers
Persistent URL
https://folia.unifr.ch/unifr/documents/329233