Zero-shot learning for multilingual discourse relation classification

Classifying discourse relations is a hard task: discourse-annotated data is scarce, especially for languages other than English, and competing theoretical frameworks differ in which textual spans are linked and which label set is used. As a result, work on transfer between languages is very limited, and work on transfer between frameworks even more so, although such transfer could improve our understanding of some theoretical aspects and benefit many applications. In this paper, we present the first experiments on zero-shot learning for discourse relation classification and investigate several ways of combining source data, based on languages, frameworks, or similarity measures. We show how difficult transfer is for this task, and that the most impactful factor is label set divergence, where the notion of an underlying framework may conceal crucial disagreements.

Keywords

discourse analysis; discourse relations; natural language processing

Domains

Computation and Language [cs.CL]

Main file

CODI_2024_DOUBLE____Multilingual_discourse_relation_identification (1).pdf (241.37 KB)

Origin: Files produced by the author(s)

Contributor: Eleni Metheniti

https://hal.science/hal-04483805

Submitted on: Thursday, February 29, 2024, 14:17:54

Last modified on: Saturday, June 8, 2024, 03:22:43

Long-term archiving on: Thursday, May 30, 2024, 19:05:33

Dates and versions

hal-04483805, version 1 (29-02-2024)

hal-04483805, version 2 (06-06-2024)

Identifiers

HAL Id: hal-04483805, version 1

Cite

Eleni Metheniti, Philippe Muller, Chloé Braud, Margarita Hernández-Casas. Zero-shot learning for multilingual discourse relation classification. 5th Workshop on Computational Approaches to Discourse (CODI 2024), Mar 2024, St Julians, Malta. To appear. ⟨hal-04483805v1⟩
