Detecting the terminality of speech-turn boundary for spoken interactions in French TV and Radio content - Institut de Recherche et Coordination Acoustique/Musique
Conference paper. Year: 2024

Detecting the terminality of speech-turn boundary for spoken interactions in French TV and Radio content

Abstract

A Transition Relevance Place is the end of an utterance at which the interlocutor may take the floor without interrupting the current speaker, i.e., a place where the turn is terminal. Analyzing turn terminality is useful for studying the dynamics of turn-taking in spontaneous conversations. This paper presents an automatic classification of spoken utterances as Terminal or Non-Terminal in multi-speaker settings. We compare audio-based, text-based, and fused audio-text approaches on a French corpus of TV and radio extracts annotated with turn-terminality information at each speaker change. Our models are based on pretrained self-supervised representations. We report results for different fusion strategies and varying context sizes. The study also examines performance variability by analyzing how results differ across multiple training runs with random initialization. The measured accuracy suggests that these models could be used for large-scale analysis of turn-taking.
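The abstract does not say which fusion strategies were compared, so the following is only a generic sketch of two common options (early and late fusion), not the authors' architecture. All names, weights, and the toy two-dimensional vectors are hypothetical; in practice the embeddings would come from pretrained self-supervised audio and text encoders and the classifier weights would be learned.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear_score(features: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Dot product of features and weights, plus a bias term."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def early_fusion(audio_emb, text_emb, weights, bias=0.0) -> float:
    """Early fusion: concatenate both embeddings, then apply one classifier."""
    fused = list(audio_emb) + list(text_emb)
    return sigmoid(linear_score(fused, weights, bias))


def late_fusion(audio_emb, text_emb, audio_w, text_w, alpha=0.5) -> float:
    """Late fusion: score each modality separately, then mix the probabilities."""
    p_audio = sigmoid(linear_score(audio_emb, audio_w))
    p_text = sigmoid(linear_score(text_emb, text_w))
    return alpha * p_audio + (1.0 - alpha) * p_text


def label(p_terminal: float, threshold: float = 0.5) -> str:
    """Turn a probability into one of the two classes named in the abstract."""
    return "Terminal" if p_terminal >= threshold else "Non-Terminal"


# Hypothetical toy embeddings and weights, for illustration only.
audio_emb = [0.2, -0.1]
text_emb = [0.5, 0.3]
p_early = early_fusion(audio_emb, text_emb, weights=[1.0, 0.5, 2.0, -1.0])
p_late = late_fusion(audio_emb, text_emb, audio_w=[1.0, 0.5], text_w=[2.0, -1.0])
print(label(p_early), label(p_late))
```

Early fusion lets a single classifier model interactions between the two modalities, while late fusion keeps the per-modality models independent and only mixes their outputs; which works better is an empirical question.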
Main file
uro24_interspeech.pdf (228.2 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-04694968, version 1 (11-09-2024)

Identifiers

Cite

Rémi Uro, Marie Tahon, David Doukhan, Antoine Laurent, Albert Rilliard. Detecting the terminality of speech-turn boundary for spoken interactions in French TV and Radio content. Interspeech 2024, Itshak Lapidot; Sharon Gannot, Sep 2024, Kos, Greece. pp. 3560-3564, ⟨10.21437/interspeech.2024-1163⟩. ⟨hal-04694968⟩
