Beyond Static Emotions: Leveraging Multitask Learning to Model Dynamics of Dimensional Affect in Speech
Abstract
Dimensional affect prediction from speech has traditionally relied on acoustic features to estimate continuous affect representations (e.g., arousal, valence) at each time step. However, affect evolves dynamically over time, and incorporating temporal information may improve prediction accuracy. This study investigates emotional dynamics in speech emotion recognition using multitask learning, in which a model jointly predicts both the affect state and its temporal derivative. Experiments on the RECOLA and SEWA datasets show that incorporating dynamic information improves affect state prediction, particularly for valence, which is known to be challenging to model from audio alone. While concordance correlation coefficient (CCC) scores for affect dynamics predictions remain lower than those for affect state predictions, the results indicate that learning dynamics as an auxiliary task enhances affect state estimation over time. These findings underscore the importance of modelling emotional dynamics to capture the temporal evolution of affect.
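To make the multitask setup concrete, the sketch below shows one plausible realisation in PyTorch: a shared recurrent encoder with two output heads, one for the affect state and one for its first-order temporal derivative, trained with a CCC-based loss. This is a minimal illustration under stated assumptions, not the authors' implementation: the GRU encoder, the finite-difference derivative targets, the head names, and the weighting `alpha` are all hypothetical choices.

```python
# Minimal sketch of a multitask affect model (illustrative, not the paper's code).
import torch
import torch.nn as nn

def ccc(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Concordance correlation coefficient between two flattened sequences."""
    pred_mean, target_mean = pred.mean(), target.mean()
    pred_var = pred.var(unbiased=False)
    target_var = target.var(unbiased=False)
    cov = ((pred - pred_mean) * (target - target_mean)).mean()
    return 2 * cov / (pred_var + target_var + (pred_mean - target_mean) ** 2)

class MultitaskAffectModel(nn.Module):
    """Shared GRU encoder with two heads: affect state and its temporal
    derivative (dynamics). Architecture is an assumption for illustration."""
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.state_head = nn.Linear(hidden, 1)     # e.g. valence per frame
        self.dynamics_head = nn.Linear(hidden, 1)  # frame-to-frame change

    def forward(self, x):  # x: (batch, time, n_features) acoustic features
        h, _ = self.encoder(x)
        return self.state_head(h).squeeze(-1), self.dynamics_head(h).squeeze(-1)

def multitask_loss(state_pred, dyn_pred, state_true, alpha=0.5):
    """Joint 1 - CCC objective; dynamics targets are finite differences
    of the gold annotations (a common but here hypothetical choice)."""
    dyn_true = state_true[:, 1:] - state_true[:, :-1]
    loss_state = 1 - ccc(state_pred.reshape(-1), state_true.reshape(-1))
    loss_dyn = 1 - ccc(dyn_pred[:, 1:].reshape(-1), dyn_true.reshape(-1))
    return (1 - alpha) * loss_state + alpha * loss_dyn
```

In this framing, the dynamics head acts purely as an auxiliary task: only the state head's output is used at evaluation time, while the derivative objective shapes the shared encoder's temporal representations.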
