Conference paper, Year: 2026

SIRUP: A DIFFUSION-BASED VIRTUAL UPMIXER OF STEERING VECTORS FOR HIGHLY-DIRECTIVE SPATIALIZATION WITH FIRST-ORDER AMBISONICS

Abstract

This paper presents virtual upmixing of steering vectors captured by a spherical microphone array with few channels. This problem has conventionally been addressed by recovering the directions and signals of sound sources from first-order ambisonics (FOA) data and then rendering higher-order ambisonics (HOA) data with a physics-based acoustic simulator. This approach, however, struggles to handle the mutual dependency between the spatial directivity of source estimation and the spatial resolution of the FOA data. Our method, named SIRUP, employs a latent diffusion model architecture. Specifically, a variational autoencoder (VAE) learns a compact encoding of the HOA data in a latent space, and a diffusion model is then trained to generate the HOA embeddings conditioned on the FOA data. Experimental results show that SIRUP achieves a significant improvement over FOA systems in steering vector upmixing, source localization, and speech denoising.
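
The second stage described in the abstract, a diffusion model trained on HOA latents and conditioned on FOA data, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard DDPM-style linear noise schedule and an epsilon-prediction loss, and the names `make_alpha_bar`, `q_sample`, `training_loss`, `z0`, `foa_cond`, and `zero_denoiser` are all hypothetical. The VAE encoder and the denoising network are replaced by placeholder values so that the example stays self-contained.

```python
import math
import random


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def q_sample(z0, eps, abar_t):
    """Forward diffusion: z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps."""
    a, b = math.sqrt(abar_t), math.sqrt(1.0 - abar_t)
    return [a * z + b * e for z, e in zip(z0, eps)]


def training_loss(denoiser, z0, foa_cond, alpha_bar, rng):
    """One epsilon-prediction loss term; the denoiser also sees the FOA condition."""
    t = rng.randrange(len(alpha_bar))          # random diffusion step
    eps = [rng.gauss(0.0, 1.0) for _ in z0]    # Gaussian noise to be predicted
    z_t = q_sample(z0, eps, alpha_bar[t])      # noised HOA latent
    eps_hat = denoiser(z_t, t, foa_cond)       # conditional noise estimate
    return sum((e - h) ** 2 for e, h in zip(eps, eps_hat)) / len(z0)


# Toy usage with placeholder data (all values are hypothetical).
rng = random.Random(0)
alpha_bar = make_alpha_bar(1000)
z0 = [0.5, -1.0, 0.25, 2.0]        # stands in for a VAE latent of HOA data
foa_cond = [1.0, 0.0, 0.0, 1.0]    # stands in for FOA conditioning features


def zero_denoiser(z_t, t, cond):
    """Untrained placeholder for a neural denoiser: always predicts zero noise."""
    return [0.0] * len(z_t)


loss = training_loss(zero_denoiser, z0, foa_cond, alpha_bar, rng)
```

In a real system the placeholder denoiser would be a trained network, and generation would run the reverse process from pure noise before decoding the resulting latent with the VAE decoder; both steps are omitted here.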

Main file
2026_ICASSP_Emilio.pdf (948 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-05516730 , version 1 (18-02-2026)

Cite

Emilio Picard, Diego Di Carlo, Aditya Arie Nugraha, Mathieu Fontaine, Kazuyoshi Yoshii. SIRUP: A DIFFUSION-BASED VIRTUAL UPMIXER OF STEERING VECTORS FOR HIGHLY-DIRECTIVE SPATIALIZATION WITH FIRST-ORDER AMBISONICS. ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026, Barcelona, Spain. pp.14707-14711, ⟨10.1109/ICASSP55912.2026.11464234⟩. ⟨hal-05516730⟩