Preprint, Working paper

Human beatbox sound recognition using an automatic speech recognition toolkit

Abstract: Human beatboxing is a vocal art that uses the speech organs to produce percussive sounds and imitate musical instruments. Beatbox sound classification is an open challenge with applications in automatic database annotation and music information retrieval. In this study, a human-beatbox sound recognition system was developed by adapting the Kaldi toolkit, which is already widely used for automatic speech recognition. The corpus consisted of eighty boxemes, which were recorded repeatedly by two beatboxers. The sounds were annotated and transcribed for the system using a beatbox-specific pictographic writing system (Vocal Grammatics). The robustness of the recognition system to recording conditions was assessed on recordings made with six different microphones and settings. Decoding was performed with monophone acoustic models trained with a classical HMM-GMM approach. Several parameters of the system were tested: (i) the number of HMM states, (ii) the number of MFCCs, (iii) the presence or absence of a pause boxeme in the right and left contexts in the lexicon, and (iv) the silence probability. The best model was obtained by adding a pause in the left and right contexts of each boxeme in the lexicon, with a silence probability of 0.8, 22 MFCCs, and three-state HMMs. In this configuration, the boxeme error rate was lowered to 15.13%.
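The abstract reports a boxeme error rate but does not spell out how it is computed. The sketch below assumes it is defined like the word error rate used in speech recognition: the total edit distance (substitutions, deletions, insertions) between reference and recognized boxeme sequences, divided by the number of reference boxemes. The function names and the boxeme labels in the usage note are hypothetical and are not taken from the paper or from Kaldi.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of boxeme labels."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference boxeme
                curr[j - 1] + 1,     # insertion of an extra boxeme
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def boxeme_error_rate(references, hypotheses):
    """Error rate in percent over paired reference/hypothesis sequences.

    Assumption: same definition as word error rate, with boxemes
    playing the role of words.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must be paired")
    total_ref = sum(len(r) for r in references)
    if total_ref == 0:
        raise ValueError("references contain no boxemes")
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * errors / total_ref
```

For instance, with the hypothetical references `[["kick", "hihat", "snare", "hihat"], ["kick", "kick", "snare"]]` and recognized sequences `[["kick", "hihat", "snare"], ["kick", "snare", "snare"]]`, there is one deletion and one substitution over seven reference boxemes, so the function returns 200/7, roughly 28.57%.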

Cited literature [15 references]
Contributor: Benjamin Lecouteux
Submitted on: Friday, July 10, 2020 - 16:56:23
Last modified on: Tuesday, November 24, 2020 - 16:00:18
Long-term archiving on: Tuesday, December 1, 2020 - 21:07:02


Files produced by the author(s)


  • HAL Id: hal-02896690, version 1


Solène Evain, Benjamin Lecouteux, Didier Schwab, Adrien Contesse, Antoine Pinchaud, et al. Human beatbox sound recognition using an automatic speech recognition toolkit. 2020. ⟨hal-02896690⟩


