Preprint, Working paper

Human beatbox sound recognition using an automatic speech recognition toolkit

Abstract: Human beatboxing is a vocal art that uses the speech organs to produce percussive sounds and imitate musical instruments. Beatbox sound classification is a current challenge, with applications in automatic database annotation and music-information retrieval. In this study, a human-beatbox sound recognition system was developed by adapting the Kaldi toolkit, which is already widely used for automatic speech recognition. The corpus consisted of eighty boxemes, recorded repeatedly by two beatboxers. The sounds were annotated and transcribed for the system using a beatbox-specific pictographic writing system (Vocal Grammatics). The robustness of the recognition system to recording conditions was assessed on recordings made with six different microphones and settings. Decoding used monophone acoustic models trained with a classical HMM-GMM approach. Several parameters of the system were tested: i) the number of HMM states, ii) the number of MFCCs, iii) the presence or absence of a pause boxeme in the right and left contexts in the lexicon, and iv) the silence probability. The best model was obtained by adding a pause in the left and right contexts of each boxeme in the lexicon, with a silence probability of 0.8, 22 MFCCs, and three-state HMMs. In this configuration, the boxeme error rate was lowered to 15.13%.
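The abstract reports a boxeme error rate but does not define it. As a minimal sketch, assuming it is computed like the word error rate in speech recognition (substitutions, deletions and insertions divided by the number of reference boxemes), it could be calculated as follows. The function names and the boxeme labels are hypothetical and are not taken from the paper or from Kaldi.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two label sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def boxeme_error_rate(references: List[Sequence[str]],
                      hypotheses: List[Sequence[str]]) -> float:
    """Error rate in percent, assuming a WER-style definition (an assumption)."""
    total_errors = sum(edit_distance(r, h)
                       for r, h in zip(references, hypotheses))
    total_ref = sum(len(r) for r in references)
    if total_ref == 0:
        raise ValueError("reference transcripts contain no boxemes")
    return 100.0 * total_errors / total_ref


# Hypothetical boxeme labels, for illustration only.
refs = [["kick", "hihat", "snare", "kick"]]
hyps = [["kick", "snare", "kick"]]
print(boxeme_error_rate(refs, hyps))
```

In this toy example the recognizer misses one of four reference boxemes, so one error is counted against four reference labels.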


https://hal.univ-grenoble-alpes.fr/hal-02896690
Contributor: Benjamin Lecouteux
Submitted on: Friday, 10 July 2020 - 16:56:23
Last modified on: Thursday, 24 September 2020 - 09:00:02

File

Revue_beatbox_v0.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-02896690, version 1

Collections

LIG | UGA | CNRS | GIPSA

Citation

Solène Evain, Benjamin Lecouteux, Didier Schwab, Adrien Contesse, Antoine Pinchaud, et al. Human beatbox sound recognition using an automatic speech recognition toolkit. 2020. ⟨hal-02896690⟩
