Convolutional Time Delay Neural Network for Khmer Automatic Speech Recognition

Nalin Srun; Sotheara Leang; Ye Kyaw; Sethserey Sam

Conference paper, Year: 2022


Nalin Srun

Role: Author

Cambodia Academy of Digital Technology

Sotheara Leang

Role: Author

Cambodia Academy of Digital Technology

Multimodal Perception and Sociable Interaction

Ye Kyaw

Role: Author

Cambodia Academy of Digital Technology

Sethserey Sam

Role: Author

Cambodia Academy of Digital Technology

Abstract

Convolutional Neural Networks (CNNs) have been shown to capture spatial aspects of the speech signal and to reduce spectral variation across speakers in Automatic Speech Recognition. In this study, we investigate combining a CNN with a Time Delay Neural Network (TDNN) as the acoustic model for large vocabulary continuous speech recognition in Khmer. The idea is to use the CNN to extract local features of the speech signal, while the TDNN captures long-range temporal correlations between acoustic events. Experimental results show that the proposed network outperforms the TDNN, with an average relative improvement of 14% across test sets.
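The division of labour described in the abstract can be illustrated with a small calculation of temporal context. The sketch below is not the authors' configuration: the `Layer` class, the kernel sizes, and the dilations are hypothetical values chosen only to show how a few convolutional layers see a short window of frames, while dilated TDNN layers stacked on top widen that window considerably. The `relative_improvement` helper assumes the reported 14% is measured on an error rate, which the abstract does not state, and the numbers passed to it in the usage note are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    """One 1-D layer along the time axis (hypothetical configuration)."""
    kernel: int    # number of taps along time
    dilation: int  # spacing between taps, in frames


def receptive_field(layers: List[Layer]) -> int:
    """Input frames that influence one output frame, assuming stride 1."""
    frames = 1
    for layer in layers:
        frames += (layer.kernel - 1) * layer.dilation
    return frames


def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Reduction of an error rate, expressed as a fraction of the baseline."""
    return (baseline_error - new_error) / baseline_error


# Illustrative values only, not taken from the paper.
cnn_front_end = [Layer(kernel=3, dilation=1), Layer(kernel=3, dilation=1)]
tdnn_stack = [Layer(kernel=3, dilation=d) for d in (1, 3, 3, 6)]

cnn_context = receptive_field(cnn_front_end)                 # local context
full_context = receptive_field(cnn_front_end + tdnn_stack)   # long context
```

With these assumed settings, `cnn_context` is 5 frames and `full_context` is 31 frames, which is the sense in which the convolutional layers act locally and the TDNN layers supply long temporal context. As a reading aid for the reported figure, `relative_improvement(20.0, 17.2)` is approximately 0.14: a hypothetical baseline error rate of 20.0% falling to 17.2% would be a 14% relative improvement.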

Keywords

Khmer ASR; Time Delay Neural Network; Convolutional Neural Network; Low-resource Language

Domains

Computer Science [cs]; Artificial Intelligence [cs.AI]; Machine Learning [cs.LG]; Signal and Image Processing [eess.SP]

Main file

main.pdf (304.53 KB)

Origin: Files produced by the author(s)

Contributor: Sotheara Leang

https://hal.univ-grenoble-alpes.fr/hal-03865538

Submitted on: Tuesday, 22 November 2022, 12:58:19

Last modified on: Wednesday, 18 December 2024, 09:50:17

Dates and versions

hal-03865538 , version 1 (22-11-2022)

Identifiers

HAL Id: hal-03865538, version 1

Cite

Nalin Srun, Sotheara Leang, Ye Kyaw, Sethserey Sam. Convolutional Time Delay Neural Network for Khmer Automatic Speech Recognition. iSAI-NLP-AIoT 2022, Nov 2022, Chiang Mai, Thailand. ⟨hal-03865538⟩


Collections

UGA CNRS LIG LIG_SIC_M-PSI LIG_SIDCH

