Hybrid Deep Neural Network-Hidden Markov Model (DNN-HMM) Based Speech Emotion Recognition

Longfei Li, Yong Zhao, Dongmei Jiang, Yanning Zhang, Fengna Wang, Isabel Gonzalez, Valentin Enescu, Hichem Sahli

Research output: Chapter in Book/Report/Conference proceeding › Conference paper

100 Citations (Scopus)

Abstract

Deep Neural Network Hidden Markov Models (DNN-HMMs) have recently emerged as very promising acoustic models, achieving better speech recognition results than Gaussian mixture model based HMMs (GMM-HMMs). In this paper, for emotion recognition from speech, we investigate DNN-HMMs with restricted Boltzmann Machine (RBM) based unsupervised pre-training, and DNN-HMMs with discriminative pre-training. Emotion recognition experiments are carried out with these two models on the eNTERFACE'05 database and the Berlin database, and the results are compared with those from GMM-HMMs, shallow-NN-HMMs with two layers, and Multi-layer Perceptron HMMs (MLP-HMMs). Experimental results show that when the numbers of hidden layers and hidden units are properly set, the DNN can extend the labeling ability of the GMM-HMM. Among all the models, the DNN-HMMs with discriminative pre-training obtain the best results. For example, on the eNTERFACE'05 database, their recognition accuracy is 12.22% higher than that of the DNN-HMMs with unsupervised pre-training, 11.67% higher than that of the GMM-HMMs, 10.56% higher than that of the MLP-HMMs, and 17.22% higher than that of the shallow-NN-HMMs.
Original language: English
Title of host publication: 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction (ACII 2013)
Publisher: IEEE
Pages: 312-317
Number of pages: 6
ISBN (Print): 978-0-7695-5048-0
Publication status: Published - 2013
Event: 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction (ACII 2013) - Geneva, Switzerland
Duration: 2 Sep 2013 to 5 Sep 2013

Publication series

Name: International Conference on Affective Computing and Intelligent Interaction and Workshops
Publisher: IEEE
ISSN (Print): 2156-8103
ISSN (Electronic): 2156-8111

Conference

Conference: 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction (ACII 2013)
Country: Switzerland
City: Geneva
Period: 2/09/13 to 5/09/13

Keywords

  • emotion recognition
  • machine learning
  • dnn
