Speech driven photo realistic facial animation based on an articulatory DBN model and AAM features

Dongmei Jiang, Yong Zhao, Hichem Sahli, Yanning Zhang

Research output: Contribution to journal, Article

5 Citations (Scopus)

Abstract

This paper presents a photo-realistic facial animation synthesis approach based on an audio-visual articulatory dynamic Bayesian network model (AF_AVDBN), in which the maximum asynchronies between the articulatory features, such as lips, tongue, and glottis/velum, can be controlled. Perceptual Linear Prediction (PLP) features from audio speech, together with active appearance model (AAM) features from the face images of an audio-visual continuous speech database, are used to train the AF_AVDBN model parameters. Given an input audio speech signal, the trained model estimates the optimal AAM visual features under a maximum likelihood estimation (MLE) criterion, and these features are then used to construct the face images for the animation. In our experiments, facial animations are synthesized for 20 continuous audio speech sentences using the proposed AF_AVDBN model and two state-of-the-art methods: the audio-visual state synchronous DBN model (SS_DBN), which implements a multi-stream Hidden Markov Model, and the state asynchronous DBN model (SA_DBN). Objective evaluations of the learned AAM features show that the AF_AVDBN model learns much more accurate visual features. Subjective evaluations show that the facial animations synthesized with AF_AVDBN are better than those synthesized with the state-based SA_DBN and SS_DBN models in overall naturalness and in how accurately the mouth movements match the speech content.
Original language: English
Pages (from-to): 397-415
Journal: Multimedia Tools and Applications
Volume: 73
Issue number: 1
Publication status: Published - 2014

Keywords

  • facial animation
  • DBN
  • Asynchrony
  • AAM
