Acoustic modeling with hierarchical reservoirs

Fabian Triefenbach, Azarakhsh Jalalvand, Kris Demuynck, Jean Pierre Martens

Research output: Contribution to journal › Article › peer-review

62 Scopus citations

Abstract

Accurate acoustic modeling is an essential requirement of a state-of-the-art continuous speech recognizer. The Acoustic Model (AM) describes the relation between the observed speech signal and the non-observable sequence of phonetic units uttered by the speaker. Nowadays, most recognizers use Hidden Markov Models (HMMs) in combination with Gaussian Mixture Models (GMMs) to model the acoustics, but neural-based architectures are on the rise again. In this work, the recently introduced Reservoir Computing (RC) paradigm is used for acoustic modeling. A reservoir is a fixed, and thus non-trained, Recurrent Neural Network (RNN) that is combined with a trained linear model. This approach combines the ability of an RNN to model the recent past of the input sequence with a simple and reliable training procedure. It is shown here that simple reservoir-based AMs achieve reasonable phone recognition and that deep hierarchical and bi-directional reservoir architectures lead to a very competitive Phone Error Rate (PER) of 23.1% on the well-known TIMIT task.
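
The core idea in the abstract, a fixed random recurrent network followed by a trained linear model, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tanh nonlinearity, the ridge-regression readout, the hyperparameter values (reservoir size, spectral radius, input scale, regularization), and the toy delay task are all assumptions chosen for illustration, and the example does not involve speech data or the TIMIT setup.

```python
import numpy as np


def make_reservoir(n_in, n_res, spectral_radius=0.9, input_scale=0.5, seed=0):
    """Draw fixed random weights; these are never trained (illustrative values)."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-input_scale, input_scale, size=(n_res, n_in))
    W = rng.normal(size=(n_res, n_res))
    # Rescale the recurrent weights so their largest |eigenvalue| equals
    # spectral_radius, a common heuristic for a fading memory of past inputs.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W


def run_reservoir(W_in, W, inputs):
    """Drive the reservoir with a (T, n_in) sequence; return (T, n_res) states."""
    x = np.zeros(W.shape[0])
    states = np.zeros((len(inputs), W.shape[0]))
    for t, u in enumerate(inputs):
        x = np.tanh(W_in @ u + W @ x)  # assumed tanh state update
        states[t] = x
    return states


def train_readout(states, targets, ridge=1e-2):
    """Fit the linear readout (the only trained part) by ridge regression."""
    X = np.hstack([states, np.ones((len(states), 1))])  # append a bias column
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)


def apply_readout(W_out, states):
    """Apply the trained linear readout to reservoir states."""
    X = np.hstack([states, np.ones((len(states), 1))])
    return X @ W_out


def demo(T=500, n_res=50, seed=1):
    """Toy task (an assumption): reproduce the previous input sample."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-1.0, 1.0, size=(T, 1))
    target = np.roll(u, 1, axis=0)
    target[0] = 0.0  # no previous sample exists at t = 0
    W_in, W = make_reservoir(n_in=1, n_res=n_res)
    states = run_reservoir(W_in, W, u)
    W_out = train_readout(states, target)
    pred = apply_readout(W_out, states)
    mse = float(np.mean((pred - target) ** 2))
    return mse, float(np.var(target))
```

Calling `demo()` returns the training mean squared error of the readout together with the variance of the target. The readout only sees the current reservoir state, so any ability to recover the previous input comes from the recurrent state carrying information about the recent past, while training reduces to solving one regularized linear system.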

Original language: English (US)
Article number: 6587732
Pages (from-to): 2439-2450
Number of pages: 12
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 21
Issue number: 11
State: Published - 2013
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering

Keywords

  • Acoustic modeling
  • automatic speech recognition
  • recurrent neural networks
  • reservoir computing
