Abstract
Most automatic speech recognition systems employ Hidden Markov Models with Gaussian mixture emission distributions to model the acoustics. There have been several attempts however to challenge this approach, e.g. by introducing a neural network (NN) as an alternative acoustic model. Although the performance of these so-called hybrid systems is actually quite good, their training is often problematic and time consuming. By using a reservoir - this is a recurrent NN with only the out-put weights being trainable - we can overcome this disadvantage and yet obtain good accuracy. In this paper, we propose the first reservoir-based connected digit recognition system, and we demonstrate good performance on the Aurora-2 testbed. Since RC is a new technology, we anticipate that our present system is still sub-optimal, and further improvements are possible.
Original language | English (US) |
---|---|
Pages (from-to) | 1725-1728 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
State | Published - 2011 |
Externally published | Yes |
Event | 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy Duration: Aug 27 2011 → Aug 31 2011 |
All Science Journal Classification (ASJC) codes
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modeling and Simulation
Keywords
- Digits
- Reservoir computing
- Speech recognition