Training algorithms for hidden Markov models using entropy based distance functions

Yoram Singer, Manfred K. Warmuth

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

25 Scopus citations

Abstract

We present new algorithms for parameter estimation of HMMs. By adapting a framework used for supervised learning, we construct iterative algorithms that maximize the likelihood of the observations while also attempting to stay "close" to the current estimated parameters. We use a bound on the relative entropy between the two HMMs as a distance measure between them. The result is a set of new iterative training algorithms that are similar to the EM (Baum-Welch) algorithm for training HMMs. The proposed algorithms are composed of a step similar to the expectation step of Baum-Welch and a new update of the parameters which replaces the maximization (re-estimation) step. The algorithm takes only negligibly more time per iteration, and an approximated version uses the same expectation step as Baum-Welch. We experimentally evaluate the new algorithms on synthetic and natural speech pronunciation data. For sparse models, i.e., models with a relatively small number of non-zero parameters, the proposed algorithms require significantly fewer iterations.
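The trade-off described in the abstract (increase the likelihood while staying close, in relative entropy, to the current parameters) can be illustrated with a minimal sketch. This is not the paper's exact update rule. It assumes a single probability vector `theta` (for example, one row of a transition matrix), expected counts already produced by an expectation step, and a hypothetical learning rate `eta`. The function names are illustrative only.

```python
import math

def entropy_regularized_update(theta, expected_counts, eta=1.0):
    """Illustrative multiplicative update on one probability vector.

    Assumption (not from the source): we approximately maximize
    eta * LL(new) - KL(new || theta), with the log-likelihood gradient
    evaluated at the current theta and scaled by the total count.
    """
    total = sum(expected_counts)  # assumed to be positive
    log_w = []
    for p, n in zip(theta, expected_counts):
        if p > 0.0:
            # log of theta_i * exp(eta * gradient_i)
            log_w.append(math.log(p) + eta * n / (total * p))
        else:
            # zero parameters stay zero, so sparsity is preserved
            log_w.append(None)
    # subtract the largest exponent for numerical stability
    shift = max(v for v in log_w if v is not None)
    w = [0.0 if v is None else math.exp(v - shift) for v in log_w]
    z = sum(w)
    return [x / z for x in w]

def normalized_counts(expected_counts):
    """Baum-Welch-style re-estimation: normalize the expected counts."""
    total = sum(expected_counts)
    return [n / total for n in expected_counts]

theta = [0.5, 0.5, 0.0]
counts = [3.0, 1.0, 0.0]
print(entropy_regularized_update(theta, counts))
print(normalized_counts(counts))
```

In this sketch the normalized-count estimate is a fixed point of the multiplicative update, and `eta` controls how far a single step moves away from the current parameters.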

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 9 - Proceedings of the 1996 Conference, NIPS 1996
Publisher: Neural information processing systems foundation
Pages: 641-647
Number of pages: 7
ISBN (Print): 0262100657, 9780262100656
State: Published - Jan 1 1997
Externally published: Yes
Event: 10th Annual Conference on Neural Information Processing Systems, NIPS 1996 - Denver, CO, United States
Duration: Dec 2 1996 to Dec 5 1996

Publication series

Name: Advances in Neural Information Processing Systems
ISSN (Print): 1049-5258

Other

Other: 10th Annual Conference on Neural Information Processing Systems, NIPS 1996
Country/Territory: United States
City: Denver, CO
Period: 12/2/96 to 12/5/96

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
