### Abstract

We present new algorithms for parameter estimation of hidden Markov models (HMMs). By adapting a framework used for supervised learning, we construct iterative algorithms that maximize the likelihood of the observations while also attempting to stay "close" to the current estimated parameters, using a bound on the relative entropy between the two HMMs as the distance measure. The result is a family of new iterative training algorithms similar to the EM (Baum-Welch) algorithm for training HMMs. The proposed algorithms consist of a step similar to the expectation step of Baum-Welch and a new parameter update that replaces the maximization (re-estimation) step. Each iteration takes only negligibly more time than a Baum-Welch iteration, and an approximated version uses the same expectation step as Baum-Welch. We evaluate the new algorithms experimentally on synthetic data and on natural speech pronunciation data. For sparse models, i.e., models with a relatively small number of non-zero parameters, the proposed algorithms require significantly fewer iterations.
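To illustrate the contrast the abstract describes, the following sketch compares a standard Baum-Welch-style re-estimation step with an entropy-regularized multiplicative (exponentiated-gradient style) update for a single probability vector. This is a hypothetical simplification for a categorical parameter vector, not the paper's exact update for full HMMs; the learning rate `eta`, which controls how close the new parameters stay (in relative entropy) to the current ones, is an assumed knob.

```python
import numpy as np

def em_step(theta, counts):
    """Standard M-step: re-estimate by normalizing expected counts
    (the counts would come from a Baum-Welch-style expectation step)."""
    return counts / counts.sum()

def entropic_step(theta, counts, eta=0.5):
    """Entropy-regularized update (exponentiated-gradient style):
    multiply the current parameters by the exponentiated gradient of the
    expected log-likelihood (counts[i] / theta[i]) and renormalize.
    Smaller eta keeps the new parameters closer, in relative entropy,
    to the current estimate theta."""
    grad = counts / theta
    # Center the gradient by its mean under theta for numerical stability;
    # the shift cancels after normalization.
    new = theta * np.exp(eta * (grad - grad @ theta))
    return new / new.sum()

theta = np.array([0.5, 0.3, 0.2])
counts = np.array([10.0, 5.0, 1.0])
print(em_step(theta, counts))        # jumps straight to normalized counts
print(entropic_step(theta, counts))  # moves toward counts, but stays near theta
```

With `eta = 0` the entropic step leaves the parameters unchanged, and as `eta` grows it approaches the plain re-estimation step, which matches the abstract's trade-off between fitting the observations and staying close to the current model.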

Original language | English (US)
---|---
Title of host publication | Advances in Neural Information Processing Systems 9 - Proceedings of the 1996 Conference, NIPS 1996
Publisher | Neural information processing systems foundation
Pages | 641-647
Number of pages | 7
ISBN (Print) | 0262100657, 9780262100656
State | Published - Jan 1 1997
Externally published | Yes
Event | 10th Annual Conference on Neural Information Processing Systems, NIPS 1996 - Denver, CO, United States. Duration: Dec 2 1996 → Dec 5 1996

### Publication series

Name | Advances in Neural Information Processing Systems
---|---
ISSN (Print) | 1049-5258

### Other

Other | 10th Annual Conference on Neural Information Processing Systems, NIPS 1996
---|---
Country | United States
City | Denver, CO
Period | 12/2/96 → 12/5/96

### All Science Journal Classification (ASJC) codes

- Computer Networks and Communications
- Information Systems
- Signal Processing

## Fingerprint

Dive into the research topics of 'Training algorithms for hidden Markov models using entropy based distance functions'. Together they form a unique fingerprint.

## Cite this

*Advances in Neural Information Processing Systems 9 - Proceedings of the 1996 Conference, NIPS 1996* (pp. 641-647). (Advances in Neural Information Processing Systems). Neural information processing systems foundation.