A Comparison of New and Old Algorithms for a Mixture Estimation Problem

David P. Helmbold, Robert E. Schapire, Yoram Singer, Manfred K. Warmuth

Research output: Contribution to journal › Review article › peer-review


Abstract

We investigate the problem of estimating the proportion vector which maximizes the likelihood of a given sample for a mixture of given densities. We adapt a framework developed for supervised learning and give simple derivations for many of the standard iterative algorithms like gradient projection and EM. In this framework, the distance between the new and old proportion vectors is used as a penalty term. The square distance leads to the gradient projection update, and the relative entropy to a new update which we call the exponentiated gradient update (EGη). Curiously, when a second order Taylor expansion of the relative entropy is used, we arrive at an update EMη which, for η = 1, gives the usual EM update. Experimentally, both the EMη-update and the EGη-update for η > 1 outperform the EM algorithm and its variants. We also prove a polynomial bound on the rate of convergence of the EGη algorithm.
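To make the comparison in the abstract concrete, here is a minimal sketch (not the authors' code) of the two updates for the proportion vector of a mixture with fixed, known component densities: the usual EM step and the exponentiated-gradient step EGη, in which the current weights are multiplied by the exponentiated gradient of the average log-likelihood and renormalized. The two-Gaussian setup, the sample size, and the choice η = 1.5 are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not the paper's code): estimate the proportion vector w
# for FIXED component densities, comparing the EM update with the
# exponentiated-gradient update EG_eta described in the abstract.

rng = np.random.default_rng(0)

def densities(x):
    # Returns an (n_samples, 2) matrix P with P[t, i] = p_i(x_t)
    # for two fixed unit-variance Gaussian components (means 0 and 3).
    p0 = np.exp(-0.5 * (x - 0.0) ** 2) / np.sqrt(2 * np.pi)
    p1 = np.exp(-0.5 * (x - 3.0) ** 2) / np.sqrt(2 * np.pi)
    return np.column_stack([p0, p1])

# Sample from the true mixture 0.7 * N(0, 1) + 0.3 * N(3, 1).
n = 2000
z = rng.random(n) < 0.7
x = np.where(z, rng.normal(0.0, 1.0, n), rng.normal(3.0, 1.0, n))
P = densities(x)

def log_lik(w):
    # Average log-likelihood of the sample under proportions w.
    return np.log(P @ w).mean()

def em_step(w):
    # Usual EM update for the proportions: average the posterior
    # responsibilities (this is EM_eta with eta = 1).
    r = P * w                          # unnormalized responsibilities
    r /= r.sum(axis=1, keepdims=True)
    return r.mean(axis=0)

def eg_step(w, eta=1.0):
    # EG_eta update: multiply each weight by exp(eta * gradient of the
    # average log-likelihood) and renormalize; this is the update that
    # a relative-entropy penalty on the new weights leads to.
    grad = (P / (P @ w)[:, None]).mean(axis=0)
    w_new = w * np.exp(eta * grad)
    return w_new / w_new.sum()

w_em = w_eg = np.array([0.5, 0.5])
for _ in range(50):
    w_em = em_step(w_em)
    w_eg = eg_step(w_eg, eta=1.5)

print(w_em, w_eg)  # both should approach roughly [0.7, 0.3]
```

With η = 1 the EG update behaves much like EM on this easy problem; the abstract's experimental claim is that η > 1 speeds convergence, which can be probed here by counting iterations until the log-likelihood stabilizes.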

Original language: English (US)
Pages (from-to): 97-119
Number of pages: 23
Journal: Machine Learning
Volume: 27
Issue number: 1
State: Published - 1997

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence

Keywords

  • EM
  • Exponentiated gradient algorithms
  • Maximum likelihood
  • Mixture models

