Exponentiated Gradient vs. Meets Gradient Descent

Udaya Ghai, Elad Hazan, Yoram Singer

Research output: Contribution to journalConference articlepeer-review

20 Scopus citations

Abstract

The (stochastic) gradient descent and the multiplicative update method are probably the most popular algorithms in machine learning. We introduce and study a new regularization which provides a unification of the additive and multiplicative updates. This regularization is derived from an hyperbolic analogue of the entropy function, which we call hypentropy. It is motivated by a natural extension of the multiplicative update to negative numbers. The hypentropy has a natural spectral counterpart which we use to derive a family of matrix-based updates that bridge gradient methods and the multiplicative method for matrices. While the latter is only applicable to positive semidefinite matrices, the spectral hypentropy method can naturally be used with general rectangular matrices. We analyze the new family of updates by deriving tight regret bounds. We study empirically the applicability of the new update for settings such as multiclass learning, in which the parameters constitute a general rectangular matrix.

Original languageEnglish (US)
Pages (from-to)386-407
Number of pages22
JournalProceedings of Machine Learning Research
Volume117
StatePublished - 2020
Event31st International Conference on Algorithmic Learning Theory, ALT 2020 - San Diego, United States
Duration: Feb 8 2020Feb 11 2020

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Keywords

  • Experimentation
  • Exponentiated Gradient
  • Gradient Descent
  • Online Convex Optimization

Fingerprint

Dive into the research topics of 'Exponentiated Gradient vs. Meets Gradient Descent'. Together they form a unique fingerprint.

Cite this