Stochastic modified equations and dynamics of stochastic gradient algorithms I: Mathematical foundations

Qianxiao Li, Cheng Tai, Weinan E

Research output: Contribution to journal › Article › peer-review

98 Scopus citations

Abstract

We develop the mathematical foundations of the stochastic modified equations (SME) framework for analyzing the dynamics of stochastic gradient algorithms, in which the latter are approximated by a class of stochastic differential equations with small noise parameters. We prove that this approximation can be understood mathematically as a weak approximation, which leads to a number of precise and useful results on the approximations of stochastic gradient descent (SGD), momentum SGD, and stochastic Nesterov's accelerated gradient method in the general setting of stochastic objectives. We also demonstrate through explicit calculations that this continuous-time approach can uncover important analytical insights into the stochastic gradient algorithms under consideration that may not be easy to obtain in a purely discrete-time setting.
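As a minimal illustrative sketch of the weak-approximation idea (not code from the paper): for a noisy quadratic objective f(x) = x²/2 with learning rate eta, the first-order SME is the SDE dX = -X dt + sqrt(eta)·sigma dW, which can be simulated by Euler-Maruyama and compared in distribution with the SGD iterates. The symbols eta, sigma, and the quadratic objective are illustrative assumptions chosen for simplicity.

```python
# Sketch (assumed setup, not from the paper): compare SGD on a noisy
# quadratic f(x) = x^2 / 2 with its first-order stochastic modified
# equation dX_t = -X_t dt + sqrt(eta) * sigma dW_t, simulated by
# Euler-Maruyama. "Weak approximation" means distributional statistics
# (mean, variance, ...) of the two processes agree up to O(eta).
import numpy as np

rng = np.random.default_rng(0)
eta, sigma = 0.01, 0.5        # learning rate; gradient-noise scale (assumed)
n_steps, n_paths = 100, 2000  # horizon T = eta * n_steps = 1.0
x0 = 1.0

# SGD: x_{k+1} = x_k - eta * (x_k + sigma * xi_k), xi_k ~ N(0, 1)
x_sgd = np.full(n_paths, x0)
for _ in range(n_steps):
    x_sgd -= eta * (x_sgd + sigma * rng.standard_normal(n_paths))

# SME via Euler-Maruyama, one SDE step of size dt = eta per SGD step
dt = eta
x_sde = np.full(n_paths, x0)
for _ in range(n_steps):
    x_sde += -x_sde * dt + np.sqrt(eta) * sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

# Distributional statistics of the two processes should nearly agree
print(abs(x_sgd.mean() - x_sde.mean()))  # small, on the order of eta
print(abs(x_sgd.var() - x_sde.var()))    # small, on the order of eta
```

Here the SGD mean contracts like (1 - eta)^k while the SDE mean contracts like exp(-t); at t = k·eta these differ only at order eta, which is one concrete instance of the weak approximation the abstract describes.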

Original language: English (US)
Journal: Journal of Machine Learning Research
Volume: 20
State: Published - Mar 1 2019

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
  • Control and Systems Engineering
  • Statistics and Probability

Keywords

  • Modified equations
  • Momentum
  • Nesterov's accelerated gradient
  • Stochastic differential equations
  • Stochastic gradient algorithms

