### Abstract

Preconditioned gradient methods are among the most general and powerful tools in optimization. However, preconditioning requires storing and manipulating prohibitively large matrices. We describe and analyze a new structure-aware preconditioning algorithm, called Shampoo, for stochastic optimization over tensor spaces. Shampoo maintains a set of preconditioning matrices, each of which operates on a single dimension, contracting over the remaining dimensions. We establish convergence guarantees in the stochastic convex setting, the proof of which builds upon matrix trace inequalities. Our experiments with state-of-the-art deep learning models show that Shampoo is capable of converging considerably faster than commonly used optimizers. Surprisingly, although it involves a more complex update rule, Shampoo's runtime per step is comparable in practice to that of simple gradient methods such as SGD, AdaGrad, and Adam.
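To illustrate the abstract's description of per-dimension preconditioners, here is a minimal NumPy sketch of Shampoo's matrix (rank-2 tensor) case: one preconditioner per dimension, each built by contracting the gradient over the other dimension, with inverse fourth roots applied in the update. The function name, learning rate, and the `eps` stabilizer are illustrative choices, not the paper's exact hyperparameters.

```python
import numpy as np

def shampoo_matrix_step(W, G, L, R, lr=0.1, eps=1e-4):
    """One Shampoo step for a matrix parameter W with gradient G.

    L and R are the two preconditioners, one per tensor dimension:
    L accumulates G G^T (contracting over columns), R accumulates
    G^T G (contracting over rows). The update applies their inverse
    fourth roots on either side of the gradient.
    """
    L = L + G @ G.T          # left preconditioner statistics
    R = R + G.T @ G          # right preconditioner statistics

    def inv_fourth_root(M):
        # Symmetric PSD matrix power via eigendecomposition;
        # eps keeps near-zero eigenvalues invertible.
        w, V = np.linalg.eigh(M)
        return V @ np.diag((w + eps) ** -0.25) @ V.T

    W = W - lr * inv_fourth_root(L) @ G @ inv_fourth_root(R)
    return W, L, R
```

For an order-k tensor the same idea yields k preconditioners, each a small square matrix whose side equals one dimension of the parameter, which is what keeps the memory cost far below that of a full preconditioner.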

| Original language | English (US) |
|---|---|
| Title of host publication | 35th International Conference on Machine Learning, ICML 2018 |
| Editors | Jennifer Dy, Andreas Krause |
| Publisher | International Machine Learning Society (IMLS) |
| Pages | 2956-2964 |
| Number of pages | 9 |
| ISBN (Electronic) | 9781510867963 |
| State | Published - Jan 1 2018 |
| Event | 35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden. Duration: Jul 10 2018 → Jul 15 2018 |

### Publication series

| Name | 35th International Conference on Machine Learning, ICML 2018 |
|---|---|
| Volume | 4 |

### Other

| Other | 35th International Conference on Machine Learning, ICML 2018 |
|---|---|
| Country | Sweden |
| City | Stockholm |
| Period | 7/10/18 → 7/15/18 |

### All Science Journal Classification (ASJC) codes

- Computational Theory and Mathematics
- Human-Computer Interaction
- Software

## Fingerprint

Research topics of 'Shampoo: Preconditioned stochastic tensor optimization'.

## Cite this

Gupta, V., Koren, T., & Singer, Y. (2018). Shampoo: Preconditioned stochastic tensor optimization. In J. Dy & A. Krause (Eds.), *35th International Conference on Machine Learning, ICML 2018* (pp. 2956-2964). (35th International Conference on Machine Learning, ICML 2018; Vol. 4). International Machine Learning Society (IMLS).