Scalable Bayesian optimization using deep neural networks

Jasper Snoek, Oren Ripped, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Md Mostofa Ali Patwary, Prabhat, Ryan P. Adams

Research output: Chapter in Book/Report/Conference proceedingConference contribution

456 Scopus citations

Abstract

Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.

Original languageEnglish (US)
Title of host publication32nd International Conference on Machine Learning, ICML 2015
EditorsFrancis Bach, David Blei
PublisherInternational Machine Learning Society (IMLS)
Pages2161-2170
Number of pages10
ISBN (Electronic)9781510810587
StatePublished - 2015
Externally publishedYes
Event32nd International Conference on Machine Learning, ICML 2015 - Lile, France
Duration: Jul 6 2015Jul 11 2015

Publication series

Name32nd International Conference on Machine Learning, ICML 2015
Volume3

Other

Other32nd International Conference on Machine Learning, ICML 2015
Country/TerritoryFrance
CityLile
Period7/6/157/11/15

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Science Applications

Fingerprint

Dive into the research topics of 'Scalable Bayesian optimization using deep neural networks'. Together they form a unique fingerprint.

Cite this