Fast dropout training

Sida Wang, Christopher D. Manning

Research output: Contribution to conference › Paper › peer-review

128 Scopus citations


Preventing feature co-adaptation by encouraging independent contributions from different features often improves classification and regression performance. Dropout training (Hinton et al., 2012) does this by randomly dropping out (zeroing) hidden units and input features during training of neural networks. However, repeatedly sampling a random subset of input features makes training much slower. Based on an examination of the implied objective function of dropout training, we show how to do fast dropout training by sampling from or integrating a Gaussian approximation, instead of doing Monte Carlo optimization of this objective. This approximation, justified by the central limit theorem and empirical evidence, gives an order of magnitude speedup and more stability. We show how to do fast dropout training for classification, regression, and multilayer neural networks. Beyond dropout, our technique is extended to integrate out other types of noise and small image transformations.
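The Gaussian approximation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single logistic unit whose inputs are each kept independently with probability `keep_prob`, so the pre-activation is S = Σ w_i x_i z_i with z_i ~ Bernoulli(keep_prob). The function names, the weight and input values, and the choice of 200 dimensions are all hypothetical. The sketch computes the mean and variance of S in closed form, checks them against explicit mask sampling, and then applies a commonly used closed-form approximation for the expected sigmoid of a Gaussian variable.

```python
import math
import random


def dropout_sum_moments(w, x, keep_prob):
    """Closed-form mean and variance of S = sum_i w_i * x_i * z_i,
    where each z_i is an independent Bernoulli(keep_prob) mask bit."""
    terms = [wi * xi for wi, xi in zip(w, x)]
    mean = keep_prob * sum(terms)
    var = keep_prob * (1.0 - keep_prob) * sum(t * t for t in terms)
    return mean, var


def monte_carlo_moments(w, x, keep_prob, n_samples, rng):
    """Estimate the same two moments by explicitly sampling dropout masks."""
    samples = []
    for _ in range(n_samples):
        s = sum(wi * xi for wi, xi in zip(w, x) if rng.random() < keep_prob)
        samples.append(s)
    mean = sum(samples) / n_samples
    var = sum((s - mean) ** 2 for s in samples) / n_samples
    return mean, var


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def expected_sigmoid_gaussian(mean, var):
    """Approximate E[sigmoid(S)] for S ~ N(mean, var) with the closed form
    sigmoid(mean / sqrt(1 + pi * var / 8)). This is an approximation,
    not an exact integral."""
    return sigmoid(mean / math.sqrt(1.0 + math.pi * var / 8.0))


# Hypothetical weights and inputs, for illustration only.
rng = random.Random(0)
dim = 200
w = [rng.gauss(0.0, 0.1) for _ in range(dim)]
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
keep_prob = 0.5

mu, var = dropout_sum_moments(w, x, keep_prob)
mc_mu, mc_var = monte_carlo_moments(w, x, keep_prob, 5000, rng)

print("closed-form mean/var:", mu, var)
print("sampled mean/var:    ", mc_mu, mc_var)
print("approx. expected sigmoid output:", expected_sigmoid_gaussian(mu, var))
```

The closed-form moments cost one pass over the features, whereas the sampled estimate costs one pass per mask. That difference is the source of the speedup the abstract refers to, under the assumption that S is close to Gaussian when many features contribute.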

Original language: English (US)
Number of pages: 9
State: Published - Jan 1 2013
Event: 30th International Conference on Machine Learning, ICML 2013 - Atlanta, GA, United States
Duration: Jun 16 2013 → Jun 21 2013


Other: 30th International Conference on Machine Learning, ICML 2013
Country/Territory: United States
City: Atlanta, GA

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Sociology and Political Science


