Reexamining the principle of mean-variance preservation for neural network initialization

Kyle Luther, H. Sebastian Seung

Research output: Contribution to journal › Article › peer-review

Abstract

Before backpropagation training, it is common to randomly initialize a neural network so that mean and variance of activity are uniform across neurons. Classically these statistics were defined over an ensemble of random networks. Alternatively, they can be defined over a random sample of inputs to the network. We show analytically and numerically that these two formulations of the principle of mean-variance preservation are very different in deep networks using rectification nonlinearity (ReLU). We numerically investigate training speed after data-dependent initialization of networks to preserve sample mean and variance.
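
The distinction between the two formulations can be made concrete with a small numerical sketch. The code below is not the authors' implementation; it is a minimal illustration that assumes a fully connected ReLU network of constant width, Gaussian weights with variance 2/width (a common He-style scaling), no biases, and Gaussian random inputs. The function names `he_weights`, `relu_forward`, `ensemble_stats`, and `sample_stats` are hypothetical. Ensemble statistics are estimated for one output neuron over many random networks with the input held fixed; sample statistics are estimated for every output neuron over many inputs with the network held fixed.

```python
import numpy as np


def he_weights(rng, depth, width):
    """Draw `depth` square weight matrices with variance 2/width (assumed scaling)."""
    scale = np.sqrt(2.0 / width)
    return [rng.normal(0.0, scale, size=(width, width)) for _ in range(depth)]


def relu_forward(x, weights):
    """Propagate a batch x of shape (n_inputs, width) through bias-free ReLU layers."""
    h = x
    for w in weights:
        h = np.maximum(h @ w, 0.0)
    return h


def ensemble_stats(rng, x_single, depth, width, n_networks):
    """Mean and variance of one output neuron over random networks, input fixed."""
    acts = np.array([
        relu_forward(x_single[None, :], he_weights(rng, depth, width))[0, 0]
        for _ in range(n_networks)
    ])
    return acts.mean(), acts.var()


def sample_stats(rng, inputs, depth, width):
    """Per-neuron mean and variance over a sample of inputs, network fixed."""
    acts = relu_forward(inputs, he_weights(rng, depth, width))
    return acts.mean(axis=0), acts.var(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    width, depth = 64, 20
    inputs = rng.normal(size=(256, width))

    ens_mean, ens_var = ensemble_stats(rng, inputs[0], depth, width, n_networks=200)
    smp_mean, smp_var = sample_stats(rng, inputs, depth, width)

    print("ensemble mean / variance (one neuron):", ens_mean, ens_var)
    print("sample mean, first 5 neurons:", smp_mean[:5])
    print("sample variance, first 5 neurons:", smp_var[:5])
```

Varying `depth` and comparing the printed ensemble values with the per-neuron sample values gives a rough numerical picture of how the two definitions can diverge; the quantitative behavior is what the paper itself analyzes.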

Original language: English (US)
Article number: 033135
Journal: Physical Review Research
Volume: 2
Issue number: 3
State: Published - Jul 2020

All Science Journal Classification (ASJC) codes

  • General Physics and Astronomy
