TY - JOUR
T1 - Reexamining the principle of mean-variance preservation for neural network initialization
AU - Luther, Kyle
AU - Seung, H. Sebastian
N1 - Publisher Copyright:
© 2020 authors.
PY - 2020/7
Y1 - 2020/7
AB - Before backpropagation training, it is common to randomly initialize a neural network so that the mean and variance of activity are uniform across neurons. Classically, these statistics were defined over an ensemble of random networks. Alternatively, they can be defined over a random sample of inputs to the network. We show analytically and numerically that these two formulations of the principle of mean-variance preservation are very different in deep networks using the rectification nonlinearity (ReLU). We numerically investigate training speed after data-dependent initialization of networks to preserve sample mean and variance.
UR - http://www.scopus.com/inward/record.url?scp=85115901263&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115901263&partnerID=8YFLogxK
DO - 10.1103/PhysRevResearch.2.033135
M3 - Article
AN - SCOPUS:85115901263
VL - 2
JO - Physical Review Research
JF - Physical Review Research
SN - 2643-1564
IS - 3
M1 - 033135
ER -