TY - GEN
T1 - Learning curves for stochastic gradient descent in linear feedforward networks
AU - Werfel, Justin
AU - Xie, Xiaohui
AU - Seung, H. Sebastian
PY - 2004
Y1 - 2004
AB - Gradient-following learning methods can encounter problems of implementation in many applications, and stochastic variants are frequently used to overcome these difficulties. We derive quantitative learning curves for three online training methods used with a linear perceptron: direct gradient descent, node perturbation, and weight perturbation. The maximum learning rate for the stochastic methods scales inversely with the first power of the dimensionality of the noise injected into the system; with sufficiently small learning rate, all three methods give identical learning curves. These results suggest guidelines for when these stochastic methods will be limited in their utility, and considerations for architectures in which they will be effective.
UR - http://www.scopus.com/inward/record.url?scp=0346554799&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0346554799&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0346554799
SN - 0262201526
SN - 9780262201520
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 16 - Proceedings of the 2003 Conference, NIPS 2003
PB - Neural Information Processing Systems Foundation
T2 - 17th Annual Conference on Neural Information Processing Systems, NIPS 2003
Y2 - 8 December 2003 through 13 December 2003
ER -