We study the performance of a generalized perceptron algorithm for learning realizable dichotomies, with an error-dependent adaptive learning rate. The asymptotic scaling form of the solution to the associated Markov equations is derived, assuming certain smoothness conditions. We show that the system converges to the optimal solution and the generalization error asymptotically obeys a universal inverse power law in the number of examples. The system is capable of escaping from local minima and adapts rapidly to shifts in the target function. The general theory is illustrated for the perceptron and committee machine.
All Science Journal Classification (ASJC) codes
- Physics and Astronomy(all)