Publications by Jason A. Janet.

Papers Published

  1. Janet, J.A., Scoggins, S.M., Schultz, S.M., Snyder, W.E., White, M.W., and Sutton, J.C. III, Shocking: An approach to stabilize backprop training with greedy adaptive learning rates, IEEE International Conference on Neural Networks - Conference Proceedings, vol. 3 (1998), pp. 2218-2223 [IJCNN.1998.687205].

    Abstract:
    In general, backprop neural networks converge faster with adaptive learning rates than with learning rates that remain constant or that grow or decay without regard to the network error (such as exponentially decaying learning rates). This is because each synapse has its own learning rate that can vary over time by an amount appropriate to that weight. For certain problems, however, adaptive learning rates cause neural networks to saturate during training. This problem occurs more often when the learning rates can grow without limit. When learning rates are permitted to assume values greater than unity, they are considered "greedy". Greedy adaptive learning rates can reduce the training times of networks, but can also compromise the stability of the training process, leading to a network that fails to converge. Nearly all comparisons of training time are based on neural networks that actually converged. Rarely, if ever, is the failure rate presented; little to no consideration is given to why some neural networks fail to converge or, for that matter, how to reduce the chances of failure. This paper proposes a simple ad hoc approach called "shocking" as a partial solution to the instability problem caused by greedy adaptive learning rates. An analysis based on training times and failure rates for two inherently unstable benchmark problems is used to validate the use of shocking.

    Keywords:
    Backpropagation; Learning systems; Problem solving; Error analysis; Failure analysis
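
    Illustration (not from the paper):
    The abstract describes per-synapse adaptive learning rates that may grow past unity ("greedy") and a "shocking" step intended to restore stability when training goes unstable. The Python/NumPy sketch below only shows where such steps could sit in a simple gradient-descent loop; the growth/decay rule and the shock reset used here are hypothetical illustrations and are not the mechanism defined in the paper.

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 4))               # toy inputs
        y = X @ np.array([1.0, -2.0, 0.5, 3.0])     # toy linear targets
        w = np.zeros(4)                             # one weight per input ("synapse")
        lr = np.full(4, 0.01)                       # one learning rate per weight
        prev_grad = np.zeros(4)
        prev_mse = np.inf

        for epoch in range(100):
            err = X @ w - y
            mse = float(np.mean(err ** 2))
            grad = 2.0 * X.T @ err / len(y)

            # Hypothetical "shock": if the error jumps sharply (a sign that the
            # greedy rates destabilized training), reset every learning rate.
            if mse > 10.0 * prev_mse:
                lr[:] = 0.01

            # Per-weight adaptation: grow a rate while its gradient keeps the same
            # sign, shrink it when the sign flips. No upper bound is imposed, so
            # rates may exceed 1, which is what the abstract calls "greedy".
            same_sign = np.sign(grad) == np.sign(prev_grad)
            lr = np.where(same_sign, lr * 1.2, lr * 0.5)

            w -= lr * grad
            prev_grad, prev_mse = grad, mse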