Abstract
To train feed-forward threshold neural networks built from non-differentiable activation functions, noise injection is used to form a stochastic-resonance-based threshold network that can be optimized by various gradient-based optimizers. Injecting noise extends the parameter space of the designed threshold network to include the noise levels, but it also leads to a highly non-convex optimization landscape of the loss function. Consequently, the online hyperparameter learning procedure with respect to the network weights and noise levels becomes challenging. It is shown that the Adam optimizer, an adaptive variant of stochastic gradient descent, exhibits superior learning ability and trains the stochastic-resonance-based threshold network effectively. Experimental results demonstrate a significant performance improvement of the designed threshold network trained by the Adam optimizer on function approximation and image classification.
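To illustrate why noise injection makes a threshold unit trainable by gradient methods, the following minimal sketch may help. It assumes a Heaviside threshold at zero and zero-mean Gaussian injected noise with standard deviation `sigma`; the abstract does not specify either choice, and all function names here are hypothetical. Under these assumptions the expected output of the noisy threshold equals the Gaussian cumulative distribution function, which is smooth in both the input and the noise level, so both can receive gradients.

```python
import math
import random


def heaviside(u):
    """Hard threshold: non-differentiable at 0, zero gradient elsewhere."""
    return 1.0 if u > 0.0 else 0.0


def noisy_threshold_mean(x, sigma, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[heaviside(x + sigma * eta)], eta ~ N(0, 1).

    Assumption: Gaussian injected noise (not stated in the abstract).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += heaviside(x + sigma * rng.gauss(0.0, 1.0))
    return total / n_samples


def smoothed_threshold(x, sigma):
    """Closed form of the same expectation: the Gaussian CDF at x / sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def grad_wrt_input(x, sigma):
    """d/dx of smoothed_threshold: Gaussian density at x / sigma, over sigma."""
    z = x / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def grad_wrt_sigma(x, sigma):
    """d/dsigma of smoothed_threshold, so the noise level is itself learnable."""
    return -(x / sigma) * grad_wrt_input(x, sigma)


if __name__ == "__main__":
    x, sigma = 0.3, 1.0
    print("Monte Carlo mean :", noisy_threshold_mean(x, sigma))
    print("Closed form      :", smoothed_threshold(x, sigma))
    print("d/dx, d/dsigma   :", grad_wrt_input(x, sigma), grad_wrt_sigma(x, sigma))
```

In a full network these gradients would be passed to an optimizer such as Adam together with the weight gradients; the sketch covers a single unit only and is not the authors' implementation.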
Funding
Project supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF051).