Methods for Training Recurrent Neural Networks


Backpropagation Through Time



Exploding and Vanishing Gradients



Clipped/Scaled Gradient
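
As an illustration of this heading, here is a minimal sketch of norm-based gradient scaling in the spirit of Pascanu et al. (reference 5): when the L2 norm of the gradient exceeds a threshold, the gradient is rescaled so that its norm equals the threshold. The function name `clip_gradient_norm`, the flat-list representation of the gradient, and the threshold value are assumptions made for this example and are not taken from this page.

```python
import math


def clip_gradient_norm(grads, threshold):
    """Return a copy of `grads` whose L2 norm is at most `threshold`.

    `grads` is a flat list of floats (a hypothetical stand-in for the
    concatenated gradients of all network parameters).
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > threshold:
        # Rescale so the direction is preserved and the norm equals threshold.
        scale = threshold / norm
        return [g * scale for g in grads]
    # Gradient is already small enough; leave it unchanged.
    return list(grads)


# Example: a gradient of norm 5 is scaled down to norm 1.
clipped = clip_gradient_norm([3.0, 4.0], threshold=1.0)
```

Scaling the whole vector keeps the update direction intact, in contrast to clipping each component independently, which can change the direction. The threshold is a hyperparameter that would normally be tuned per task.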


Hessian-Free Optimization

Momentum and Initialization

Long Short-Term Memory

Generalized LSTM

Echo State Networks

Real-Time Recurrent Learning

1) Werbos, Paul J. “Backpropagation through time: what it does and how to do it.” Proceedings of the IEEE 78.10 (1990): 1550-1560.
2) Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. Learning representations by back-propagating errors. MIT Press, Cambridge, MA, USA, 1988.
3) Bengio, Yoshua, Patrice Simard, and Paolo Frasconi. “Learning long-term dependencies with gradient descent is difficult.” Neural Networks, IEEE Transactions on 5.2 (1994): 157-166.
4) Hochreiter, Sepp, et al. “Gradient flow in recurrent nets: the difficulty of learning long-term dependencies.” (2001).
5) Pascanu, Razvan, Tomas Mikolov, and Yoshua Bengio. “On the difficulty of training recurrent neural networks.” arXiv preprint arXiv:1211.5063 (2012).
methods_for_training_recurrent_neural_networks.txt · Last modified: 2015/12/17 14:59 (external edit)