Updaters/Optimizers
Special algorithms for gradient descent.
The main difference among the updaters is how they treat the learning rate. Stochastic gradient descent, the most common learning algorithm in deep learning, relies on theta (the weights in the hidden layers) and alpha (the learning rate); the plain SGD update rule is sketched after the configuration example below. Different updaters adapt the learning rate during training to help the neural network converge on its most performant state.

To use an updater, pass a new instance of one to the updater() method when configuring either a ComputationGraph or a MultiLayerNetwork:
ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
    .updater(new Adam(0.01))
    // add your other hyperparameters here
    .graphBuilder()
    // add your layers, inputs and outputs here
    .build();
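For reference, plain stochastic gradient descent updates the parameters theta with a single fixed learning rate alpha; in LaTeX notation (the loss symbol L is only a placeholder here):

\theta_{t+1} = \theta_t - \alpha \, \nabla_{\theta} L(\theta_t)

Adaptive updaters such as Adam keep running statistics of the gradients and effectively rescale this step per parameter, rather than using one fixed alpha for every weight.

The same updater() call applies when building a MultiLayerNetwork; a minimal sketch of the equivalent configuration (the Nesterovs updater and the empty layer list are placeholder choices, not part of the original example):

MultiLayerConfiguration mlnConf = new NeuralNetConfiguration.Builder()
    .updater(new Nesterovs(0.01, 0.9))   // SGD with Nesterov momentum
    .list()
    // add your layers and hyperparameters here
    .build();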
applyUpdater
public void applyUpdater(INDArray gradient, int iteration, int epoch)
Calculate the update from the given gradient and apply it to the gradient array in-place.
- param gradient the gradient to compute the update for (overwritten with the update)
- param iteration the current iteration count
- param epoch the current epoch count
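As a rough illustration of how this method can be called outside of a full network, the sketch below instantiates a GradientUpdater from an Adam configuration and applies it to a dummy gradient. It assumes the ND4J updater classes (Adam, GradientUpdater, and the stateSize()/instantiate() methods); the parameter count, shapes, and values are made up for the example.

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.learning.GradientUpdater;
import org.nd4j.linalg.learning.config.Adam;

public class ApplyUpdaterExample {
    public static void main(String[] args) {
        int numParams = 4;                       // hypothetical model with 4 parameters
        Adam config = new Adam(0.01);

        // Adam keeps running gradient statistics; allocate a flat view array for that state
        INDArray state = Nd4j.zeros(1, (int) config.stateSize(numParams));
        GradientUpdater updater = config.instantiate(state, true);

        // Dummy gradient for a single iteration
        INDArray gradient = Nd4j.ones(1, numParams);

        // applyUpdater overwrites the gradient array with the update to subtract from the parameters
        updater.applyUpdater(gradient, 0, 0);

        System.out.println(gradient);
    }
}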