Special algorithms for gradient descent.
The main difference among the updaters is how they treat the learning rate. Stochastic gradient descent, the most common learning algorithm in deep learning, relies on `Theta` (the weights in the hidden layers) and `alpha` (the learning rate). Different updaters help optimize the learning rate until the neural network converges on its most performant state.
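To make the roles of `Theta` and `alpha` concrete, a plain SGD step updates each weight as `theta = theta - alpha * gradient`. A minimal sketch in plain Java (flat `double[]` arrays stand in for ND4J's `INDArray`; the names here are illustrative, not DL4J API):

```java
public class SgdSketch {
    /** One vanilla SGD step: theta[i] <- theta[i] - alpha * gradient[i]. */
    static void sgdStep(double[] theta, double[] gradient, double alpha) {
        for (int i = 0; i < theta.length; i++) {
            theta[i] -= alpha * gradient[i];
        }
    }

    public static void main(String[] args) {
        double[] theta = {1.0, -2.0};
        double[] grad  = {0.5, -0.5};
        sgdStep(theta, grad, 0.25); // alpha = 0.25
        System.out.println(theta[0] + " " + theta[1]);
    }
}
```

The updaters below all refine this basic rule, typically by adapting the effective `alpha` per parameter or by accumulating history across steps.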
To use the updaters, pass a new updater class to the `updater()` method in either a `NeuralNetConfiguration` or a `ComputationGraphConfiguration`:

```java
ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
    .updater(new Adam(1e-3)) // e.g. the Adam updater
    // add your layers and hyperparameters below
    .graphBuilder().build();
```
Each updater exposes a core method that calculates the update from the given gradient. Note that the operation is applied in place: the `gradient` array is overwritten with the update.

```java
/**
 * Calculate the update based on the given gradient (in place).
 *
 * @param gradient  the gradient to get the update for
 * @param iteration the current iteration
 * @param epoch     the current epoch
 */
public void applyUpdater(INDArray gradient, int iteration, int epoch)
```
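To illustrate the in-place contract, here is a sketch of a simple momentum (heavy-ball) updater with the same shape as `applyUpdater`, written against plain `double[]` arrays rather than `INDArray`. This is an illustrative sketch, not the DL4J implementation:

```java
public class MomentumUpdaterSketch {
    private final double learningRate;
    private final double momentum;
    private final double[] velocity; // per-parameter state carried across iterations

    MomentumUpdaterSketch(double learningRate, double momentum, int numParams) {
        this.learningRate = learningRate;
        this.momentum = momentum;
        this.velocity = new double[numParams];
    }

    /** Overwrite gradient in place with the update: v <- mu*v + lr*g, update = v. */
    void applyUpdater(double[] gradient, int iteration, int epoch) {
        for (int i = 0; i < gradient.length; i++) {
            velocity[i] = momentum * velocity[i] + learningRate * gradient[i];
            gradient[i] = velocity[i]; // array now holds the update, not the raw gradient
        }
    }

    public static void main(String[] args) {
        MomentumUpdaterSketch u = new MomentumUpdaterSketch(0.5, 0.5, 1);
        double[] g = {1.0};
        u.applyUpdater(g, 0, 0);
        System.out.println(g[0]); // the first update
    }
}
```

The stateful `velocity` array is why each updater instance is tied to a set of parameters: the history it accumulates is what distinguishes one update rule from another.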