Loss
absoluteDifference
Absolute difference loss: sum_i abs( label[i] - predictions[i] )
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
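To make the formula concrete, here is a minimal plain-Java sketch (illustrative only, not the library's implementation), assuming a simple SUM-style reduction and treating a null weights array as a weight of 1.0:

static double absoluteDifferenceSum(double[] label, double[] predictions, double[] weights) {
    // sum_i w[i] * abs(label[i] - predictions[i]); null weights behave as 1.0
    double loss = 0.0;
    for (int i = 0; i < label.length; i++) {
        double w = (weights == null) ? 1.0 : weights[i];
        loss += w * Math.abs(label[i] - predictions[i]);
    }
    return loss;
}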
cosineDistance
Cosine distance loss: 1 - cosineSimilarity(x,y), or equivalently 1 - sum_i label[i] * prediction[i], which is equivalent to cosine distance when both the predictions and labels are normalized. Note: This loss function assumes that both the predictions and labels are normalized to have unit L2 norm.
If this is not the case, you should normalize them first by dividing by their L2 norm (norm2) along the cosine distance dimension (with keepDims=true).
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
dimension - Dimension to perform the cosine distance over
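A sketch of the underlying formula for a single pair of vectors, assuming both are already normalized to unit L2 norm as the note above requires (illustrative only):

static double cosineDistance(double[] label, double[] prediction) {
    // 1 - sum_i label[i] * prediction[i]; equals cosine distance for unit-norm inputs
    double dot = 0.0;
    for (int i = 0; i < label.length; i++) {
        dot += label[i] * prediction[i];
    }
    return 1.0 - dot;
}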
hingeLoss
Hinge loss: a loss function used for training classifiers.
Implements L = max(0, 1 - t * predictions), where t is the label value after internally converting the user-specified {0, 1} labels to {-1, 1}. Note that labels should be provided with values in {0, 1}.
label (NUMERIC) - Label array. Each value should be 0.0 or 1.0 (internally -1 to 1 is used)
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
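The label conversion and the max(0, ...) form can be sketched in plain Java as follows (illustrative, with SUM reduction):

static double hingeLossSum(double[] label01, double[] predictions) {
    double loss = 0.0;
    for (int i = 0; i < label01.length; i++) {
        double t = 2.0 * label01[i] - 1.0; // internal conversion: {0,1} -> {-1,1}
        loss += Math.max(0.0, 1.0 - t * predictions[i]);
    }
    return loss;
}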
huberLoss
Huber loss function, used for robust regression. It is similar to both squared error loss and absolute difference loss, though it is less sensitive to outliers than squared error. Huber loss implements:
L = 0.5 * (label[i] - predictions[i])^2 if abs(label[i] - predictions[i]) < delta
L = delta * abs(label[i] - predictions[i]) - 0.5 * delta^2 otherwise
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
delta - Loss function delta value
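A per-element sketch of the two-branch definition above (illustrative only): the quadratic branch applies near zero and the linear branch beyond delta, which is what gives the robustness to outliers:

static double huberElement(double label, double prediction, double delta) {
    double diff = Math.abs(label - prediction);
    return (diff < delta)
            ? 0.5 * diff * diff                   // quadratic region
            : delta * diff - 0.5 * delta * delta; // linear region
}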
l2Loss
L2 loss: 1/2 * sum(x^2)
var (NUMERIC) - Variable to calculate L2 loss of
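Equivalent plain-Java arithmetic for the formula (illustrative sketch):

static double l2Loss(double[] x) {
    // 1/2 * sum(x^2)
    double sum = 0.0;
    for (double v : x) {
        sum += v * v;
    }
    return 0.5 * sum;
}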
logLoss
Log loss, i.e., binary cross entropy loss, usually used for binary multi-label classification. Implements:
-1/numExamples * sum_i (labels[i] * log(predictions[i] + epsilon) + (1-labels[i]) * log(1-predictions[i] + epsilon))
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
epsilon - Epsilon value added inside the logarithms to avoid log(0). Default: 0.0
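The formula translates directly to plain Java (illustrative sketch, not the library's implementation; the epsilon keeps log() away from zero):

static double logLossMean(double[] label, double[] predictions, double epsilon) {
    double sum = 0.0;
    for (int i = 0; i < label.length; i++) {
        sum += label[i] * Math.log(predictions[i] + epsilon)
                + (1.0 - label[i]) * Math.log(1.0 - predictions[i] + epsilon);
    }
    return -sum / label.length; // -1/numExamples * sum_i (...)
}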
logPoisson
Log Poisson loss: a loss function typically used for regression on count data.
Implements L = exp(c) - z * c
where c is log(predictions) and z is labels.
label (NUMERIC) - Label array (target values, e.g., event counts)
predictions (NUMERIC) - Predictions array (must be log(x) of the actual predictions)
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
full - Boolean flag. true for logPoissonFull, false for logPoisson
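A per-element sketch of the basic (full = false) form; note that c is the already-logged prediction, matching the parameter description above. The full variant additionally approximates the log(z!) normalization term of the Poisson log-likelihood (an assumption based on the usual definition; omitted here):

static double logPoissonElement(double z, double c) {
    // L = exp(c) - z * c, with c = log(prediction) and z = label
    return Math.exp(c) - z * c;
}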
meanPairwiseSquaredError
Mean pairwise squared error. MPWSE loss calculates the difference between pairs of consecutive elements in the predictions and labels arrays.
For example, if predictions = [p0, p1, p2] and labels are [l0, l1, l2] then MPWSE is:
[((p0-p1) - (l0-l1))^2 + ((p0-p2) - (l0-l2))^2 + ((p1-p2) - (l1-l2))^2] / 3
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used. Must be either null, scalar, or have shape [batchSize]
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
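The pairwise construction generalizes the three-element example above; an illustrative sketch over one example's elements, averaging over the number of pairs:

static double mpwse(double[] label, double[] predictions) {
    double sum = 0.0;
    int pairs = 0;
    for (int i = 0; i < label.length; i++) {
        for (int j = i + 1; j < label.length; j++) {
            double diff = (predictions[i] - predictions[j]) - (label[i] - label[j]);
            sum += diff * diff;
            pairs++;
        }
    }
    return sum / pairs; // n*(n-1)/2 pairs; 3 pairs in the example above
}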
meanSquaredError
Mean squared error loss function. Implements (label[i] - prediction[i])^2, i.e., squared error on a per-element basis.
When averaged using LossReduce.MEAN_BY_WEIGHT or LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT (the default), this is the mean squared error loss function.
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
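As a plain-Java sketch of the averaged form (illustrative only):

static double meanSquaredError(double[] label, double[] predictions) {
    double sum = 0.0;
    for (int i = 0; i < label.length; i++) {
        double d = label[i] - predictions[i];
        sum += d * d; // (label[i] - prediction[i])^2 per element
    }
    return sum / label.length;
}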
sigmoidCrossEntropy
Sigmoid cross entropy: applies the sigmoid activation function to the input logits (the "pre-sigmoid predictions")
and implements the binary cross entropy loss function. This implementation is numerically more stable than using a
standard (but separate) sigmoid activation function and log loss (binary cross entropy) loss function. Implements:
-1/numExamples * sum_i (labels[i] * log(sigmoid(logits[i])) + (1-labels[i]) * log(1-sigmoid(logits[i])))
though this is done in a mathematically equivalent but more numerically stable form.
When label smoothing is > 0, the following label smoothing is used:
label = (1.0 - labelSmoothing) * label + 0.5 * labelSmoothing
label (NUMERIC) - Label array
predictionLogits (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
labelSmoothing - Label smoothing value. Default: 0.0
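One well-known numerically stable rewriting of the per-element loss is max(x, 0) - x*z + log(1 + exp(-|x|)), where x is the logit and z the label (the same identity TensorFlow uses; whether this library uses exactly this form internally is an assumption). Illustrative sketch:

static double sigmoidXentElement(double z, double x) {
    // Equivalent to -(z*log(sigmoid(x)) + (1-z)*log(1-sigmoid(x))),
    // but never exponentiates a large positive value, so it cannot overflow.
    return Math.max(x, 0.0) - x * z + Math.log1p(Math.exp(-Math.abs(x)));
}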
softmaxCrossEntropy
Applies the softmax activation function to the input, then implements multi-class cross entropy: -sum_c label[c] * log(p[c]), where p = softmax(logits). If LossReduce.NONE is used, the output shape is [numExamples] for [numExamples, numClasses] predictions/labels;
otherwise, the output is a scalar.
When label smoothing is > 0, the following label smoothing is used:
oneHotLabel = (1.0 - labelSmoothing) * oneHotLabel + labelSmoothing / numClasses
oneHotLabels (NUMERIC) - Label array. Should be one-hot per example and same shape as predictions (for example, [mb, nOut])
logitPredictions (NUMERIC) - Predictions array (pre-softmax)
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details. Default: LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
labelSmoothing - Label smoothing value. Default: 0.0
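For a single example, the softmax-then-cross-entropy computation can be sketched as follows (with the usual max-shift for numerical stability; illustrative only, not the library's implementation):

static double softmaxCrossEntropy(double[] oneHotLabel, double[] logits) {
    double max = Double.NEGATIVE_INFINITY;
    for (double l : logits) max = Math.max(max, l); // shift logits for stability
    double sumExp = 0.0;
    for (double l : logits) sumExp += Math.exp(l - max);
    double loss = 0.0;
    for (int c = 0; c < logits.length; c++) {
        double logP = (logits[c] - max) - Math.log(sumExp); // log(softmax(logits)[c])
        loss -= oneHotLabel[c] * logP; // -sum_c label[c] * log(p[c])
    }
    return loss;
}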
sparseSoftmaxCrossEntropy
As per softmaxCrossEntropy(String, SDVariable, SDVariable, LossReduce), but the labels variable
is represented as an integer array instead of the equivalent one-hot array. That is, if logits are rank N, then labels have rank N-1.
logits (NUMERIC) - Logits array ("pre-softmax activations")
labels (INT) - Labels array. Must be an integer type.
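Because the label is a single class index per example, only the true class's log-probability contributes, so no one-hot array need ever be materialized. An illustrative per-example sketch:

static double sparseSoftmaxCrossEntropy(int labelIndex, double[] logits) {
    double max = Double.NEGATIVE_INFINITY;
    for (double l : logits) max = Math.max(max, l); // shift logits for stability
    double sumExp = 0.0;
    for (double l : logits) sumExp += Math.exp(l - max);
    // -log(softmax(logits)[labelIndex])
    return -((logits[labelIndex] - max) - Math.log(sumExp));
}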
weightedCrossEntropyWithLogits
Weighted cross entropy loss with logits
targets (NUMERIC) - Targets array
inputs (NUMERIC) - Input array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
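One common definition of this loss (as in TensorFlow's weighted_cross_entropy_with_logits) scales only the positive-target term by the weight; assuming the same semantics here (an assumption, not confirmed by this page), a naive per-element sketch:

static double weightedXentElement(double target, double logit, double weight) {
    // weight scales the positive term only (assumed semantics); a production
    // version would use a numerically stable rewriting rather than raw sigmoid
    double sig = 1.0 / (1.0 + Math.exp(-logit));
    return -(weight * target * Math.log(sig)
            + (1.0 - target) * Math.log(1.0 - sig));
}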