Absolute difference loss: sum_i abs( label[i] - predictions[i] )
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
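A minimal sketch of calling this loss via the SameDiff loss namespace (the variable names, shapes, and the exact overload used here are illustrative assumptions based on the parameter list above):

```java
import org.nd4j.autodiff.loss.LossReduce;
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;

public class AbsoluteDifferenceLossExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        // Minibatch of 10-dimensional labels and predictions (shapes are illustrative)
        SDVariable label = sd.placeHolder("label", DataType.FLOAT, -1, 10);
        SDVariable predictions = sd.placeHolder("predictions", DataType.FLOAT, -1, 10);
        // weights = null: every element receives a weight of 1.0
        SDVariable loss = sd.loss().absoluteDifference("loss", label, predictions, null,
                LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT);
    }
}
```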
Cosine distance loss: 1 - cosineSimilarity(x,y), or 1 - sum_i label[i] * prediction[i], which is
equivalent to cosine distance when both the predictions and labels are normalized.
Note: This loss function assumes that both the predictions and labels are normalized to have unit l2 norm.
If this is not the case, you should normalize them first by dividing by norm2(String, SDVariable, boolean, int...)
along the cosine distance dimension (with keepDims=true).
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
dimension - Dimension to perform the cosine distance over
Hinge loss: a loss function used for training classifiers.
Implements L = max(0, 1 - t * predictions)
where t is the label values after internally converting to {-1, 1}
from the user-specified {0, 1}. Note that labels should be provided with values {0, 1}.
label (NUMERIC) - Label array. Each value should be 0.0 or 1.0 (internally -1 to 1 is used)
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
Huber loss function, used for robust regression. It is similar to both squared error loss and absolute difference loss,
though it is less sensitive to outliers than squared error. Huber loss implements:
L = 0.5 * (label[i] - predictions[i])^2 if abs(label[i] - predictions[i]) < delta
L = delta * (abs(label[i] - predictions[i]) - 0.5 * delta) otherwise
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
delta - Loss function delta value
L2 loss: 1/2 * sum(x^2)
var (NUMERIC) - Variable to calculate L2 loss of
Log loss, i.e., binary cross entropy loss, usually used for binary multi-label classification. Implements:
-1/numExamples * sum_i (labels[i] * log(predictions[i] + epsilon) + (1-labels[i]) * log(1-predictions[i] + epsilon))
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
epsilon - Epsilon value added inside the log for numerical stability - default = 0.0
Log poisson loss: a loss function used for training classifiers.
Implements L = exp(c) - z * c
where c is log(predictions) and z is labels.
label (NUMERIC) - Label array. Each value should be 0.0 or 1.0
predictions (NUMERIC) - Predictions array (has to be log(x) of actual predictions)
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
full - Boolean flag. true for logPoissonFull, false for logPoisson
Mean pairwise squared error. MPWSE loss calculates the difference between pairs of consecutive elements in the predictions and labels arrays.
For example, if predictions = [p0, p1, p2] and labels are [l0, l1, l2] then MPWSE is:
[((p0-p1) - (l0-l1))^2 + ((p0-p2) - (l0-l2))^2 + ((p1-p2) - (l1-l2))^2] / 3
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used. Must be either null, scalar, or have shape [batchSize]
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
Mean squared error loss function. Implements (label[i] - prediction[i])^2
- i.e., squared error on a per-element basis.
When averaged (using LossReduce#MEAN_BY_WEIGHT or LossReduce#MEAN_BY_NONZERO_WEIGHT_COUNT (the default))
this is the mean squared error loss function.
label (NUMERIC) - Label array
predictions (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
Sigmoid cross entropy: applies the sigmoid activation function on the input logits (input "pre-sigmoid predictions")
and implements the binary cross entropy loss function. This implementation is numerically more stable than using
a standard (but separate) sigmoid activation function and log loss (binary cross entropy) loss function. Implements:
-1/numExamples * sum_i (labels[i] * log(sigmoid(logits[i])) + (1-labels[i]) * log(1-sigmoid(logits[i])))
though this is done in a mathematically equivalent but more numerically stable form.
When label smoothing is > 0, the following label smoothing is used:
labels = (1.0 - labelSmoothing) * labels + 0.5 * labelSmoothing
label (NUMERIC) - Label array
predictionLogits (NUMERIC) - Predictions array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
labelSmoothing - Label smoothing value - default = 0.0
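A hedged sketch of the logits-based call; the six-argument overload shown here is assumed from the parameter list above, and shapes are illustrative:

```java
import org.nd4j.autodiff.loss.LossReduce;
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;

public class SigmoidCrossEntropyExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable label = sd.placeHolder("label", DataType.FLOAT, -1, 1);
        // Raw logits: the op applies the sigmoid internally, in a numerically stable form
        SDVariable logits = sd.placeHolder("logits", DataType.FLOAT, -1, 1);
        SDVariable loss = sd.loss().sigmoidCrossEntropy("loss", label, logits, null,
                LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT, 0.0);
    }
}
```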
Applies the softmax activation function to the input, then implements multi-class cross entropy:
-sum_classes label[c] * log(p[c]) where p = softmax(logits)
If LossReduce#NONE is used, the returned shape is [numExamples] for [numExamples, numClasses] predictions/labels;
otherwise, the output is a scalar.
When label smoothing is > 0, the following label smoothing is used:
oneHotLabels = (1.0 - labelSmoothing) * oneHotLabels + labelSmoothing / numClasses
oneHotLabels (NUMERIC) - Label array. Should be one-hot per example and same shape as predictions (for example, [mb, nOut])
logitPredictions (NUMERIC) - Predictions array (pre-softmax)
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
lossReduce - Reduction type for the loss. See LossReduce for more details - default = LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT
labelSmoothing - Label smoothing value - default = 0.0
As per softmaxCrossEntropy(String, SDVariable, SDVariable, LossReduce) but the labels variable
is represented as an integer array instead of the equivalent one-hot array. i.e., if logits are rank N, then labels have rank N-1
logits (NUMERIC) - Logits array ("pre-softmax activations")
labels (INT) - Labels array. Must be an integer type.
Weighted cross entropy loss with logits
targets (NUMERIC) - targets array
inputs (NUMERIC) - input array
weights (NUMERIC) - Weights array. May be null. If null, a weight of 1.0 is used
2D Convolution layer operation - average pooling 2d
input (NUMERIC) - the input to average pooling 2d operation - 4d CNN (image) activations in NCHW format (shape [minibatch, channels, height, width]) or NHWC format (shape [minibatch, height, width, channels])
Pooling2DConfig - see Pooling2DConfig (parameters listed below)
3D convolution layer operation - average pooling 3d
input (NUMERIC) - the input to average pooling 3d operation - 5d activations in NCDHW format (shape [minibatch, channels, depth, height, width]) or NDHWC format (shape [minibatch, depth, height, width, channels])
Pooling3DConfig - see Pooling3DConfig (parameters listed below)
Convolution 2d layer batch to space operation on 4d input.
Reduces the input batch dimension by rearranging data into larger spatial dimensions
x (NUMERIC) - Input variable. 4d input
blocks - Block size, in the height/width dimension (Size: Exactly(count=2))
croppingTop - (Size: Exactly(count=2))
croppingBottom - (Size: Exactly(count=2))
col2im operation for use in 2D convolution operations. Outputs a 4d array with shape
[minibatch, inputChannels, height, width]
in (NUMERIC) - Input - rank 6 input with shape [minibatch, inputChannels, kernelHeight, kernelWidth, outputHeight, outputWidth]
Conv1d operation.
input (NUMERIC) - the inputs to conv1d
weights (NUMERIC) - weights for conv1d op - rank 3 array with shape [kernelSize, inputChannels, outputChannels]
bias (NUMERIC) - bias for conv1d op - rank 1 array with shape [outputChannels]. May be null.
2D Convolution operation with optional bias
layerInput (NUMERIC) - the input to conv2d operation - 4d CNN (image) activations in NCHW format
weights (NUMERIC) - Weights for the convolution operation. 4 dimensions with format [kernelHeight, kernelWidth, inputChannels, outputChannels]
bias (NUMERIC) - Optional 1D bias array with shape [outputChannels]. May be null.
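A minimal conv2d sketch. The Conv2DConfig builder fields and import path shown here are assumptions based on recent releases; adjust for your version:

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ops.impl.layers.convolution.config.Conv2DConfig;
import org.nd4j.linalg.factory.Nd4j;

public class Conv2dExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        // NCHW input: [minibatch, channels=3, height=32, width=32]
        SDVariable in = sd.placeHolder("in", DataType.FLOAT, -1, 3, 32, 32);
        // Weights: [kernelHeight, kernelWidth, inputChannels, outputChannels]
        SDVariable w = sd.var("w", Nd4j.rand(DataType.FLOAT, 3, 3, 3, 16));
        SDVariable b = sd.var("b", Nd4j.zeros(DataType.FLOAT, 16));
        Conv2DConfig conf = Conv2DConfig.builder()
                .kH(3).kW(3).sH(1).sW(1).isSameMode(true).build();
        SDVariable out = sd.cnn().conv2d(in, w, b, conf);
    }
}
```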
Convolution 3D operation with optional bias
input (NUMERIC) - the input to conv3d operation - 5d activations in NCDHW format (shape [minibatch, channels, depth, height, width]) or NDHWC format (shape [minibatch, depth, height, width, channels])
weights (NUMERIC) - Weights for conv3d. Rank 5 with shape [kernelDepth, kernelHeight, kernelWidth, inputChannels, outputChannels].
bias (NUMERIC) - Optional 1D bias array with shape [outputChannels]. May be null.
2D deconvolution operation with optional bias
layerInput (NUMERIC) - the input to deconvolution 2d operation - 4d CNN (image) activations in NCHW format (shape [minibatch, channels, height, width]) or NHWC format (shape [minibatch, height, width, channels])
weights (NUMERIC) - Weights for the 2d deconvolution operation. 4 dimensions with format [inputChannels, outputChannels, kernelHeight, kernelWidth]
bias (NUMERIC) - Optional 1D bias array with shape [outputChannels]. May be null.
3D CNN deconvolution operation with optional bias
input (NUMERIC) - Input array - shape [bS, iD, iH, iW, iC] (NDHWC) or [bS, iC, iD, iH, iW] (NCDHW)
weights (NUMERIC) - Weights array - shape [kD, kH, kW, oC, iC]
bias (NUMERIC) - Bias array - optional, may be null. If non-null, must have shape [outputChannels]
Convolution 2d layer depth to space operation on 4d input. Reduces the input channels dimension by rearranging data into larger spatial dimensions. Example: if input has shape [mb, 8, 2, 2] and block size is 2, then output shape is [mb, 8/(2*2), 2*2, 2*2]
= [mb, 2, 4, 4]
x (NUMERIC) - the input to depth to space 2d operation - 4d activations in NCHW format (shape [minibatch, channels, height, width]) or NHWC format (shape [minibatch, height, width, channels])
blockSize - Block size, in the height/width dimension
dataFormat - Data format: "NCHW" or "NHWC"
Depth-wise 2D convolution operation with optional bias
layerInput (NUMERIC) - the input to depth-wise conv2d operation - 4d CNN (image) activations in NCHW format
depthWeights (NUMERIC) - Depth-wise conv2d weights. 4 dimensions with format [kernelHeight, kernelWidth, inputChannels, depthMultiplier]
bias (NUMERIC) - Optional 1D bias array with shape [outputChannels]. May be null.
Dilation2D operation (doc string TODO in the original source)
df (NUMERIC) - Input array
weights (NUMERIC) - Weights array
strides - Strides (Size: Exactly(count=2))
rates - Rates (Size: Exactly(count=2))
isSameMode - If true: use same mode padding
Extract image patches
input (NUMERIC) - Input array. Must be rank 4, with shape [minibatch, height, width, channels]
kH - Kernel height
kW - Kernel width
sH - Stride height
sW - Stride width
rH - Rate height
rW - Rate width
sameMode - If true: use same mode padding. If false: use valid mode padding
im2col operation for use in 2D convolution operations. Outputs a 6d array with shape
[minibatch, inputChannels, kernelHeight, kernelWidth, outputHeight, outputWidth]
in (NUMERIC) - Input - rank 4 input with shape [minibatch, inputChannels, height, width]
2D convolution layer operation - local response normalization
input (NUMERIC) - the inputs to lrn
2D Convolution layer operation - Max pooling on the input and outputs both max values and indices
input (NUMERIC) - the input to max pooling 2d operation - 4d CNN (image) activations in NCHW format (shape [minibatch, channels, height, width]) or NHWC format (shape [minibatch, height, width, channels])
2D Convolution layer operation - max pooling 2d
input (NUMERIC) - the input to max pooling 2d operation - 4d CNN (image) activations in NCHW format (shape [minibatch, channels, height, width]) or NHWC format (shape [minibatch, height, width, channels])
3D convolution layer operation - max pooling 3d operation.
input (NUMERIC) - the input to max pooling 3d operation - 5d activations in NCDHW format (shape [minibatch, channels, depth, height, width]) or NDHWC format (shape [minibatch, depth, height, width, channels])
Separable 2D convolution operation with optional bias
layerInput (NUMERIC) - the input to separable conv2d operation - 4d CNN (image) activations in NCHW format (shape [minibatch, channels, height, width]) or NHWC format (shape [minibatch, height, width, channels])
depthWeights (NUMERIC) - Separable conv2d depth weights. 4 dimensions with format [kernelHeight, kernelWidth, inputChannels, depthMultiplier]
pointWeights (NUMERIC) - Point weights, rank 4 with format [1, 1, inputChannels*depthMultiplier, outputChannels]. May be null
bias (NUMERIC) - Optional bias, rank 1 with shape [outputChannels]. May be null.
Convolution 2d layer space to batch operation on 4d input.
Increases input batch dimension by rearranging data from spatial dimensions into batch dimension
x (NUMERIC) - Input variable. 4d input
blocks - Block size, in the height/width dimension (Size: Exactly(count=2))
paddingTop - Optional 2d int[] array for padding the result: values [[pad top, pad bottom], [pad left, pad right]] (Size: Exactly(count=2))
paddingBottom - Optional 2d int[] array for padding the result: values [[pad top, pad bottom], [pad left, pad right]] (Size: Exactly(count=2))
Convolution 2d layer space to depth operation on 4d input. Increases the input channels dimension (and reduces the spatial dimensions) by rearranging data into a larger channels dimension. Example: if input has shape [mb, 2, 4, 4] and block size is 2, then output shape is [mb, 2*(2*2), 4/2, 4/2]
= [mb, 8, 2, 2]
x (NUMERIC) - the input to space to depth 2d operation - 4d activations in NCHW format (shape [minibatch, channels, height, width]) or NHWC format (shape [minibatch, height, width, channels])
blockSize - Block size, in the height/width dimension
dataFormat - Data format: "NCHW" or "NHWC"
Upsampling layer for 2D inputs.
scale is used for both height and width dimensions.
input (NUMERIC) - Input in NCHW format
scale - The scale for both height and width dimensions.
2D Convolution layer operation - Upsampling 2d
input (NUMERIC) - Input in NCHW format
scaleH - Scale to upsample in height dimension
scaleW - Scale to upsample in width dimension
nchw - If true: input is in NCHW (minibatch, channels, height, width) format. False: NHWC format
3D Convolution layer operation - Upsampling 3d
input (NUMERIC) - Input in NCDHW format
ncdhw - If true: input is in NCDHW (minibatch, channels, depth, height, width) format. False: NDHWC format
scaleD - Scale to upsample in depth dimension
scaleH - Scale to upsample in height dimension
scaleW - Scale to upsample in width dimension
k (LONG) - Kernel - default = -1
s (LONG) - stride - default = 1
p (LONG) - padding - default = 0
d (LONG) - dilation - default = 1
isSameMode (BOOL) - Same mode - default = true
dataFormat (STRING) - Data format - default = NCW
kH (LONG) - Kernel height - default = -1
kW (LONG) - Kernel width - default = -1
sH (LONG) - Stride along height dimension - default = 1
sW (LONG) - Stride along width dimension - default = 1
pH (LONG) - Padding along height dimension - default = 0
pW (LONG) - Padding along width dimension - default = 0
dH (LONG) - Dilation along height dimension - default = 1
dW (LONG) - Dilation along width dimension - default = 1
isSameMode (BOOL) - Same mode - default = true
dataFormat (STRING) - Data format - default = NCHW
kD (LONG) - Kernel depth - default = -1
kW (LONG) - Kernel width - default = -1
kH (LONG) - Kernel height - default = -1
sD (LONG) - Stride depth - default = 1
sW (LONG) - Stride width - default = 1
sH (LONG) - Stride height - default = 1
pD (LONG) - Padding depth - default = 0
pW (LONG) - Padding width - default = 0
pH (LONG) - Padding height - default = 0
dD (LONG) - Dilation depth - default = 1
dW (LONG) - Dilation width - default = 1
dH (LONG) - Dilation height - default = 1
biasUsed (BOOL) - biasUsed - default = false
isSameMode (BOOL) - Same mode - default = true
dataFormat (STRING) - Data format - default = NDHWC
kH (LONG) - Kernel height - default = -1
kW (LONG) - Kernel width - default = -1
sH (LONG) - Stride along height dimension - default = 1
sW (LONG) - Stride along width dimension - default = 1
pH (LONG) - Padding along height dimension - default = 0
pW (LONG) - Padding along width dimension - default = 0
dH (LONG) - Dilation along height dimension - default = 1
dW (LONG) - Dilation along width dimension - default = 1
isSameMode (BOOL) - Same mode - default = false
dataFormat (STRING) - Data format - default = NCHW
kD (LONG) - Kernel depth - default = -1
kW (LONG) - Kernel width - default = -1
kH (LONG) - Kernel height - default = -1
sD (LONG) - Stride depth - default = 1
sW (LONG) - Stride width - default = 1
sH (LONG) - Stride height - default = 1
pD (LONG) - Padding depth - default = 0
pW (LONG) - Padding width - default = 0
pH (LONG) - Padding height - default = 0
dD (LONG) - Dilation depth - default = 1
dW (LONG) - Dilation width - default = 1
dH (LONG) - Dilation height - default = 1
isSameMode (BOOL) - Same mode - default = false
dataFormat (STRING) - Data format - default = NCDHW
kH (LONG) - Kernel height - default = -1
kW (LONG) - Kernel width - default = -1
sH (LONG) - Stride along height dimension - default = 1
sW (LONG) - Stride along width dimension - default = 1
pH (LONG) - Padding along height dimension - default = 0
pW (LONG) - Padding along width dimension - default = 0
dH (LONG) - Dilation along height dimension - default = 1
dW (LONG) - Dilation along width dimension - default = 1
isSameMode (BOOL) - Same mode - default = true
dataFormat (STRING) - Data format - default = nchw
kD (LONG) - Kernel depth - default = -1
kW (LONG) - Kernel width - default = -1
kH (LONG) - Kernel height - default = -1
sD (LONG) - Stride depth - default = 1
sW (LONG) - Stride width - default = 1
sH (LONG) - Stride height - default = 1
pD (LONG) - Padding depth - default = 0
pW (LONG) - Padding width - default = 0
pH (LONG) - Padding height - default = 0
dD (LONG) - Dilation depth - default = 1
dW (LONG) - Dilation width - default = 1
dH (LONG) - Dilation height - default = 1
isSameMode (BOOL) - Same mode - default = true
dataFormat (STRING) - Data format - default = NCDHW
alpha (NUMERIC) - alpha - default = 1
beta (NUMERIC) - beta - default = 0.5
bias (NUMERIC) - bias - default = 1
depth (INT) - depth - default = 5
Generate a new random INDArray, where values are randomly sampled according to a Bernoulli distribution,
with the specified probability. Array values will have value 1 with probability P and value 0 with probability
1-P.
p - Probability of value 1
datatype - Data type of the output variable
shape - Shape of the new random INDArray, as a 1D array (Size: AtLeast(min=0))
Generate a new random INDArray, where values are randomly sampled according to a Binomial distribution,
with the specified number of trials and probability.
nTrials - Number of trials parameter for the binomial distribution
p - Probability of success for each trial
datatype - Data type of the output variable
shape - Shape of the new random INDArray, as a 1D array (Size: AtLeast(min=0))
Generate a new random INDArray, where values are randomly sampled according to an exponential distribution:
P(x) = lambda * exp(-lambda * x)
lambda - lambda parameter
datatype - Data type of the output variable
shape - Shape of the new random INDArray, as a 1D array (Size: AtLeast(min=0))
Generate a new random INDArray, where values are randomly sampled according to a Log Normal distribution,
i.e., log(x) ~ N(mean, stdev)
mean - Mean value for the random array
stddev - Standard deviation for the random array
datatype - Data type of the output variable
shape - Shape of the new random INDArray, as a 1D array (Size: AtLeast(min=0))
Generate a new random INDArray, where values are randomly sampled according to a Gaussian (normal) distribution,
N(mean, stdev)
mean - Mean value for the random array
stddev - Standard deviation for the random array
datatype - Data type of the output variable
shape - Shape of the new random INDArray, as a 1D array (Size: AtLeast(min=0))
Generate a new random INDArray, where values are randomly sampled according to a Gaussian (normal) distribution,
N(mean, stdev). However, any values more than 1 standard deviation from the mean are dropped and re-sampled
mean - Mean value for the random array
stddev - Standard deviation for the random array
datatype - Data type of the output variable
shape - Shape of the new random INDArray, as a 1D array (Size: AtLeast(min=0))
Generate a new random INDArray, where values are randomly sampled according to a uniform distribution,
U(min,max)
min - Minimum value
max - Maximum value.
datatype - Data type of the output variable
shape - Shape of the new random INDArray, as a 1D array (Size: AtLeast(min=0))
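A short sketch of the random namespace; the argument orders follow the parameter listings above, and the shapes are illustrative assumptions:

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;

public class RandomOpsExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        // Bernoulli: each element is 1 with probability 0.5, else 0
        SDVariable mask = sd.random().bernoulli(0.5, DataType.FLOAT, 3, 4);
        // Uniform U(0,1) and Gaussian N(0,1), both shape [2,5]
        SDVariable u = sd.random().uniform(0.0, 1.0, DataType.FLOAT, 2, 5);
        SDVariable n = sd.random().normal(0.0, 1.0, DataType.FLOAT, 2, 5);
    }
}
```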
The convolution ops above each take a corresponding configuration object: Conv1DConfig, Conv2DConfig, Conv3DConfig, DeConv2DConfig, DeConv3DConfig, LocalResponseNormalizationConfig, Pooling2DConfig, or Pooling3DConfig - see the parameter listings above.
These ops are generally available directly on SameDiff instances. Due to an oversight before the release, these ops aren't also available on Nd4j. To use the INDArray variants of these operations, you will have to instantiate an NDBase instance.
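A sketch of that workaround. The NDBase package location (org.nd4j.linalg.factory.ops) and the argmax signature are assumptions based on the generated API in recent releases:

```java
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.factory.ops.NDBase;

public class NDBaseExample {
    public static void main(String[] args) {
        NDBase base = new NDBase();
        INDArray x = Nd4j.rand(DataType.FLOAT, 2, 3);
        // INDArray variant of the argmax op documented below
        INDArray idx = base.argmax(x, false, 1);
        System.out.println(idx);
    }
}
```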
Boolean and array reduction operation, optionally along specified dimensions
x (NDARRAY) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Boolean or array reduction operation, optionally along specified dimensions
x (NDARRAY) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Argmax array reduction operation, optionally along specified dimensions.
Output values are the index of the maximum value of each slice along the specified dimension.
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
in (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
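A small worked example of argmax with keepDims (input values chosen for illustration):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.factory.Nd4j;

public class ArgmaxExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable x = sd.var("x", Nd4j.createFromArray(new float[][]{{1, 5, 2}, {7, 3, 9}}));
        // Reduce along dimension 1; keepDims=true keeps rank 2, giving shape [2,1]
        SDVariable am = sd.argmax("am", x, true, 1);
        System.out.println(am.eval()); // indices of the row maxima: [[1], [2]]
    }
}
```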
Argmin array reduction operation, optionally along specified dimensions.
Output values are the index of the minimum value of each slice along the specified dimension.
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
in (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Matrix multiply a batch of matrices. matricesA and matricesB have to be arrays of the same
length, and each pair taken from these sets has to have dimensions (M, N) and (N, K),
respectively. If transposeA is true, matrices from matricesA will have shape (N, M) instead.
Likewise, if transposeB is true, matrices from matricesB will have shape (K, N).
The result of this operation will be a batch of multiplied matrices. The
result has the same length as both input batches and each output matrix is of shape (M, K).
inputsA (NUMERIC) - First array of input matrices, all of shape (M, N) or (N, M)
inputsB (NUMERIC) - Second array of input matrices, all of shape (N, K) or (K, N)
transposeA - Whether to transpose A arrays or not - default = false
transposeB - Whether to transpose B arrays or not - default = false
Cast the array to a new datatype - for example, Integer -> Float
arg (NDARRAY) - Input variable to cast
datatype - Datatype to cast to
Concatenate a set of inputs along the specified dimension.
Note that inputs must have identical rank and identical dimensions, other than the dimension to concatenate on.
For example, if 2 inputs have shape [a, x, c] and [a, y, c] and dimension = 1, then the output has shape [a, x+y, c]
inputs (NUMERIC) - Input variables
dimension - Dimension to concatenate on
Cumulative product operation.
For input: [ a, b, c], output is:
exclusive=false, reverse=false: [a, a*b, a*b*c]
exclusive=true, reverse=false: [1, a, a*b]
exclusive=false, reverse=true: [a*b*c, b*c, c]
exclusive=true, reverse=true: [b*c, c, 1]
in (NUMERIC) - Input variable
exclusive - If true: exclude the first value - default = false
reverse - If true: reverse the direction of the accumulation - default = false
axis - Scalar axis argument for dimension to perform cumulative product operations along (Size: AtLeast(min=1))
Cumulative sum operation.
For input: [ a, b, c], output is:
exclusive=false, reverse=false: [a, a+b, a+b+c]
exclusive=true, reverse=false: [0, a, a+b]
exclusive=false, reverse=true: [a+b+c, b+c, c]
exclusive=true, reverse=true: [b+c, c, 0]
in (NUMERIC) - Input variable
exclusive - If true: exclude the first value - default = false
reverse - If true: reverse the direction of the accumulation - default = false
axis - Scalar axis argument for dimension to perform cumulative sum operations along (Size: AtLeast(min=1))
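A quick check of the exclusive flag on a concrete vector (assuming the five-argument named overload shown):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.factory.Nd4j;

public class CumsumExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable x = sd.var("x", Nd4j.createFromArray(1f, 2f, 3f));
        SDVariable inclusive = sd.cumsum("inc", x, false, false, 0); // [1, 3, 6]
        SDVariable exclusive = sd.cumsum("exc", x, true, false, 0);  // [0, 1, 3]
        System.out.println(inclusive.eval());
        System.out.println(exclusive.eval());
    }
}
```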
Pairwise dot product reduction along dimension
output = sum(i=0 ... size(dim)-1) x[i] * y[i]
x (NUMERIC) - first input
y (NUMERIC) - second input
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Dynamically partition the input variable values into the specified number of partitions, using the indices.
Example:
input: [4, 3, 5, 7, 8, 0]
partitions: [1, 0, 0, 2, 1, 0]
numPartitions: 3
out[0]: [3, 5, 0]
out[1]: [4, 8]
out[2]: [7]
x (NUMERIC) - Input variable
partitions (INT) - 1D input with values 0 to numPartitions-1
numPartitions - Number of partitions, >= 1
Dynamically merge the specified input arrays into a single array, using the specified indices
indices (INT) - Indices to use when merging. Must be >= 1, same length as input variables
x (NUMERIC) - Input variables.
Equals operation: elementwise x == y
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input array
y - Double value argument to use in operation
Equal to operation: elementwise x == y
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input 1
y (NUMERIC) - Input 2
Reshape the input by adding a 1 at the specified location.
For example, if input has shape [a, b], then output shape is:
axis = 0: [1, a, b]
axis = 1: [a, 1, b]
axis = 2: [a, b, 1]
x (NDARRAY) - Input variable
axis - Axis to expand
Generate an output variable with the specified (dynamic) shape with all elements set to the specified value
shape (INT) - Shape: must be a 1D array/variable
dataType - Datatype of the output array
value - Value to set all elements to
Gather slices from the input variable where the indices are specified as fixed int[] values.
Output shape is same as input shape, except for axis dimension, which has size equal to indices.length.
df (NUMERIC) - Input variable
indices - Indices to get (Size: AtLeast(min=1))
axis - Axis that the indices refer to
Gather slices from the input variable where the indices are specified as dynamic array values.
Output shape is same as input shape, except for axis dimension, which has size equal to indices.length.
df (NUMERIC) - Input variable
indices (INT) - Indices to get slices for. Rank 0 or 1 input
axis - Axis that the indices refer to
Gather slices from df with shape specified by indices.
df (NUMERIC) - Input array
indices (NUMERIC) - Index array
Greater than operation: elementwise x > y
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input array
y - Double value argument to use in operation
Greater than operation: elementwise x > y
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input 1
y (NUMERIC) - Input 2
Greater than or equals operation: elementwise x >= y
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input array
y - Double value argument to use in operation
Greater than or equal to operation: elementwise x >= y
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input 1
y (NUMERIC) - Input 2
Elementwise identity operation: out = x
input (NUMERIC) - Input variable
Compute the inverse permutation indices for a permutation operation
Example: if input is [2, 0, 1] then output is [1, 2, 0]
The idea is that x.permute(input).permute(invertPermutation(input)) == x
input (INT) - 1D indices for permutation
Is the given variable a numeric tensor? In the current version of ND4J/SameDiff, this always returns true/1
x (NUMERIC) - Input variable
Create a new 1d array with values evenly spaced between values 'start' and 'stop'
For example, linspace(start=3.0, stop=4.0, number=3) will generate [3.0, 3.5, 4.0]
dataType - Data type of the output array
start - Start value
stop - Stop value
number - Number of values to generate
Create a new 1d array with values evenly spaced between values 'start' and 'stop'
For example, linspace(start=3.0, stop=4.0, number=3) will generate [3.0, 3.5, 4.0]
start (NUMERIC) - Start value
stop (NUMERIC) - Stop value
number (LONG) - Number of values to generate
dataType - Data type of the output array
Less than operation: elementwise x < y
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input array
y - Double value argument to use in operation
Less than operation: elementwise x < y
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input 1
y (NUMERIC) - Input 2
Less than or equals operation: elementwise x <= y
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input array
y - Double value argument to use in operation
Less than or equal to operation: elementwise x <= y
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input 1
y (NUMERIC) - Input 2
Returns a boolean mask of equal shape to the input, where the condition is satisfied - value 1 where satisfied, 0 otherwise
in (NUMERIC) - Input
condition - Condition
Returns a count of the number of elements that satisfy the condition
in (NUMERIC) - Input
condition - Condition
Returns a count of the number of elements that satisfy the condition (for each slice along the specified dimensions)
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
in (NUMERIC) - Input variable
condition - Condition
keepDim - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Max array reduction operation, optionally along specified dimensions
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Element-wise maximum operation: out[i] = max(first[i], second[i])
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
first (NUMERIC) - First input array
second (NUMERIC) - Second input array
Mean (average) array reduction operation, optionally along specified dimensions
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
The merge operation is a control operation that forwards either of the inputs to the output, when
the first of them becomes available. If both are available, the output is undefined (either input could
be forwarded to the output)
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Minimum array reduction operation, optionally along specified dimensions. out = min(in)
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Element-wise minimum operation: out[i] = min(first[i], second[i])
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
first (NUMERIC) - First input array
second (NUMERIC) - Second input array
Matrix multiplication: out = mmul(x,y)
Supports specifying transpose argument to perform operation such as mmul(a^T, b), etc.
x (NUMERIC) - First input variable
y (NUMERIC) - Second input variable
transposeX - Transpose x (first argument) - default = false
transposeY - Transpose y (second argument) - default = false
transposeZ - Transpose result array - default = false
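For example, the transpose flags let you compute x^T mmul y without materializing the transpose (shapes here are illustrative):

```java
import java.util.Arrays;
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.factory.Nd4j;

public class MmulExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable x = sd.var("x", Nd4j.rand(DataType.FLOAT, 2, 3));
        SDVariable y = sd.var("y", Nd4j.rand(DataType.FLOAT, 2, 4));
        // transposeX=true: computes x^T (3x2) mmul y (2x4) -> shape [3,4]
        SDVariable out = sd.mmul("out", x, y, true, false, false);
        System.out.println(Arrays.toString(out.eval().shape())); // [3, 4]
    }
}
```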
Not equals operation: elementwise x != y
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input array
y - Double value argument to use in operation
Not equal to operation: elementwise x != y
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Return boolean array with values true where satisfied, or false otherwise.
x (NUMERIC) - Input 1
y (NUMERIC) - Input 2
Norm1 (L1 norm) reduction operation: The output contains the L1 norm for each tensor/subset along the specified dimensions:
out = sum_i abs(x[i])
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - dimensions to reduce over (Size: AtLeast(min=0))
Norm2 (L2 norm) reduction operation: The output contains the L2 norm for each tensor/subset along the specified dimensions:
out = sqrt(sum_i x[i]^2)
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over (Size: AtLeast(min=0))
Max norm (infinity norm) reduction operation: The output contains the max norm for each tensor/subset along the
specified dimensions:
out = max(abs(x[i]))
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - dimensions to reduce over (Size: AtLeast(min=0))
Convert the array to a one-hot array with values 'on' and 'off' for each entry.
If input has shape [ a, ..., n] then output has shape [ a, ..., n, depth],
with out[i, ..., j, in[i,...,j]] = on, and all other values set to off
indices (NUMERIC) - Indices - value 0 to depth-1
depth - Number of classes
axis - Axis to add the one-hot dimension on
on - Value to use where the index matches ('on' value)
off - Value to use for all other entries ('off' value)
dataType - Output data type - default = DataType.FLOAT
Convert the array to a one-hot array with values 0 and 1 for each entry.
If input has shape [ a, ..., n] then output has shape [ a, ..., n, depth],
with out[i, ..., j, in[i,...,j]] = 1, and all other values set to 0
see oneHot(SDVariable, int, int, double, double)
indices (NUMERIC) - Indices - value 0 to depth-1
depth - Number of classes
Return a variable of all 1s, with the same shape as the input variable. Note that this is dynamic:
if the input shape changes in later execution, the returned variable's shape will also be updated
input (NUMERIC) - Input INDArray
As per onesLike(String, SDVariable) but the output datatype may be specified
input (NUMERIC) - Input variable
dataType - Output data type
Array permutation operation: permute the dimensions according to the specified permutation indices.
Example: if input has shape [a,b,c] and dimensions = [2,0,1] the output has shape [c,a,b]
x (NUMERIC) - Input variable
dimensions (INT) - Permute dimensions
Array permutation operation: permute the dimensions according to the specified permutation indices.
Example: if input has shape [a,b,c] and dimensions = [2,0,1] the output has shape [c,a,b]
x (NUMERIC) - Input variable
dimensions - Permute dimensions (Size: AtLeast(min=0))
Product array reduction operation, optionally along specified dimensions
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Create a new variable with a 1d array, where the values start at from and increment by step
up to (but not including) limit.
For example, range(1.0, 3.0, 0.5) will return [1.0, 1.5, 2.0, 2.5]
from - Initial/smallest value
to - Largest value (exclusive)
step - Step size
dataType - Data type of the output array
Create a new variable with a 1d array, where the values start at from and increment by step
up to (but not including) limit.
For example, range(1.0, 3.0, 0.5) will return [1.0, 1.5, 2.0, 2.5]
from (NUMERIC) - Initial/smallest value
to (NUMERIC) - Largest value (exclusive)
step (NUMERIC) - Step size
dataType - Data type of the output array
Returns the rank (number of dimensions, i.e., length(shape)) of the specified INDArray as a 0D scalar variable
in (NUMERIC) - Input variable
Element-wise replace where condition:
out[i] = from[i] if condition(update[i]) is satisfied, or
out[i] = update[i] if condition(update[i]) is NOT satisfied
update (NUMERIC) - Source array
from (NUMERIC) - Replacement values array (used conditionally). Must be same shape as 'update' array
condition - Condition to check on update array elements
Element-wise replace where condition:
out[i] = value if condition(update[i]) is satisfied, or
out[i] = update[i] if condition(update[i]) is NOT satisfied
update (NUMERIC) - Source array
value - Value to set at the output, if the condition is satisfied
condition - Condition to check on update array elements
Reshape the input variable to the specified (fixed) shape. The output variable will have the same values as the
input, but with the specified shape.
Note that prod(shape) must equal the number of elements in the input: prod(shape) == prod(input.shape)
x (NUMERIC) - Input variable
shape (NUMERIC) - New shape for variable
Reshape the input variable to the specified (fixed) shape. The output variable will have the same values as the
input, but with the specified shape.
Note that prod(shape) must equal the number of elements in the input: prod(shape) == prod(input.shape)
x (NUMERIC) - Input variable
shape - New shape for variable (Size: AtLeast(min=0))
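A small sketch of the fixed-shape variant (the long-varargs overload shown is assumed from the parameter list above):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.factory.Nd4j;

public class ReshapeExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable x = sd.var("x", Nd4j.createFromArray(new float[][]{{1, 2, 3}, {4, 5, 6}}));
        // prod(shape) = 3*2 = 6 matches the 6 elements of x
        SDVariable r = sd.reshape("r", x, 3, 2);
        System.out.println(r.eval());
    }
}
```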
Reverse the values of an array for the specified dimensions
If input is:
[ 1, 2, 3]
[ 4, 5, 6]
then
reverse(in, 0):
[3, 2, 1]
[6, 5, 4]
reverse(in, 1):
[4, 5, 6]
[1, 2, 3]
x (NUMERIC) - Input variable
dimensions - Dimensions to reverse (Size: AtLeast(min=0))
Reverse sequence op: for each slice along dimension seqDimension, the first seqLength values are reversed
x (NUMERIC) - Input variable
seq_lengths (INT) - Length of the sequences
seqDim - Sequence dimension - default = -1
batchDim - Batch dimension - default = 0
Element-wise scalar floor modulus operation: out = floorMod(in, value).
i.e., returns the remainder after division by 'value'
in (NUMERIC) - Input variable
value - Scalar value to compare
Element-wise scalar maximum operation: out = max(in, value)
in (NUMERIC) - Input variable
value - Scalar value to compare
Element-wise scalar minimum operation: out = min(in, value)
in (NUMERIC) - Input variable
value - Scalar value to compare
Return a variable with equal shape to the input, but all elements set to value 'set'
in (NUMERIC) - Input variable
set - Value to set
Scatter addition operation.
If indices is rank 0 (a scalar), then out[index, ...] = out[index, ...] + op(updates[...])
If indices is rank 1 (a vector), then for each position i, out[indices[i], ...] = out[indices[i], ...] + op(updates[i, ...])
If indices is rank 2+, then for each position (i,...,k), out[indices[i], ..., indices[k], ...] = out[indices[i], ..., indices[k], ...] + op(updates[i, ..., k, ...])
Note that if multiple indices refer to the same location, the contributions from each are handled correctly.
ref (NUMERIC) - Initial/source variable
indices (NUMERIC) - Indices array
updates (NUMERIC) - Updates to add to the initial/source array
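A concrete sketch with a rank-1 indices array (values chosen for illustration):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.factory.Nd4j;

public class ScatterAddExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable ref = sd.var("ref", Nd4j.zeros(DataType.FLOAT, 3, 2));
        SDVariable indices = sd.constant("idx", Nd4j.createFromArray(0, 2));
        SDVariable updates = sd.var("upd", Nd4j.createFromArray(new float[][]{{1, 1}, {5, 5}}));
        // Adds [1,1] to row 0 and [5,5] to row 2 of the zero-filled source
        SDVariable out = sd.scatterAdd("out", ref, indices, updates);
        System.out.println(out.eval());
    }
}
```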
Scatter division operation.
If indices is rank 0 (a scalar), then out[index, ...] = out[index, ...] + op(updates[...])
If indices is rank 1 (a vector), then for each position i, out[indices[i], ...] = out[indices[i], ...] + op(updates[i, ...])
If indices is rank 2+, then for each position (i,...,k), out[indices[i], ..., indices[k], ...] = out[indices[i], ..., indices[k], ...] + op(updates[i, ..., k, ...])
Note that if multiple indices refer to the same location, the contributions from each are handled correctly.
ref (NUMERIC) - Initial/source variable
indices (NUMERIC) - Indices array
updates (NUMERIC) - Updates to apply to the initial/source array
Scatter max operation.
If indices is rank 0 (a scalar), then out[index, ...] = out[index, ...] + op(updates[...])
If indices is rank 1 (a vector), then for each position i, out[indices[i], ...] = out[indices[i], ...] + op(updates[i, ...])
If indices is rank 2+, then for each position (i,...,k), out[indices[i], ..., indices[k], ...] = out[indices[i], ..., indices[k], ...] + op(updates[i, ..., k, ...])
Note that if multiple indices refer to the same location, the contributions from each are handled correctly.
ref (NUMERIC) - Initial/source variable
indices (NUMERIC) - Indices array
updates (NUMERIC) - Updates to apply to the initial/source array
Scatter min operation.
If indices is rank 0 (a scalar), then out[index, ...] = out[index, ...] + op(updates[...])
If indices is rank 1 (a vector), then for each position i, out[indices[i], ...] = out[indices[i], ...] + op(updates[i, ...])
If indices is rank 2+, then for each position (i,...,k), out[indices[i], ..., indices[k], ...] = out[indices[i], ..., indices[k], ...] + op(updates[i, ..., k, ...])
Note that if multiple indices refer to the same location, the contributions from each are handled correctly.
ref (NUMERIC) - Initial/source variable
indices (NUMERIC) - Indices array
updates (NUMERIC) - Updates to apply to the initial/source array
Scatter multiplication operation.
If indices is rank 0 (a scalar), then out[index, ...] = out[index, ...] + op(updates[...])
If indices is rank 1 (a vector), then for each position i, out[indices[i], ...] = out[indices[i], ...] + op(updates[i, ...])
If indices is rank 2+, then for each position (i,...,k), out[indices[i], ..., indices[k], ...] = out[indices[i], ..., indices[k], ...] + op(updates[i, ..., k, ...])
Note that if multiple indices refer to the same location, the contributions from each are handled correctly.
ref (NUMERIC) - Initial/source variable
indices (NUMERIC) - Indices array
updates (NUMERIC) - Updates to apply to the initial/source array
Scatter subtraction operation.
If indices is rank 0 (a scalar), then out[index, ...] = out[index, ...] + op(updates[...])
If indices is rank 1 (a vector), then for each position i, out[indices[i], ...] = out[indices[i], ...] + op(updates[i, ...])
If indices is rank 2+, then for each position (i,...,k), out[indices[i], ..., indices[k], ...] = out[indices[i], ..., indices[k], ...] + op(updates[i, ..., k, ...])
Note that if multiple indices refer to the same location, the contributions from each are handled correctly.
ref (NUMERIC) - Initial/source variable
indices (NUMERIC) - Indices array
updates (NUMERIC) - Updates to apply to the initial/source array
Scatter update operation.
If indices is rank 0 (a scalar), then out[index, ...] = out[index, ...] + op(updates[...])
If indices is rank 1 (a vector), then for each position i, out[indices[i], ...] = out[indices[i], ...] + op(updates[i, ...])
If indices is rank 2+, then for each position (i,...,k), out[indices[i], ..., indices[k], ...] = out[indices[i], ..., indices[k], ...] + op(updates[i, ..., k, ...])
Note that if multiple indices refer to the same location, the contributions from each are handled correctly.
ref (NUMERIC) - Initial/source variable
indices (NUMERIC) - Indices array
updates (NUMERIC) - Updates to apply to the initial/source array
Segment max operation.
If data = [3, 6, 1, 4, 9, 2, 8]
segmentIds = [0, 0, 1, 1, 1, 2, 2]
then output = [6, 9, 8] = [max(3,6), max(1,4,9), max(2,8)]
Note that the segment IDs must be sorted from smallest to largest segment.
See unsortedSegment(String, SDVariable, SDVariable, int) ops
for the same op without this sorted requirement
data (NDARRAY) - Data to perform segment max on
segmentIds (NUMERIC) - Variable for the segment IDs
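The example above, verified in code (using the same data and segment IDs):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.factory.Nd4j;

public class SegmentMaxExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable data = sd.var("data", Nd4j.createFromArray(3f, 6f, 1f, 4f, 9f, 2f, 8f));
        // Segment IDs must be sorted from smallest to largest
        SDVariable ids = sd.constant("ids", Nd4j.createFromArray(0, 0, 1, 1, 1, 2, 2));
        SDVariable out = sd.segmentMax("out", data, ids);
        System.out.println(out.eval()); // [6, 9, 8]
    }
}
```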
Segment mean operation.
If data = [3, 6, 1, 4, 9, 2, 8]
segmentIds = [0, 0, 1, 1, 1, 2, 2]
then output = [4.5, 4.666, 5] = [mean(3,6), mean(1,4,9), mean(2,8)]
Note that the segment IDs must be sorted from smallest to largest segment.
See unsortedSegment(String, SDVariable, SDVariable, int) ops
for the same op without this sorted requirement
data (NDARRAY) - Data to perform segment mean on
segmentIds (NUMERIC) - Variable for the segment IDs
Segment min operation.
If data = [3, 6, 1, 4, 9, 2, 8]
segmentIds = [0, 0, 1, 1, 1, 2, 2]
then output = [3, 1, 2] = [min(3,6), min(1,4,9), min(2,8)]
Note that the segment IDs must be sorted from smallest to largest segment.
See unsortedSegment(String, SDVariable, SDVariable, int) ops
for the same op without this sorted requirement
data (NDARRAY) - Data to perform segment min on
segmentIds (NUMERIC) - Variable for the segment IDs
Segment product operation.
If data = [3, 6, 1, 4, 9, 2, 8]
segmentIds = [0, 0, 1, 1, 1, 2, 2]
then output = [18, 36, 16] = [prod(3,6), prod(1,4,9), prod(2,8)]
Note that the segment IDs must be sorted from smallest to largest segment.
See unsortedSegment(String, SDVariable, SDVariable, int) ops
for the same op without this sorted requirement
data (NDARRAY) - Data to perform segment product on
segmentIds (NUMERIC) - Variable for the segment IDs
Segment sum operation.
If data = [3, 6, 1, 4, 9, 2, 8]
segmentIds = [0, 0, 1, 1, 1, 2, 2]
then output = [9, 14, 10] = [sum(3,6), sum(1,4,9), sum(2,8)]
Note that the segment IDs must be sorted from smallest to largest segment.
See unsortedSegment(String, SDVariable, SDVariable, int) ops
for the same op without this sorted requirement
data (NDARRAY) - Data to perform segment sum on
segmentIds (NUMERIC) - Variable for the segment IDs
Generate a sequence mask (with values 0 or 1) based on the specified lengths
Specifically, out[i, ..., k, j] = (j < lengths[i, ..., k] ? 1.0 : 0.0)
lengths (NUMERIC) - Lengths of the sequences
maxLen - Maximum sequence length
dataType - Data type of the output mask
Generate a sequence mask (with values 0 or 1) based on the specified lengths
Specifically, out[i, ..., k, j] = (j < lengths[i, ..., k] ? 1.0 : 0.0)
lengths (NUMERIC) - Lengths of the sequences
maxLen (INT) - Maximum sequence length
dataType - Data type of the output mask
see sequenceMask(String, SDVariable, SDVariable, DataType)
lengths (NUMERIC) - Lengths of the sequences
dataType - Data type of the output mask
Returns the shape of the specified INDArray as a 1D INDArray
input (NUMERIC) - Input variable
Returns the size (number of elements, i.e., prod(shape)) of the specified INDArray as a 0D scalar variable
in (NUMERIC) - Input variable
Returns a rank 0 (scalar) variable for the size of the specified dimension.
For example, if X has shape [10,20,30] then sizeAt(X,1)=20. Similarly, sizeAt(X,-1)=30
in (NUMERIC) - Input variable
dimension - Dimension to get size of
Get a subset of the specified input, by specifying the first element and the size of the array.
For example, if input is:
[a, b, c]
[d, e, f]
then slice(input, begin=[0,1], size=[2,1]) will return:
[b]
[e]
Note that for each dimension i, begin[i] + size[i] <= input.size(i)
input (NUMERIC) - input Variable to get subset of
begin - Beginning index. Must be same length as rank of input array (Size: AtLeast(min=1))
size - Size of the output array. Must be same length as rank of input array (Size: AtLeast(min=1))
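The example above, as a code sketch (the int[]-based overload shown is assumed from the parameter list):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.factory.Nd4j;

public class SliceExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable in = sd.var("in", Nd4j.createFromArray(new float[][]{{1, 2, 3}, {4, 5, 6}}));
        // begin=[0,1], size=[2,1]: both rows, a single column starting at column 1
        SDVariable out = sd.slice("out", in, new int[]{0, 1}, new int[]{2, 1});
        System.out.println(out.eval()); // [[2], [5]]
    }
}
```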
Get a subset of the specified input, by specifying the first element and the size of the array.
For example, if input is:
[a, b, c]
[d, e, f]
then slice(input, begin=[0,1], size=[2,1]) will return:
[b]
[e]
Note that for each dimension i, begin[i] + size[i] <= input.size(i)
input (NUMERIC) - input Variable to get subset of
begin (INT) - Beginning index. Must be same length as rank of input array
size (INT) - Size of the output array. Must be same length as rank of input array
Squared L2 norm: see norm2(String, SDVariable, boolean, int...)
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Remove a single dimension of size 1.
For example, if input has shape [a,b,1,c] then squeeze(input, 2) returns an array of shape [a,b,c]
x (NUMERIC) - Input variable
axis - Size 1 dimension to remove
Stack a set of N INDArray of rank X into one rank X+1 variable.
If inputs have shape [a,b,c] then output has shape:
axis = 0: [N,a,b,c]
axis = 1: [a,N,b,c]
axis = 2: [a,b,N,c]
axis = 3: [a,b,c,N]
see unstack(String[], SDVariable, int, int)
values (NDARRAY) - Input variables to stack. Must have the same shape for all inputs
axis - Axis to stack on
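For example, stacking two rank-2 arrays on axis 0 (shapes are illustrative):

```java
import java.util.Arrays;
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.factory.Nd4j;

public class StackExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable a = sd.var("a", Nd4j.rand(DataType.FLOAT, 2, 3));
        SDVariable b = sd.var("b", Nd4j.rand(DataType.FLOAT, 2, 3));
        // Two rank-2 [2,3] inputs stacked on axis 0 -> one rank-3 [2,2,3] output
        SDVariable stacked = sd.stack(0, a, b);
        System.out.println(Arrays.toString(stacked.eval().shape())); // [2, 2, 3]
    }
}
```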
Standard deviation array reduction operation, optionally along specified dimensions
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
biasCorrected - If true: divide by (N-1) (i.e., sample stdev). If false: divide by N (population stdev)
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Get a subset of the specified input, by specifying the first element, last element, and the strides.
For example, if input is:
[a, b, c]
[d, e, f]
[g, h, i]
then stridedSlice(input, begin=[0,1], end=[3,3], strides=[2,1], all masks = 0) will return:
[b, c]
[h, i]
in (NUMERIC) - Variable to get subset of
begin - Beginning index (Size: AtLeast(min=1))
end - End index (Size: AtLeast(min=1))
strides - Stride ("step size") for each dimension. For example, stride of 2 means take every second element. (Size: AtLeast(min=1))
beginMask - Bit mask: If the ith bit is set to 1, then the value in the begin long[] is ignored, and a value of 0 is used instead for the beginning index for that dimension - default = 0
endMask - Bit mask: If the ith bit is set to 1, then the value in the end long[] is ignored, and a value of size(i)-1 is used instead for the end index for that dimension - default = 0
ellipsisMask - Bit mask: only one non-zero value is allowed here. If a non-zero value is set, then other dimensions are inserted as required at the specified position - default = 0
newAxisMask - Bit mask: if the ith bit is set to 1, then the begin/end/stride values are ignored, and a size 1 dimension is inserted at this point - default = 0
shrinkAxisMask - Bit mask: if the ith bit is set to 1, then the begin/end/stride values are ignored, and a size 1 dimension is removed at this point. Note that begin/end/stride values must result in a size 1 output for these dimensions - default = 0
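For instance, here is a minimal SameDiff sketch of the example above (assuming the overload that omits the mask arguments and uses the defaults listed):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.factory.Nd4j;

SameDiff sd = SameDiff.create();
SDVariable in = sd.constant(Nd4j.createFromArray(new float[][]{
        {1, 2, 3}, {4, 5, 6}, {7, 8, 9}}));

// begin=[0,1], end=[3,3], strides=[2,1] -> rows 0 and 2, columns 1 and 2
SDVariable subset = sd.stridedSlice(in, new long[]{0, 1}, new long[]{3, 3}, new long[]{2, 1});
```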
Sum array reduction operation, optionally along specified dimensions.
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
keepDims - If true: keep the dimensions that are reduced on (as length 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Switch operation
Predicate - if false, values are output to left (first) branch/output; if true, to right (second) branch/output
x (NDARRAY) - Input variable
predicate (BOOL) - Predicate - if false, values are output to left (first) branch/output; if true, to right (second) branch/output
Tensor matrix multiplication (tensorMmul) operation: contracts x and y over the specified dimensions
x (NUMERIC) - Input variable x
y (NUMERIC) - Input variable y
dimensionsX - dimensions for first input array (x) (Size: AtLeast(min=1))
dimensionsY - dimensions for second input array (y) (Size: AtLeast(min=1))
transposeX - Transpose x (first argument) - default = false
transposeY - Transpose y (second argument) - default = false
transposeZ - Transpose result array - default = false
Repeat (tile) the input tensor the specified number of times.
For example, if input is
[1, 2]
[3, 4]
and repeat is [2, 3]
then output is
[1, 2, 1, 2, 1, 2]
[3, 4, 3, 4, 3, 4]
[1, 2, 1, 2, 1, 2]
[3, 4, 3, 4, 3, 4]
x (NDARRAY) - Input variable
repeat (INT) - Number of times to repeat in each axis. Must have length equal to the rank of the input array
see tile(String, SDVariable, int...)
x (NDARRAY) - Input variable
repeat - Number of times to repeat in each axis (Size: AtLeast(min=1))
Matrix transpose operation: If input has shape [a,b] output has shape [b,a]
x (NDARRAY) - Input variable
Unsorted segment max operation. As per segmentMax(String, SDVariable, SDVariable) but without
the requirement for the indices to be sorted.
If data = [1, 3, 2, 6, 4, 9, 8]
segmentIds = [1, 0, 2, 0, 1, 1, 2]
then output = [6, 9, 8] = [max(3,6), max(1,4,9), max(2,8)]
data (NUMERIC) - Data (variable) to perform unsorted segment max on
segmentIds (NUMERIC) - Variable for the segment IDs
numSegments - Number of segments
Unsorted segment mean operation. As per segmentMean(String, SDVariable, SDVariable) but without
the requirement for the indices to be sorted.
If data = [1, 3, 2, 6, 4, 9, 8]
segmentIds = [1, 0, 2, 0, 1, 1, 2]
then output = [4.5, 4.666, 5] = [mean(3,6), mean(1,4,9), mean(2,8)]
data (NUMERIC) - Data (variable) to perform unsorted segment mean on
segmentIds (NUMERIC) - Variable for the segment IDs
numSegments - Number of segments
Unsorted segment min operation. As per segmentMin(String, SDVariable, SDVariable) but without
the requirement for the indices to be sorted.
If data = [1, 3, 2, 6, 4, 9, 8]
segmentIds = [1, 0, 2, 0, 1, 1, 2]
then output = [3, 1, 2] = [min(3,6), min(1,4,9), min(2,8)]
data (NUMERIC) - Data (variable) to perform unsorted segment min on
segmentIds (NUMERIC) - Variable for the segment IDs
numSegments - Number of segments
Unsorted segment product operation. As per segmentProd(String, SDVariable, SDVariable) but without
the requirement for the indices to be sorted.
If data = [1, 3, 2, 6, 4, 9, 8]
segmentIds = [1, 0, 2, 0, 1, 1, 2]
then output = [18, 36, 16] = [prod(3,6), prod(1,4,9), prod(2,8)]
data (NUMERIC) - Data (variable) to perform unsorted segment product on
segmentIds (NUMERIC) - Variable for the segment IDs
numSegments - Number of segments
Unsorted segment sqrtN operation. Simply returns the square root of the count of values in each segment
If data = [1, 3, 2, 6, 4, 9, 8]
segmentIds = [1, 0, 2, 0, 1, 1, 2]
then output = [1.414, 1.732, 1.414] = [sqrt(2), sqrt(3), sqrt(2)]
data (NUMERIC) - Data (variable) to perform unsorted segment sqrtN on
segmentIds (NUMERIC) - Variable for the segment IDs
numSegments - Number of segments
Unsorted segment sum operation. As per segmentSum(String, SDVariable, SDVariable) but without
the requirement for the indices to be sorted.
If data = [1, 3, 2, 6, 4, 9, 8]
segmentIds = [1, 0, 2, 0, 1, 1, 2]
then output = [9, 14, 10] = [sum(3,6), sum(1,4,9), sum(2,8)]
data (NUMERIC) - Data (variable) to perform unsorted segment sum on
segmentIds (NUMERIC) - Variable for the segment IDs
numSegments - Number of segments
Unstack a variable of rank X into N rank X-1 variables by taking slices along the specified axis.
If input has shape [a,b,c] then output has shape:
axis = 0: [b,c]
axis = 1: [a,c]
axis = 2: [a,b]
value (NDARRAY) - Input variable to unstack
axis - Axis to unstack on
num - Number of output variables
Variance array reduction operation, optionally along specified dimensions
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
x (NUMERIC) - Input variable
biasCorrected - If true: divide by (N-1) (i.e., sample variance). If false: divide by N (population variance)
keepDims - If true: keep the dimensions that are reduced on (as size 1). False: remove the reduction dimensions - default = false
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Return a variable of all 0s, with the same shape as the input variable. Note that this is dynamic:
if the input shape changes in later execution, the returned variable's shape will also be updated
input (NUMERIC) - Input
All operations in ND4J and SameDiff are available in "Operation Namespaces". Each namespace is available on the Nd4j and SameDiff classes with its lowercase name.
For example, if you want to use the absoluteDifference operation, it would look like this:
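A minimal sketch (the array values here are illustrative; passing null for weights uses a weight of 1.0):

```java
import org.nd4j.autodiff.loss.LossReduce;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray label = Nd4j.createFromArray(1.0, 2.0, 3.0);
INDArray predictions = Nd4j.createFromArray(1.5, 2.5, 2.0);

// The loss namespace is exposed as Nd4j.loss() - its lowercase name
INDArray loss = Nd4j.loss().absoluteDifference(label, predictions, null,
        LossReduce.MEAN_BY_NONZERO_WEIGHT_COUNT);
```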
Bitwise AND operation. Supports broadcasting.
x (INT) - First input array
y (INT) - Second input array
Roll integer bits to the left, i.e. var << 4 | var >> (32 - 4)
x (INT) - Input 1
shift (INT) - Number of bits to shift.
Roll integer bits to the right, i.e. var >> 4 | var << (32 - 4)
x (INT) - Input 1
shift (INT) - Number of bits to shift.
Shift integer bits to the left, i.e. var << 4
x (INT) - Input 1
shift (INT) - Number of bits to shift.
Shift integer bits to the right, i.e. var >> 4
x (INT) - Input 1
shift (INT) - Number of bits to shift.
Bitwise Hamming distance reduction over all elements of both input arrays. For example, if x=01100000 and y=10100000 then the bitwise Hamming distance is 2 (due to differences at positions 0 and 1)
x (INT) - First input array.
y (INT) - Second input array.
Bitwise left shift operation. Supports broadcasting.
x (INT) - Input to be bit shifted
y (INT) - Amount to shift elements of x array
Bitwise left cyclical shift operation. Supports broadcasting.
Unlike leftShift(INDArray, INDArray) the bits will "wrap around":
leftShiftCyclic(01110000, 2) -> 11000001
x (INT) - Input to be bit shifted
y (INT) - Amount to shift elements of x array
Bitwise OR operation. Supports broadcasting.
x (INT) - First input array
y (INT) - Second input array
Bitwise right shift operation. Supports broadcasting.
x (INT) - Input to be bit shifted
y (INT) - Amount to shift elements of x array
Bitwise right cyclical shift operation. Supports broadcasting.
Unlike rightShift(INDArray, INDArray) the bits will "wrap around":
rightShiftCyclic(00001110, 2) -> 10000011
x (INT) - Input to be bit shifted
y (INT) - Amount to shift elements of x array
Bitwise XOR operation (exclusive OR). Supports broadcasting.
x (INT) - First input array
y (INT) - Second input array
Clips tensor values to a maximum average L2-norm.
x (NUMERIC) - Input variable
clipValue - Value for clipping
dimensions - Dimensions to reduce over (Size: AtLeast(min=0))
Looks up ids in a list of embedding tensors.
x (NUMERIC) - Input tensor
indices (INT) - A Tensor containing the ids to be looked up.
PartitionMode - partition mode: 0 for 'mod', 1 for 'div'
Return an array of the indices of the maximum elements along the tensor dimensions
x (NUMERIC) - Input tensor
dataType - Data type - default = DataType.INT
Elementwise absolute value operation: out = abs(x)
x (NUMERIC) - Input variable
Elementwise acos (arccosine, inverse cosine) operation: out = arccos(x)
x (NUMERIC) - Input variable
Elementwise acosh (inverse hyperbolic cosine) function: out = acosh(x)
x (NUMERIC) - Input variable
Pairwise addition operation, out = x + y
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Scalar add operation, out = in + scalar
x (NUMERIC) - Input variable
value - Scalar value for op
Absolute max array reduction operation, optionally along specified dimensions: out = max(abs(x))
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Absolute mean array reduction operation, optionally along specified dimensions: out = mean(abs(x))
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Absolute min array reduction operation, optionally along specified dimensions: out = min(abs(x))
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Boolean AND operation: elementwise (x != 0) && (y != 0)
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
Returns an array with values 1 where condition is satisfied, or value 0 otherwise.
x (BOOL) - Input 1
y (BOOL) - Input 2
Elementwise asin (arcsin, inverse sine) operation: out = arcsin(x)
x (NUMERIC) - Input variable
Elementwise asinh (inverse hyperbolic sine) function: out = asinh(x)
x (NUMERIC) - Input variable
Absolute sum array reduction operation, optionally along specified dimensions: out = sum(abs(x))
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Elementwise atan (arctangent, inverse tangent) operation: out = arctangent(x)
x (NUMERIC) - Input variable
Elementwise atan (arctangent, inverse tangent) operation: out = atan2(x,y).
Similar to atan(y/x) but the signs of x and y are used to determine the quadrant of the result
y (NUMERIC) - Input Y variable
x (NUMERIC) - Input X variable
Elementwise atanh (inverse hyperbolic tangent) function: out = atanh(x)
x (NUMERIC) - Input variable
Bit shift operation
x (NUMERIC) - input
shift (NUMERIC) - shift value
Right bit shift operation
x (NUMERIC) - Input tensor
shift (NUMERIC) - shift argument
Cyclic bit shift operation
x (NUMERIC) - Input tensor
shift (NUMERIC) - shift argument
Cyclic right shift operation
x (NUMERIC) - Input tensor
shift (NUMERIC) - Shift argument
Element-wise ceiling function: out = ceil(x).
Rounds each value up to the nearest integer value (if not already an integer)
x (NUMERIC) - Input variable
Clipping by L2 norm, optionally along dimension(s)
if l2Norm(x,dimension) < clipValue, then the input is returned unmodified
Otherwise, out[i] = in[i] * clipValue / l2Norm(in, dimensions) where each value is clipped according
to the corresponding l2Norm along the specified dimensions
x (NUMERIC) - Input variable
clipValue - Clipping value (maximum l2 norm)
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Element-wise clipping function:
out[i] = in[i] if in[i] >= clipValueMin and in[i] <= clipValueMax
out[i] = clipValueMin if in[i] < clipValueMin
out[i] = clipValueMax if in[i] > clipValueMax
x (NUMERIC) - Input variable
clipValueMin - Minimum value for clipping
clipValueMax - Maximum value for clipping
Compute the 2d confusion matrix of size [numClasses, numClasses] from a pair of labels and predictions, both of
which are represented as integer values. This version assumes the number of classes is 1 + max(max(labels), max(pred))
For example, if labels = [0, 1, 1] and predicted = [0, 2, 1] then output is:
[1, 0, 0]
[0, 1, 1]
[0, 0, 0]
labels (NUMERIC) - Labels - 1D array of integer values representing label values
pred (NUMERIC) - Predictions - 1D array of integer values representing predictions. Same length as labels
dataType - Data type
Compute the 2d confusion matrix of size [numClasses, numClasses] from a pair of labels and predictions, both of
which are represented as integer values.
For example, if labels = [0, 1, 1], predicted = [0, 2, 1], and numClasses=4 then output is:
[1, 0, 0, 0]
[0, 1, 1, 0]
[0, 0, 0, 0]
[0, 0, 0, 0]
labels (NUMERIC) - Labels - 1D array of integer values representing label values
pred (NUMERIC) - Predictions - 1D array of integer values representing predictions. Same length as labels
numClasses - Number of classes
Compute the 2d confusion matrix of size [numClasses, numClasses] from a pair of labels and predictions, both of
which are represented as integer values. This version assumes the number of classes is 1 + max(max(labels), max(pred))
For example, if labels = [0, 1, 1], predicted = [0, 2, 1] and weights = [1, 2, 3] then output is:
[1, 0, 0]
[0, 3, 2]
[0, 0, 0]
labels (NUMERIC) - Labels - 1D array of integer values representing label values
pred (NUMERIC) - Predictions - 1D array of integer values representing predictions. Same length as labels
weights (NUMERIC) - Weights - 1D array of values (may be real/decimal) representing the weight/contribution of each prediction. Must be same length as both labels and predictions arrays
Compute the 2d confusion matrix of size [numClasses, numClasses] from a pair of labels and predictions, both of
which are represented as integer values.
For example, if labels = [0, 1, 1], predicted = [0, 2, 1], numClasses = 4, and weights = [1, 2, 3] then output is:
[1, 0, 0, 0]
[0, 3, 2, 0]
[0, 0, 0, 0]
[0, 0, 0, 0]
labels (NUMERIC) - Labels - 1D array of integer values representing label values
pred (NUMERIC) - Predictions - 1D array of integer values representing predictions. Same length as labels
weights (NUMERIC) - Weights - 1D array of values (may be real/decimal) representing the weight/contribution of each prediction. Must be same length as both labels and predictions arrays
numClasses - Number of classes
Elementwise cosine operation: out = cos(x)
x (NUMERIC) - Input variable
Elementwise cosh (hyperbolic cosine) operation: out = cosh(x)
x (NUMERIC) - Input variable
Cosine distance reduction operation. The output contains the cosine distance for each
tensor/subset along the specified dimensions:
out = 1.0 - cosineSimilarity(x,y)
x (NUMERIC) - Input variable x
y (NUMERIC) - Input variable y
dimensions - Dimensions to calculate cosineDistance over (Size: AtLeast(min=0))
Cosine similarity pairwise reduction operation. The output contains the cosine similarity for each tensor/subset
along the specified dimensions:
out = (sum_i x[i] * y[i]) / ( sqrt(sum_i x[i]^2) * sqrt(sum_i y[i]^2) )
x (NUMERIC) - Input variable x
y (NUMERIC) - Input variable y
dimensions - Dimensions to calculate cosineSimilarity over (Size: AtLeast(min=0))
Count non zero array reduction operation, optionally along specified dimensions: out = count(x != 0)
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Count zero array reduction operation, optionally along specified dimensions: out = count(x == 0)
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Returns the pair-wise cross product of equal size arrays a and b: a x b = ||a|| ||b|| sin(theta).
Can take rank 1 or above inputs (of equal shapes), but note that the last dimension must have size 3
a (NUMERIC) - First input
b (NUMERIC) - Second input
Element-wise cube function: out = x^3
x (NUMERIC) - Input variable
Returns an output variable with diagonal values equal to the specified values; off-diagonal values will be set to 0
For example, if input = [1,2,3], then output is given by:
[ 1, 0, 0]
[ 0, 2, 0]
[ 0, 0, 3]
Higher input ranks are also supported: if input has shape [a,...,R-1] then output[i,...,k,i,...,k] = input[i,...,k].
i.e., for input rank R, output has rank 2R
x (NUMERIC) - Input variable
Extract the diagonal part from the input array.
If input is
[ 1, 0, 0]
[ 0, 2, 0]
[ 0, 0, 3]
then output is [1, 2, 3].
Supports higher dimensions: in general, out[i,...,k] = in[i,...,k,i,...,k]
x (NUMERIC) - Input variable
Pairwise division operation, out = x / y
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Scalar division operation, out = in / scalar
x (NUMERIC) - Input variable
value - Scalar value for op
Entropy reduction: -sum(x * log(x))
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Element-wise Gaussian error function - out = erf(in)
x (NUMERIC) - Input variable
Element-wise complementary Gaussian error function - out = erfc(in) = 1 - erf(in)
x (NUMERIC) - Input variable
Euclidean distance (l2 norm, l2 distance) reduction operation. The output contains the Euclidean distance for each
tensor/subset along the specified dimensions:
out = sqrt( sum_i (x[i] - y[i])^2 )
x (NUMERIC) - Input variable x
y (NUMERIC) - Input variable y
dimensions - Dimensions to calculate euclideanDistance over (Size: AtLeast(min=0))
Elementwise exponent function: out = exp(x) = 2.71828...^x
x (NUMERIC) - Input variable
Elementwise 1.0 - exponent function: out = 1.0 - exp(x) = 1.0 - 2.71828...^x
x (NUMERIC) - Input variable
Generate an identity matrix with the specified number of rows and columns.
rows - Number of rows
As per eye(String, int, int, DataType) but with the default datatype, Eye.DEFAULT_DTYPE
rows - Number of rows
cols - Number of columns
Generate an identity matrix with the specified number of rows and columns
Example: eye(3, 2) returns:
[1, 0]
[0, 1]
[0, 0]
rows - Number of rows
cols - Number of columns
dataType - Data type
dimensions - (Size: AtLeast(min=0))
As per eye(int, int) but with the number of rows/columns specified as scalar INDArrays
rows (INT) - Number of rows
cols (INT) - Number of columns
As per eye(String, int) but with the number of rows specified as a scalar INDArray
rows (INT) - Number of rows
First index reduction operation.
Returns a variable that contains the index of the first element that matches the specified condition (for each
slice along the specified dimensions)
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
in (NUMERIC) - Input variable
condition - Condition to check on input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=1))
keepDims - If true: keep the dimensions that are reduced on (as length 1). False: remove the reduction dimensions - default = false
Element-wise floor function: out = floor(x).
Rounds each value down to the nearest integer value (if not already an integer)
x (NUMERIC) - Input variable
Pairwise floor division operation, out = floor(x / y)
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Pairwise Modulus division operation
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Scalar floor modulus operation
x (NUMERIC) - Input variable
value - Scalar value for op
Hamming distance reduction operation. The output contains the Hamming distance for each
tensor/subset along the specified dimensions:
out = count( x[i] != y[i] )
x (NUMERIC) - Input variable x
y (NUMERIC) - Input variable y
dimensions - Dimensions to calculate hammingDistance over (Size: AtLeast(min=0))
Index of the max absolute value: argmax(abs(in))
see argmax(String, INDArray, boolean, int...)
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=1))
keepDims - If true: keep the dimensions that are reduced on (as length 1). False: remove the reduction dimensions - default = false
Index of the min absolute value: argmin(abs(in))
see argmin(String, INDArray, boolean, int...)
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=1))
keepDims - If true: keep the dimensions that are reduced on (as length 1). False: remove the reduction dimensions - default = false
Is finite operation: elementwise isFinite(x)
Returns an array with the same shape/size as the input, with values 1 where condition is satisfied, or
value 0 otherwise
x (NUMERIC) - Input variable
Is infinite operation: elementwise isInfinite(x)
Returns an array with the same shape/size as the input, with values 1 where condition is satisfied, or
value 0 otherwise
x (NUMERIC) - Input variable
Is maximum operation: elementwise x == max(x)
Returns an array with the same shape/size as the input, with values 1 where condition is satisfied, or
value 0 otherwise
x (NUMERIC) - Input variable
Is Not a Number operation: elementwise isNaN(x)
Returns an array with the same shape/size as the input, with values 1 where condition is satisfied, or
value 0 otherwise
x (NUMERIC) - Input variable
Is the array non decreasing?
An array is non-decreasing if for every valid i, x[i] <= x[i+1]. For Rank 2+ arrays, values are compared
in 'c' (row major) order
x (NUMERIC) - Input variable
Is the array strictly increasing?
An array is strictly increasing if for every valid i, x[i] < x[i+1]. For Rank 2+ arrays, values are compared
in 'c' (row major) order
x (NUMERIC) - Input variable
Jaccard similarity reduction operation. The output contains the Jaccard distance for each tensor/subset along the specified dimensions
x (NUMERIC) - Input variable x
y (NUMERIC) - Input variable y
dimensions - Dimensions to calculate jaccardDistance over (Size: AtLeast(min=0))
Last index reduction operation.
Returns a variable that contains the index of the last element that matches the specified condition (for each
slice along the specified dimensions)
Note that if keepDims = true, the output variable has the same rank as the input variable,
with the reduced dimensions having size 1. This can be useful for later broadcast operations (such as subtracting
the mean along a dimension).
Example: if input has shape [a,b,c] and dimensions=[1] then output has shape:
keepDims = true: [a,1,c]
keepDims = false: [a,c]
in (NUMERIC) - Input variable
condition - Condition to check on input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=1))
keepDims - If true: keep the dimensions that are reduced on (as length 1). False: remove the reduction dimensions - default = false
Calculates difference between inputs X and Y.
x (NUMERIC) - Input variable X
y (NUMERIC) - Input variable Y
Element-wise logarithm function (base e - natural logarithm): out = log(x)
x (NUMERIC) - Input variable
Element-wise logarithm function (with specified base): out = log_base(x)
x (NUMERIC) - Input variable
base - Logarithm base
Elementwise natural logarithm of one plus x: out = log_e(1 + x)
x (NUMERIC) - Input variable
Log entropy reduction: log(-sum(x * log(x)))
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Log-sum-exp reduction (optionally along dimension).
Computes log(sum(exp(x)))
input (NUMERIC) - Input variable
dimensions - Optional dimensions to reduce along (Size: AtLeast(min=0))
Manhattan distance (l1 norm, l1 distance) reduction operation. The output contains the Manhattan distance for each
tensor/subset along the specified dimensions:
out = sum_i abs(x[i]-y[i])
x (NUMERIC) - Input variable x
y (NUMERIC) - Input variable y
dimensions - Dimensions to calculate manhattanDistance over (Size: AtLeast(min=0))
Matrix determinant op. For 2D input, this returns the standard matrix determinant.
For higher dimensional input with shape [..., m, m] the matrix determinant is returned for each
shape [m,m] sub-matrix.
in (NUMERIC) - Input
Matrix inverse op. For 2D input, this returns the standard matrix inverse.
For higher dimensional input with shape [..., m, m] the matrix inverse is returned for each
shape [m,m] sub-matrix.
in (NUMERIC) - Input
Pairwise max operation, out = max(x, y)
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - First input variable, x
y (NUMERIC) - Second input variable, y
Merge add function: merges an arbitrary number of equal shaped arrays using element-wise addition:
out = sum_i in[i]
inputs (NUMERIC) - Input variables
Merge average function: merges an arbitrary number of equal shaped arrays using element-wise mean operation:
out = mean_i in[i]
inputs (NUMERIC) - Input variables
Merge max function: merges an arbitrary number of equal shaped arrays using element-wise maximum operation:
out = max_i in[i]
inputs (NUMERIC) - Input variables
Broadcasts parameters for evaluation on an N-D grid.
inputs (NUMERIC) -
cartesian -
Pairwise min operation, out = min(x, y)
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - First input variable, x
y (NUMERIC) - Second input variable, y
Pairwise modulus (remainder) operation, out = x % y
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Calculate the mean and (population) variance for the input variable, for the specified axis
input (NUMERIC) - Input to calculate moments for
axes - Dimensions to perform calculation over (Size: AtLeast(min=0))
Pairwise multiplication operation, out = x * y
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Scalar multiplication operation, out = in * scalar
x (NUMERIC) - Input variable
value - Scalar value for op
Elementwise negative operation: out = -x
x (NUMERIC) - Input variable
Calculate the mean and variance from the sufficient statistics
counts (NUMERIC) - Rank 0 (scalar) value with the total number of values used to calculate the sufficient statistics
means (NUMERIC) - Mean-value sufficient statistics: this is the SUM of all data values
variances (NUMERIC) - Variance sufficient statistics: this is the sum of the squared data values
shift - Shift value, possibly 0, used when calculating the sufficient statistics (for numerical stability)
Boolean OR operation: elementwise (x != 0) || (y != 0)
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
Returns an array with values 1 where condition is satisfied, or value 0 otherwise.
x (BOOL) - Input 1
y (BOOL) - Input 2
Element-wise power function: out = x^value
x (NUMERIC) - Input variable
value - Scalar value for op
Element-wise (broadcastable) power function: out = x[i]^y[i]
x (NUMERIC) - Input variable
y (NUMERIC) - Power
Rational tanh approximation elementwise function, as described in the paper:
Compact Convolutional Neural Network Cascade for Face Detection
This is a faster approximation to tanh
x (NUMERIC) - Input variable
Pairwise reverse division operation, out = y / x
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Scalar reverse division operation, out = scalar / in
x (NUMERIC) - Input variable
value - Scalar value for op
Element-wise reciprocal (inverse) function: out[i] = 1 / in[i]
x (NUMERIC) - Input variable
Rectified tanh operation: max(0, tanh(in))
x (NUMERIC) - Input variable
Element-wise round function: out = round(x).
Rounds (up or down depending on value) to the nearest integer value.
x (NUMERIC) - Input variable
Element-wise reciprocal (inverse) of square root: out = 1.0 / sqrt(x)
x (NUMERIC) - Input variable
Pairwise reverse subtraction operation, out = y - x
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Scalar reverse subtraction operation, out = scalar - in
x (NUMERIC) - Input variable
value - Scalar value for op
Set the diagonal values to the specified values
If input is
[ a, b, c]
[ d, e, f]
[ g, h, i]
and diag = [ 1, 2, 3] then output is
[ 1, b, c]
[ d, 2, f]
[ g, h, 3]
in (NUMERIC) - Input variable
diag (NUMERIC) - Diagonal
Shannon Entropy reduction: -sum(x * log2(x))
in (NUMERIC) - Input variable
dimensions - Dimensions to reduce over. If dimensions are not specified, full array reduction is performed (Size: AtLeast(min=0))
Element-wise sign (signum) function:
out = -1 if in < 0
out = 0 if in = 0
out = 1 if in > 0
x (NUMERIC) - Input variable
Elementwise sine operation: out = sin(x)
x (NUMERIC) - Input variable
Elementwise sinh (hyperbolic sine) operation: out = sinh(x)
x (NUMERIC) - Input variable
Element-wise square root function: out = sqrt(x)
x (NUMERIC) - Input variable
Element-wise square function: out = x^2
x (NUMERIC) - Input variable
Pairwise squared difference operation.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Standardize input variable along given axis
out = (x - mean) / stdev
with mean and stdev being calculated along the given dimension.
For example: given x as a mini batch of the shape [numExamples, exampleLength]:
use dimension 1 to use the statistics (mean, stdev) for each example
use dimension 0 if you want to use the statistics for each column across all examples
use dimensions 0,1 if you want to use the statistics across all columns and examples
x (NUMERIC) - Input variable
dimensions - (Size: AtLeast(min=1))
Elementwise step function:
out(x) = 1 if x >= cutoff
out(x) = 0 otherwise
x (NUMERIC) - Input variable
value - Scalar value for op
Pairwise subtraction operation, out = x - y
Note: supports broadcasting if x and y have different shapes and are broadcastable.
For example, if X has shape [1,10] and Y has shape [5,10] then op(X,Y) has output shape [5,10]
Broadcast rules are the same as NumPy: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
x (NUMERIC) - Input variable
y (NUMERIC) - Input variable
Scalar subtraction operation, out = in - scalar
x (NUMERIC) - Input variable
value - Scalar value for op
Elementwise tangent operation: out = tan(x)
x (NUMERIC) - Input variable
Elementwise tanh (hyperbolic tangent) operation: out = tanh(x)
x (NUMERIC) - Input variable
Matrix trace operation
For rank 2 matrices, the output is a scalar with the trace - i.e., sum of the main diagonal.
For higher rank inputs, output[a,b,c] = trace(in[a,b,c,:,:])
in (NUMERIC) - Input variable
Boolean XOR (exclusive OR) operation: elementwise (x != 0) XOR (y != 0)
If x and y arrays have equal shape, the output shape is the same as these inputs.
Note: supports broadcasting if x and y have different shapes and are broadcastable.
Returns an array with values 1 where condition is satisfied, or value 0 otherwise.
x (BOOL) - Input 1
y (BOOL) - Input 2
Full array zero fraction array reduction operation, optionally along specified dimensions: out = (count(x == 0) / length(x))
input (NUMERIC) - Input variable
Computes the Cholesky decomposition of one or more square matrices.
input (NUMERIC) - Input tensor with inner-most 2 dimensions forming square matrices
Solver for linear least squares problems.
matrix (NUMERIC) - input tensor
rhs (NUMERIC) - input tensor
l2_regularizer - L2 regularizer term
fast - fast mode, defaults to True - default = true
Computes LU decomposition.
input (NUMERIC) - input tensor
Performs matrix multiplication on input tensors.
a (NUMERIC) - input tensor
b (NUMERIC) - input tensor
Copy a tensor, setting everything outside a central band in each innermost matrix to zero.
input (NUMERIC) - input tensor
minLower - lower diagonal count
maxUpper - upper diagonal count
Computes the QR decomposition of the input matrix.
input (NUMERIC) - input tensor
full - full matrices mode - default = false
Solver for systems of linear equations.
matrix (NUMERIC) - input tensor
rhs (NUMERIC) - input tensor
adjoint - adjoint mode, defaults to False - default = false
Solver for systems of linear equations.
matrix (NUMERIC) - input tensor
rhs (NUMERIC) - input tensor
lower - defines whether innermost matrices in matrix are lower or upper triangular
adjoint - adjoint mode
Computes pairwise cross product.
a (NUMERIC) -
b (NUMERIC) -
Calculates diagonal tensor.
input (NUMERIC) -
Calculates diagonal tensor.
input (NUMERIC) -
Calculates log of determinant.
input (NUMERIC) -
Matrix multiplication: out = mmul(x,y)
Supports specifying transpose argument to perform operation such as mmul(a^T, b), etc.
x (NUMERIC) - First input variable
y (NUMERIC) - Second input variable
transposeX - Transpose x (first argument) - default = false
transposeY - Transpose y (second argument) - default = false
transposeZ - Transpose result array - default = false
Calculates singular value decomposition.
input (NUMERIC) -
fullUV -
computeUV -
switchNum - - default = 16
An array with ones at and below the given diagonal and zeros elsewhere.
dataType - Data type - default = DataType.FLOAT
row - Number of rows
column - Number of columns
diagonal - - default = 0
Upper triangle of an array. Return a copy of an input tensor with the elements below the k-th diagonal zeroed.
input (NUMERIC) -
diag - - default = 0
Concatenates a ReLU which selects only the positive part of the activation with a ReLU which selects only the negative part of the activation. Note that as a result this non-linearity doubles the depth of the activations.
x (NUMERIC) - Input variable
Neural network batch normalization operation.
For details, see https://arxiv.org/abs/1502.03167
input (NUMERIC) - Input variable.
mean (NUMERIC) - Mean value. For 1d axis, this should match input.size(axis)
variance (NUMERIC) - Variance value. For 1d axis, this should match input.size(axis)
gamma (NUMERIC) - Gamma value. For 1d axis, this should match input.size(axis)
beta (NUMERIC) - Beta value. For 1d axis, this should match input.size(axis)
epsilon - Epsilon constant for numerical stability (to avoid division by 0)
axis - For 2d CNN activations: 1 for NCHW format activations, or 3 for NHWC format activations.
For 3d CNN activations: 1 for NCDHW format, 4 for NDHWC
For 1d/RNN activations: 1 for NCW format, 2 for NWC (Size: AtLeast(min=1))
Bias addition operation: a special case of addition, typically used with CNN 4D activations and a 1D bias vector
input (NUMERIC) - 4d input variable
bias (NUMERIC) - 1d bias
nchw - The format - nchw=true means [minibatch, channels, height, width] format; nchw=false - [minibatch, height, width, channels].
Unused for 2d inputs
This operation performs dot product attention on the given timeseries input with the given queries
out = sum(similarity(k_i, q) * v_i)
similarity(k, q) = softmax(k * q) where x * q is the dot product of x and q
Optionally with normalization step:
similarity(k, q) = softmax(k * q / sqrt(size(q)))
See also "Attention is all you need" (https://arxiv.org/abs/1706.03762, p. 4, eq. 1)
Note: This supports multiple queries at once, if only one query is available the queries vector still has to
be 3D but can have queryCount = 1
Note: keys and values are usually the same array. If so, simply pass the same array for both.
Note: Queries, keys and values must either be all rank 3 or all rank 4 arrays. Mixing them doesn't work. The
output rank will depend on the input rank.
queries (NUMERIC) - input 3D array "queries" of shape [batchSize, featureKeys, queryCount]
or 4D array of shape [batchSize, numHeads, featureKeys, queryCount]
keys (NUMERIC) - input 3D array "keys" of shape [batchSize, featureKeys, timesteps]
or 4D array of shape [batchSize, numHeads, featureKeys, timesteps]
values (NUMERIC) - input 3D array "values" of shape [batchSize, featureValues, timesteps]
or 4D array of shape [batchSize, numHeads, featureValues, timesteps]
mask (NUMERIC) - OPTIONAL; array that defines which values should be skipped of shape [batchSize, timesteps]
scaled - normalization, false -> do not apply normalization, true -> apply normalization
Dropout operation
input (NUMERIC) - Input array
inputRetainProbability - Probability of retaining an input (set to 0 with probability 1-p)
Element-wise exponential linear unit (ELU) function:
out = x if x > 0
out = a * (exp(x) - 1) if x <= 0
with constant a = 1.0
See: https://arxiv.org/abs/1511.07289
x (NUMERIC) - Input variable
GELU activation function - Gaussian Error Linear Units
For more details, see Gaussian Error Linear Units (GELUs) - https://arxiv.org/abs/1606.08415
This method uses the sigmoid approximation
x (NUMERIC) - Input variable
Element-wise hard sigmoid function:
out[i] = 0 if in[i] <= -2.5
out[i] = 0.2*in[i]+0.5 if -2.5 < in[i] < 2.5
out[i] = 1 if in[i] >= 2.5
x (NUMERIC) - Input variable
Element-wise hard tanh function:
out[i] = -1 if in[i] <= -1
out[i] = in[i] if -1 < in[i] < 1
out[i] = 1 if in[i] >= 1
x (NUMERIC) - Input variable
Derivative (dOut/dIn) of the element-wise hard Tanh function - hardTanh(INDArray)
x (NUMERIC) - Input variable
Apply Layer Normalization
y = gain * standardize(x) + bias
input (NUMERIC) - Input variable
gain (NUMERIC) - Gain
bias (NUMERIC) - Bias
channelsFirst - For 2D input - unused. True for NCHW (minibatch, channels, height, width), false for NHWC data
dimensions - Dimensions to perform layer norm over - dimension=1 for 2d/MLP data, dimension=1,2,3 for CNNs (Size: AtLeast(min=1))
Element-wise leaky ReLU function:
out = x if x >= 0.0
out = alpha * x if x < 0.0
Alpha value is most commonly set to 0.01
x (NUMERIC) - Input variable
alpha - Cutoff - commonly 0.01
Leaky ReLU derivative: dOut/dIn given input.
x (NUMERIC) - Input variable
alpha - Cutoff - commonly 0.01
Linear layer operation: out = mmul(in,w) + bias
Note that bias array is optional
input (NUMERIC) - Input data
weights (NUMERIC) - Weights variable, shape [nIn, nOut]
bias (NUMERIC) - Optional bias variable (may be null)
Element-wise log-sigmoid function: out[i] = log(sigmoid(in[i]))
x (NUMERIC) - Input variable
Log softmax activation
x (NUMERIC) - Input
Log softmax activation
x (NUMERIC) - Input
dimension - Dimension along which to apply log softmax
This performs multi-headed dot product attention on the given timeseries input
out = concat(head_1, head_2, ..., head_n) * Wo
head_i = dot_product_attention(Wq_i * q, Wk_i * k, Wv_i * v)
Optionally with normalization when calculating the attention for each head.
See also "Attention is all you need" (https://arxiv.org/abs/1706.03762, pp. 4,5, "3.2.2 Multi-Head Attention")
This makes use of dot_product_attention OP support for rank 4 inputs.
see dotProductAttention(INDArray, INDArray, INDArray, INDArray, boolean, boolean)
queries (NUMERIC) - input 3D array "queries" of shape [batchSize, featureKeys, queryCount]
keys (NUMERIC) - input 3D array "keys" of shape [batchSize, featureKeys, timesteps]
values (NUMERIC) - input 3D array "values" of shape [batchSize, featureValues, timesteps]
Wq (NUMERIC) - input query projection weights of shape [numHeads, projectedKeys, featureKeys]
Wk (NUMERIC) - input key projection weights of shape [numHeads, projectedKeys, featureKeys]
Wv (NUMERIC) - input value projection weights of shape [numHeads, projectedValues, featureValues]
Wo (NUMERIC) - output projection weights of shape [numHeads * projectedValues, outSize]
mask (NUMERIC) - OPTIONAL; array that defines which values should be skipped of shape [batchSize, timesteps]
scaled - normalization, false -> do not apply normalization, true -> apply normalization
Padding operation
input (NUMERIC) - Input tensor
padding (NUMERIC) - Padding value
PadMode - Padding format - default = CONSTANT
constant - Padding constant
GELU activation function - Gaussian Error Linear Units
For more details, see Gaussian Error Linear Units (GELUs) - https://arxiv.org/abs/1606.08415
This method uses the precise method
x (NUMERIC) - Input variable
PReLU (Parameterized Rectified Linear Unit) operation. Like LeakyReLU with a learnable alpha:
out[i] = in[i] if in[i] >= 0
out[i] = in[i] * alpha[i] otherwise
sharedAxes allows you to share learnable parameters along axes.
For example, if the input has shape [batchSize, channels, height, width]
and you want each channel to have its own cutoff, use sharedAxes = [2, 3] and an
alpha with shape [channels].
input (NUMERIC) - Input data
alpha (NUMERIC) - The cutoff variable. Note that the batch dimension (the 0th, whether it is batch or not) should not be part of alpha.
sharedAxes - Which axes to share cutoff parameters along. (Size: AtLeast(min=1))
Element-wise rectified linear function with specified cutoff:
out[i] = in[i] if in[i] >= cutoff
out[i] = 0 otherwise
x (NUMERIC) - Input
cutoff - Cutoff value for ReLU operation - x > cutoff ? x : 0. Usually 0
Element-wise "rectified linear 6" function with specified cutoff:
out[i] = min(max(in, cutoff), 6)
x (NUMERIC) - Input
cutoff - Cutoff value for ReLU operation. Usually 0
ReLU (Rectified Linear Unit) layer operation: out = relu(mmul(in,w) + bias)
Note that bias array is optional
input (NUMERIC) - Input data
weights (NUMERIC) - Weights variable
bias (NUMERIC) - Optional bias variable (may be null)
Element-wise SeLU function - Scaled exponential Linear Unit: see Self-Normalizing Neural Networks
out[i] = scale * in[i] if in[i] > 0, or scale * alpha * (exp(in[i]) - 1) if in[i] <= 0
Uses default scale and alpha values.
x (NUMERIC) - Input variable
Element-wise sigmoid function: out[i] = 1.0/(1+exp(-in[i]))
x (NUMERIC) - Input variable
Element-wise sigmoid function derivative: dL/dIn given input and dL/dOut
x (NUMERIC) - Input Variable
wrt (NUMERIC) - Gradient at the output - dL/dOut. Must have same shape as the input
Softmax activation, along the specified dimension
x (NUMERIC) - Input
dimension - Dimension along which to apply softmax - default = -1
Softmax derivative function
x (NUMERIC) - Softmax input
wrt (NUMERIC) - Gradient at output, dL/dx
dimension - Softmax dimension
Element-wise softplus function: out = log(exp(x) + 1)
x (NUMERIC) - Input variable
Element-wise softsign function: out = x / (abs(x) + 1)
x (NUMERIC) - Input variable
Element-wise derivative (dOut/dIn) of the softsign function softsign(INDArray)
x (NUMERIC) - Input variable
Element-wise "swish" function: out = x _sigmoid(b_x) with b=1.0
See: https://arxiv.org/abs/1710.05941
x (NUMERIC) - Input variable
Elementwise tanh (hyperbolic tangent) operation: out = tanh(x)
x (NUMERIC) - Input variable
Given an input image and some crop boxes, extract out the image subsets and resize them to the specified size.
image (NUMERIC) - Input image, with shape [batch, height, width, channels]
cropBoxes (NUMERIC) - Float32 crop, shape [numBoxes, 4] with values in range 0 to 1
boxIndices (NUMERIC) - Indices: which image (index to dimension 0) the cropBoxes belong to. Rank 1, shape [numBoxes]
cropOutSize (INT) - Output size for the images - int32, rank 1 with values [outHeight, outWidth]
extrapolationValue - Used for extrapolation, when applicable. 0.0 should be used for the default - default = 0.0
Adjusts contrast of RGB or grayscale images.
in (NUMERIC) - images to adjust. 3D shape or higher
factor - multiplier for adjusting contrast
Adjust hue of RGB image
in (NUMERIC) - image as 3D array
delta - value to add to hue channel
Adjust saturation of RGB images
in (NUMERIC) - RGB image as 3D array
factor - factor for saturation
Given an input image, extract out image patches (of size kSizes - h x w) and place them in the depth dimension.
image (NUMERIC) - Input image to extract image patches from - shape [batch, height, width, channels]
kSizes - Kernel size - size of the image patches, [height, width] (Size: Exactly(count=2))
strides - Stride in the input dimension for extracting image patches, [stride_height, stride_width] (Size: Exactly(count=2))
rates - Usually [1,1]. Equivalent to dilation rate in dilated convolutions - how far apart consecutive sampled pixels within each patch are in the input (Size: Exactly(count=2))
sameMode - Padding algorithm. If true: use Same padding
Converting image from HSV to RGB format
input (NUMERIC) - 3D image
Resize images to size using the specified method.
input (NUMERIC) - 4D image [NHWC]
size (INT) - new height and width
preserveAspectRatio - Whether to preserve the aspect ratio. If this is set, then images will be resized to a size that fits in size while preserving the aspect ratio of the original image. Scales up the image if size is bigger than the current size of the image. Defaults to False. - default = false
antialias - Whether to use an anti-aliasing filter when downsampling an image - default = false
ImageResizeMethod - ResizeBilinear: Bilinear interpolation. If 'antialias' is true, becomes a hat/tent filter function with radius 1 when downsampling.
ResizeLanczos5: Lanczos kernel with radius 5. Very-high-quality filter but may have stronger ringing.
ResizeBicubic: Cubic interpolant of Keys. Equivalent to Catmull-Rom kernel. Reasonably good quality and faster than Lanczos3Kernel, particularly when upsampling.
ResizeGaussian: Gaussian kernel with radius 3, sigma = 1.5 / 3.0.
ResizeNearest: Nearest neighbor interpolation. 'antialias' has no effect when used with nearest neighbor interpolation.
ResizeArea: Anti-aliased resampling with area interpolation. 'antialias' has no effect when used with area interpolation; it always anti-aliases.
ResizeMitchelcubic: Mitchell-Netravali Cubic non-interpolating filter. For synthetic images (especially those lacking proper prefiltering), less ringing than Keys cubic kernel but less sharp.
Greedily selects a subset of bounding boxes in descending order of score
boxes (NUMERIC) - 2D array of box coordinates, shape [numBoxes, 4]
scores (NUMERIC) - vector of shape [num_boxes]
maxOutSize - scalar representing the maximum number of boxes to be selected
iouThreshold - threshold for deciding whether boxes overlap too much with respect to IOU
scoreThreshold - threshold for deciding when to remove boxes based on score
Randomly crops image
input (NUMERIC) - input array
shape (INT) - shape for crop
Converting array from RGB to HSV format
input (NUMERIC) - 3D image
Converting array from RGB to YIQ format
input (NUMERIC) - 3D image
Converting array from RGB to YUV format
input (NUMERIC) - 3D image
Converting image from YIQ to RGB format
input (NUMERIC) - 3D image
Converting image from YUV to RGB format
input (NUMERIC) - 3D image
The GRU operation. Gated Recurrent Unit - Cho et al. 2014.
x (NUMERIC) - input [time, bS, nIn]
hLast (NUMERIC) - initial cell output (at time step = 0) [bS, nOut]
Wx (NUMERIC) - input-to-hidden weights, [nIn, 3*nOut]
Wh (NUMERIC) - hidden-to-hidden weights, [nOut, 3*nOut]
biases (NUMERIC) - biases, [3*nOut]
The GRU cell. Does a single time step operation
x (NUMERIC) - Input, with shape [batchSize, inSize]
hLast (NUMERIC) - Output of the previous cell/time step, with shape [batchSize, numUnits]
GRUWeights - see GRUWeights
The LSTM cell. Does a single time step operation.
x (NUMERIC) - Input, with shape [batchSize, inSize]
cLast (NUMERIC) - Previous cell state, with shape [batchSize, numUnits]
yLast (NUMERIC) - Previous cell output, with shape [batchSize, numUnits]
LSTMWeights - see LSTMWeights
LSTMConfiguration - see LSTMConfiguration
Long Short-Term Memory layer - Hochreiter 1997.
SUPPORTS following data formats:
for unidirectional:
TNS: shapes [timeLength, numExamples, inOutSize]
NST: shapes [numExamples, inOutSize, timeLength]
NTS: shapes [numExamples, timeLength, inOutSize]
for bidirectional:
T2NS: shapes [timeLength, 2, numExamples, inOutSize] (for ONNX)
SUPPORTS following direction modes:
FWD: forward
BWD: backward
BIDIR_SUM: bidirectional sum
BIDIR_CONCAT: bidirectional concat
BIDIR_EXTRA_DIM: bidirectional extra output dim (in conjunction with format dataFormat - T2NS)
You may use different gate configurations:
specify gate/cell/out alpha/beta and numbers of activations for gate/cell/out described in activations enum
("RELU","SIGMOID","AFFINE","LEAKY_RELU","THRESHHOLD_RELU","SCALED_TAHN","HARD_SIGMOID","ELU","SOFTSIGN","SOFTPLUS")
Also this layer supports MKLDNN (DNNL) and cuDNN acceleration
x (NUMERIC) - Input, with shape dependent on the data format (in config).
cLast (NUMERIC) - Previous/initial cell state, with shape [batchSize, numUnits]
yLast (NUMERIC) - Previous/initial cell output, with shape [batchSize, numUnits]
maxTSLength (NUMERIC) - maxTSLength with shape [batchSize]
LSTMLayerWeights - see LSTMLayerWeights
LSTMLayerConfig - see LSTMLayerConfig
The LSTM block
maxTSLength (NUMERIC) -
x (NUMERIC) - Input, with shape dependent on the data format (in config).
cLast (NUMERIC) - Previous/initial cell state, with shape [batchSize, numUnits]
yLast (NUMERIC) - Previous/initial cell output, with shape [batchSize, numUnits]
LSTMWeights - see LSTMWeights
LSTMConfiguration - see LSTMConfiguration
The SRU layer.
x (NUMERIC) - Input, with shape [batchSize, inSize]
initialC (NUMERIC) - Initial cell state, with shape [batchSize, inSize]
mask (NUMERIC) - An optional dropout mask, with shape [batchSize, inSize]
SRUWeights - see SRUWeights
The SRU cell. Does a single time step operation.
x (NUMERIC) - Input, with shape [batchSize, inSize]
cLast (NUMERIC) - Previous cell state, with shape [batchSize, inSize]
SRUWeights - see SRUWeights
RnnDataFormat (ENUM) - The data format of the input. Input shape depends on data format (in config):
TNS -> [timeSteps, batchSize, inSize]
NST -> [batchSize, inSize, timeSteps]
NTS -> [batchSize, timeSteps, inSize]
peepHole (BOOL) - Whether to provide peephole connections
forgetBias (NUMERIC) - The bias added to forget gates in order to reduce the scale of forgetting in the beginning of the training.
clippingCellValue (NUMERIC) - Clipping value for the cell state; if 0, clipping is not applied.
Used in these ops: lstmCell lstmblock
LSTMDataFormat (ENUM) - for unidirectional: TNS: shape [timeLength, numExamples, inOutSize] - sometimes referred to as "time major"
NST: shape [numExamples, inOutSize, timeLength]
NTS: shape [numExamples, timeLength, inOutSize] - TF "time_major=false" layout for bidirectional:
T2NS: 3 = [timeLength, 2, numExamples, inOutSize] (for ONNX)
LSTMDirectionMode (ENUM) - direction
FWD: 0 = fwd
BWD: 1 = bwd
BIDIR_SUM: 2 = bidirectional sum
BIDIR_CONCAT: 3 = bidirectional concat
BIDIR_EXTRA_DIM: 4 = bidirectional extra output dim (in conjunction with format dataFormat = 3)
gateAct (ENUM) - Activations
cellAct (ENUM) - Activations
outAct (ENUM) - Activations
retFullSequence (BOOL) - indicates whether to return whole time sequence h {h_0, h_1, ... , h_sL-1} - default = true
retLastH (BOOL) - indicates whether to return output at last time step only,
in this case shape would be [bS, nOut] (exact shape depends on dataFormat argument) - default = false
retLastC (BOOL) - indicates whether to return cells state at last time step only,
in this case shape would be [bS, nOut] (exact shape depends on dataFormat argument) - default = false
cellClip (NUMERIC) - Cell clipping value, if it = 0 then do not apply clipping - default = 0.0
gateAlpha (NUMERIC) - null - default = 0.0
gateBeta (NUMERIC) - null - default = 0.0
cellAlpha (NUMERIC) - null - default = 0.0
cellBeta (NUMERIC) - null - default = 0.0
outAlpha (NUMERIC) - null - default = 0.0
outBeta (NUMERIC) - null - default = 0.0
Used in these ops: lstmLayer
ruWeight - Reset and update gate weights (NUMERIC type)
cWeight - Cell gate weights (NUMERIC type)
ruBias - Reset and update gate bias (NUMERIC type)
cBias - Cell gate bias (NUMERIC type)
Used in these ops: gruCell
weights - SRU weights (NUMERIC type)
bias - SRU bias (NUMERIC type)
Used in these ops: sru sruCell
ruWeight - Input and recurrent weights (NUMERIC type)
inputPeepholeWeights - Input gate peephole weights (NUMERIC type)
forgetPeepholeWeights - Forget gate peephole weights (NUMERIC type)
outputPeepholeWeights - Output gate peephole weights (NUMERIC type)
bias - Bias (NUMERIC type)
Used in these ops: lstmCell lstmblock
inputWeights - Input weights Wx: 1) shape [nIn, 4*nOut] for FWD, BWD; 2) shape [2, nIn, 4*nOut] for BIDIR_SUM, BIDIR_CONCAT and BIDIR_EXTRA_DIM (NUMERIC type)
recurrentWeights - Recurrent weights Wr: 1) shape [nOut, 4*nOut] for FWD, BWD; 2) shape [2, nOut, 4*nOut] for BIDIR_SUM, BIDIR_CONCAT and BIDIR_EXTRA_DIM (NUMERIC type)
biases - Biases: 1) shape [4*nOut] for FWD, BWD; 2) shape [2, 4*nOut] for BIDIR_SUM, BIDIR_CONCAT and BIDIR_EXTRA_DIM (NUMERIC type)
peepholeWeights - Peephole weights Wp: 1) shape [3*nOut] when directionMode < 2; 2) shape [2, 3*nOut] when directionMode >= 2 (NUMERIC type)
Used in these ops: lstmLayer
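Correspondingly, a sketch of assembling LSTMLayerWeights for a unidirectional (FWD) layer, assuming builder methods matching the field names above and an existing SameDiff instance sd (the nIn/nOut values are illustrative):

```java
int nIn = 10, nOut = 20;
LSTMLayerWeights weights = LSTMLayerWeights.builder()
        .inputWeights(sd.var("Wx", DataType.FLOAT, nIn, 4 * nOut))      // [nIn, 4*nOut] for FWD
        .recurrentWeights(sd.var("Wr", DataType.FLOAT, nOut, 4 * nOut)) // [nOut, 4*nOut] for FWD
        .biases(sd.var("b", DataType.FLOAT, 4 * nOut))                  // [4*nOut] for FWD
        .build();
```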
What types of variables are used in SameDiff, their properties, and how to switch between these types.
All values defining or passing through each SameDiff instance - be it weights, biases, inputs, activations or general parameters - are handled by objects of the class SDVariable.
Observe that by variables we normally mean not just single values - as is done in various online examples describing autodifferentiation - but rather whole multidimensional arrays of them.
All variables in SameDiff belong to one of four variable types, constituting the enumeration VariableType. Here they are:
VARIABLE: the trainable parameters of your network, e.g. the weights and bias of a layer. Naturally, we want them both to be stored for further usage - we say that they are persistent - and to be updated during training.
CONSTANT: parameters which, like variables, are persistent for the network, but which are not trained; they may, however, be changed externally by the user.
PLACEHOLDER: stores temporary values that are to be supplied from the outside, like inputs and labels. Accordingly, since new placeholder values are provided at each iteration, they are not stored: in other words, unlike VARIABLE and CONSTANT, PLACEHOLDER is not persistent.
ARRAY: temporary values as well, representing the outputs of operations within a SameDiff - for instance sums of vectors, activations of a layer, and many more. They are recalculated at each iteration and therefore, like PLACEHOLDER, are not persistent.
To infer the type of a particular variable, you may use the method getVariableType, like so:
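For example (the variable name here is illustrative):

```java
// yourVariable is an SDVariable belonging to an existing SameDiff instance
VariableType variableType = yourVariable.getVariableType();
```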
The current value of a variable, in the form of an INDArray, may be obtained using getArr or getArr(true) - the latter if you wish the program to throw an exception when the variable's value is not initialized.
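For example:

```java
INDArray value = yourVariable.getArr();        // may be unset for non-initialized variables
INDArray checked = yourVariable.getArr(true);  // throws an exception if the value is not initialized
```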
The data within each variable also has its data type, contained in the DataType enum. Currently, DataType contains three floating point types: FLOAT, DOUBLE and HALF; four integer types: LONG, INT, SHORT and UBYTE; and one boolean type, BOOL - all of these will be referred to as numeric types. In addition, there is a string type dubbed UTF8, and two helper data types, COMPRESSED and UNKNOWN. The 16-bit floating point format BFLOAT16 and the unsigned integer types (UINT16, UINT32 and UINT64) will be available in 1.0.0-beta5.
To infer the data type of your variable, use:
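A sketch, using the dataType() accessor of SDVariable:

```java
DataType dataType = yourVariable.dataType();
```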
You may need to trace your variable's data type, since at times it matters which types you use in an operation. For example, a convolution product like the one sketched below will require its SDVariable arguments input and weights to be of one of the floating point data types, and will throw an exception otherwise. Also, as we shall discuss just below, all SDVariables of type VARIABLE are supposed to be of floating point type.
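A sketch of such an operation (the conv2d call from the CNN namespace and the Conv2DConfig values are illustrative):

```java
// input and weights must be of a floating point data type here
SDVariable output = samediff.cnn().conv2d(input, weights,
        Conv2DConfig.builder().kH(3).kW(3).build());
```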
Before we go to the differences between variable types, let us first look at the properties they all share:
All variables are ultimately derived from an instance of SameDiff, serving as parts of its graph. In fact, each variable has a SameDiff as one of its fields.
Results (outputs) of all operations are of ARRAY type.
All SDVariable's involved in an operation must belong to the same SameDiff.
All variables may or may not be given names - in the latter case, a name is created automatically. Either way, the names need to be unique. We shall come back to naming below.
Let us now have a closer look at each type of variable, and what distinguishes them from each other.
Variables are the trainable parameters of your network. This predetermines their nature in SameDiff. As we briefly mentioned above, variables' values need to be both preserved for application and updated during training. Training means that we iteratively update the values by small fractions of their gradients, and this only makes sense if the variables are of a floating point type (see data types above).
Variables may be added to your SameDiff using different versions of the var function from your SameDiff instance. For example, the code below
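```java
// A sketch using the var(String, DataType, long...) overload
SDVariable weights = samediff.var("weights", DataType.FLOAT, 784, 10);
```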
adds a variable consisting of a 784x10 array of float numbers - the weights for a single-layer MNIST perceptron in this case - to a pre-existing SameDiff instance samediff.
However, this way the values within the variable will be set to zeros. You may also create a variable with values from a preset INDArray. Say,
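```java
// A sketch; Nd4j.randn stands in for a normal-distribution factory here
SDVariable weights = samediff.var("weights", Nd4j.randn(DataType.FLOAT, 784, 10).div(28));
```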
will create a variable filled with normally distributed randomly generated numbers, scaled by a factor of 1/28. You may put any other array creation method in its place, or use any preset array, of course. Also, you may use some popular initialization scheme, like so:
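A sketch, using ND4J's XavierInitScheme ('c' denotes the array ordering):

```java
SDVariable weights = samediff.var("weights",
        new XavierInitScheme('c', 784, 10), DataType.FLOAT, 784, 10);
```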
Now the weights will be randomly initialized using the Xavier scheme. There are other ways to create and fill variables: you may look them up in the 'known subclasses' section of the javadoc.
Constants hold values that are stored, but - unlike variables - remain unchanged during training. These, for instance, may be some hyperparameters you wish to have in your network and be able to access from the outside; or they may be the pretrained weights of a neural network that you wish to keep unchanged (see more on that below). Constants may be of any data type - so e.g. int and boolean are allowed alongside float and double.
In general, constants are added to SameDiff by means of the constant methods. A constant may be created from an INDArray, like that:
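For example (the array contents are illustrative):

```java
SDVariable constant = samediff.constant("constant", Nd4j.create(new float[] {1f, 2f, 3f}));
```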
A constant consisting of a single scalar value may be created using one of the scalar methods:
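For example:

```java
SDVariable answer = samediff.scalar("answer", 42);
```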
The most common placeholders you'll normally have in a SameDiff are inputs and, when applicable, labels. You may create placeholders of any data type, depending on the operations you use them in. To add a placeholder to a SameDiff, you may call one of the placeHolder methods, e.g. like that:
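For example:

```java
SDVariable input = samediff.placeHolder("input", DataType.FLOAT, -1, 784);
```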
as in the MNIST example. Here we specify the name, the data type and then the shape of the placeholder: we have 28x28 grayscale pictures rendered as 1d vectors (therefore 784), coming in batches of a length we do not know beforehand (therefore -1).
Variables of ARRAY type appear as outputs of operations within SameDiff. Accordingly, the data type of an array-type variable depends on the kind of operation it is produced by and the variable type(s) of its argument(s). Arrays are not persistent - they are one-time values that will be recalculated from scratch at the next step. However, unlike placeholders, gradients are computed for them, as those are needed to update the values of VARIABLE's.
There are as many ways to create array-type variables as there are operations, so you are best off looking through the operations documentation and examples.
You may change variable types as well. For now, there are three such options:
At times - for instance if you perform transfer learning - you may wish to turn a variable into a constant. This is done like so:
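For example:

```java
samediff.convertToConstant(someVariable);
```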
where someVariable is an instance of SDVariable of VARIABLE type. The variable someVariable will not be trained any more.
Conversely, constants - if they are of a floating point data type - may be converted to variables. So, for instance, if you wish your frozen weights to become trainable again:
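For example:

```java
samediff.convertToVariable(someConstant);
```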
Placeholders may be converted to constants as well - for instance, if you need to freeze one of the inputs. There are no restrictions on the data type; yet, since placeholder values are not persistent, their value should be set before you turn them into constants. This can be done as follows:
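A sketch, assuming someArray is a preset INDArray of a suitable shape:

```java
placeHolder.setArray(someArray);
samediff.convertToConstant(placeHolder);
```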
For now it is not possible to turn a constant back into a placeholder; we may consider adding this functionality if there is a need for it. In the meantime, if you wish to effectively freeze your placeholder but be able to use it again, consider supplying it with constant values rather than turning it into a constant.
Recall that every variable in an instance of SameDiff has its unique String name. Your SameDiff actually tracks your variables by their names, and allows you to retrieve them using the getVariable(String name) method.
Consider the following line:
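A sketch, assuming labels and predictions are SDVariables within the same SameDiff instance:

```java
SDVariable difference = labels.sub("difference", predictions);
```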
Here, in the function sub, we have actually implicitly introduced a variable (of type ARRAY) that holds the result of the subtraction. By adding a name to the operation's arguments, we have secured the possibility of retrieving the variable from elsewhere: say, if you later need to infer the difference between the labels and the predictions as a vector, you may just write:
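```java
// Retrieve the implicitly created ARRAY-type variable by its name
SDVariable difference = samediff.getVariable("difference");
```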
This becomes especially handy if your whole SameDiff
instance is initialized elsewhere, and you still need to get hold of some of its variables - say, multiple outputs.
You can get and set the name of an SDVariable using the methods getVarName and setVarName, respectively. When renaming, note that the variable's name must remain unique within its SameDiff.
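For example (the new name must be unique, as noted above):

```java
String name = someVariable.getVarName();
someVariable.setVarName("newUniqueName");
```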
You may retrieve any variable's current value as an INDArray using the method eval(). Note that for non-persistent variables, the value should first be set. For variables with gradients, the gradient's value may also be inferred using the method getGradient.
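For example:

```java
INDArray value = someVariable.eval();             // current value (set it first for non-persistent variables)
SDVariable gradient = someVariable.getGradient(); // the gradient variable, where one is defined
```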
Let us summarize the main properties of the variable types in one table:
| | Trainable | Gradients | Persistent | Workspaces | Datatypes | Instantiated from |
| --- | --- | --- | --- | --- | --- | --- |
| VARIABLE | Yes | Yes | Yes | Yes | Float only | Instance |
| CONSTANT | No | No | Yes | No | Any | Instance |
| PLACEHOLDER | No | No | No | No | Any | Instance |
| ARRAY | No | Yes | No | Yes | Any | Operations |

We haven't discussed what 'Workspaces' mean - if you do not know, do not worry, this is an internal technical term that basically describes how memory is managed internally.