Supported neural network layers.
Each layer in a neural network configuration represents a group of hidden units. When layers are stacked together, they form a deep neural network.
All layers available in Eclipse Deeplearning4j can be used in either a MultiLayerNetwork or a ComputationGraph. When configuring a neural network, you pass the layer configuration and the network will instantiate the layer for you.
If you are configuring complex networks such as InceptionV4, you will need to use the ComputationGraph API and join different branches together using vertices. Check the vertices for more information.
Activation layer is a simple layer that applies the specified activation function to the input activations
clone
param activation Activation function for the layer
activation
Activation function for the layer
activation
param activationFunction Activation function for the layer
activation
param activation Activation function for the layer
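As an illustration of what "applies the specified activation function to the input activations" means, here is a minimal plain-Java sketch with no DL4J dependency. The class and method names are hypothetical, and ReLU and tanh are only example choices of activation function.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical stand-in for an activation layer: it has no weights and no bias,
// it only maps every incoming activation through one function.
final class ActivationLayerSketch {
    static final DoubleUnaryOperator RELU = x -> Math.max(0.0, x);
    static final DoubleUnaryOperator TANH = Math::tanh;

    // Applies fn element-wise and returns a new array of the same length.
    static double[] forward(double[] input, DoubleUnaryOperator fn) {
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = fn.applyAsDouble(input[i]);
        }
        return out;
    }
}
```

Because the layer holds no parameters, its output always has the same shape as its input.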
Dense layer: a standard fully connected feed forward layer
hasBias
If true (default): include bias parameters in the model. False: no bias.
hasLayerNorm
If true (default = false): enable layer normalization on this layer
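The effect of hasBias on a fully connected layer can be sketched in plain Java as follows. This is an illustration only, not the DL4J implementation: the class name is hypothetical, the weight layout [nIn][nOut] and the ReLU activation are assumptions made for the example, and layer normalization is left out.

```java
// Hypothetical sketch of a fully connected forward pass for one example.
final class DenseLayerSketch {
    // weights[i][j] connects input i to output j (assumed layout [nIn][nOut]).
    // bias may be null, which mirrors a layer configured without bias parameters.
    static double[] forward(double[] x, double[][] weights, double[] bias) {
        int nOut = weights[0].length;
        double[] out = new double[nOut];
        for (int j = 0; j < nOut; j++) {
            double sum = (bias != null) ? bias[j] : 0.0;
            for (int i = 0; i < x.length; i++) {
                sum += x[i] * weights[i][j];
            }
            out[j] = Math.max(0.0, sum); // ReLU, chosen only as an example activation
        }
        return out;
    }
}
```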
Dropout layer. This layer simply applies dropout at training time, and passes activations through unmodified at test time.
build
Create a dropout layer with standard Dropout, with the specified probability of retaining the input activation. See Dropout for the full details.
param dropout Activation retain probability.
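Since the dropout value is a retain probability rather than a drop probability, a small sketch may help. The plain-Java code below is hypothetical and assumes "inverted" dropout, where retained activations are divided by the retain probability during training so that nothing needs rescaling at test time. Check the Dropout class documentation for the exact behaviour in DL4J.

```java
import java.util.Random;

// Hypothetical sketch of dropout parameterised by a retain probability.
final class DropoutSketch {
    static double[] forward(double[] input, double retainProb, boolean training, Random rng) {
        if (!training) {
            return input.clone(); // test time: activations pass through unmodified
        }
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            // Keep each activation with probability retainProb; rescale survivors
            // by 1 / retainProb (inverted dropout, an assumption of this sketch).
            out[i] = rng.nextDouble() < retainProb ? input[i] / retainProb : 0.0;
        }
        return out;
    }
}
```

With a retain probability of 0.8, roughly 80% of the activations survive each training pass.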
Embedding layer: feed-forward layer that expects a single integer per example as input (a class number in the range 0 to numClasses - 1) rather than the equivalent one-hot representation. Mathematically, EmbeddingLayer is equivalent to using a DenseLayer with a one-hot representation for the input; however, it can be much more efficient with a large number of classes (as a dense layer + one-hot input does a matrix multiply with all but one value being zero).
Note: can only be used as the first layer for a network.
Note 2: For a given example index i, the output is activationFunction(weights.getRow(i) + bias), hence the weight rows can be considered a vector/embedding for each example.
Note also that the embedding layer has an activation function (set to IDENTITY to disable) and an optional bias (which is disabled by default).
hasBias
If true: include bias parameters in the layer. False (default): no bias.
weightInit
Initialize the embedding layer using the specified EmbeddingInitializer - such as a Word2Vec instance
param embeddingInitializer Source of the embedding layer weights
weightInit
Initialize the embedding layer using values from the specified array. Note that the array should have shape [vocabSize, vectorSize]. After copying values from the array to initialize the network parameters, the input array will be discarded (so that, if necessary, it can be garbage collected)
param vectors Vectors to initialize the embedding layer with
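The equivalence between an embedding lookup and a dense layer fed with one-hot input can be checked with a few lines of plain Java. This is a sketch with hypothetical names, assuming an identity activation and no bias. Weights have shape [vocabSize, vectorSize], as described for weightInit.

```java
// Hypothetical sketch: embedding lookup versus one-hot input times the same weights.
final class EmbeddingLookupSketch {
    // Embedding path: read row `index` of the weight matrix directly.
    static double[] lookup(double[][] weights, int index) {
        return weights[index].clone();
    }

    // Dense path: build a one-hot vector and multiply it by the weight matrix.
    // Every term except one is a multiplication by zero, which is the wasted work.
    static double[] oneHotTimesWeights(double[][] weights, int index) {
        int numClasses = weights.length;
        int vectorSize = weights[0].length;
        double[] oneHot = new double[numClasses];
        oneHot[index] = 1.0;
        double[] out = new double[vectorSize];
        for (int c = 0; c < numClasses; c++) {
            for (int j = 0; j < vectorSize; j++) {
                out[j] += oneHot[c] * weights[c][j];
            }
        }
        return out;
    }
}
```

Both methods return the same vector, but the lookup touches vectorSize values while the dense path touches numClasses * vectorSize.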
Embedding layer for sequences: feed-forward layer that expects a fixed-length sequence (inputLength) of integers/indices per example as input, each in the range 0 to numClasses - 1. This input thus has shape [numExamples, inputLength] or shape [numExamples, 1, inputLength]. The output of this layer is 3D (sequence/time series), namely of shape [numExamples, nOut, inputLength].
Note: can only be used as the first layer for a network.
Note 2: For a given example index i, the output is activationFunction(weights.getRow(i) + bias), hence the weight rows can be considered a vector/embedding of each index.
Note also that the embedding layer has an activation function (set to IDENTITY to disable) and an optional bias (which is disabled by default).
hasBias
If true: include bias parameters in the layer. False (default): no bias.
inputLength
Set input sequence length for this embedding layer.
param inputLength input sequence length
return Builder
inferInputLength
Set input sequence inference mode for embedding layer.
param inferInputLength whether to infer input length
return Builder
weightInit
Initialize the embedding layer using the specified EmbeddingInitializer - such as a Word2Vec instance
param embeddingInitializer Source of the embedding layer weights
weightInit
Initialize the embedding layer using values from the specified array. Note that the array should have shape [vocabSize, vectorSize]. After copying values from the array to initialize the network parameters, the input array will be discarded (so that, if necessary, it can be garbage collected)
param vectors Vectors to initialize the embedding layer with
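The shape transformation described for the sequence embedding layer, from [numExamples, inputLength] indices to a [numExamples, nOut, inputLength] output, can be sketched in plain Java. The names are hypothetical and the sketch assumes an identity activation and no bias.

```java
// Hypothetical sketch of the sequence embedding shape mapping.
final class EmbeddingSequenceSketch {
    // indices: [numExamples][inputLength], each value in 0 .. numClasses - 1
    // weights: [numClasses][nOut]
    // returns: [numExamples][nOut][inputLength]
    static double[][][] forward(int[][] indices, double[][] weights) {
        int numExamples = indices.length;
        int inputLength = indices[0].length;
        int nOut = weights[0].length;
        double[][][] out = new double[numExamples][nOut][inputLength];
        for (int e = 0; e < numExamples; e++) {
            for (int t = 0; t < inputLength; t++) {
                double[] row = weights[indices[e][t]]; // embedding for this index
                for (int j = 0; j < nOut; j++) {
                    out[e][j][t] = row[j]; // sequence position is the last dimension
                }
            }
        }
        return out;
    }
}
```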
Global pooling layer - used to do pooling over time for RNNs, and 2d pooling for CNNs. Supports the following pooling types: SUM, AVG, MAX, PNORM.
Global pooling layer can also handle mask arrays when dealing with variable length inputs. Mask arrays are assumed to be 2d, and are fed forward through the network during training or post-training forward pass:
Time series: mask arrays are shape [miniBatchSize, maxTimeSeriesLength] and contain values 0 or 1 only
CNNs: masks have shape [miniBatchSize, height] or [miniBatchSize, width]. Important: the current implementation assumes that for CNNs + variable length (masking), the input shape is [miniBatchSize, channels, height, 1] or [miniBatchSize, channels, 1, width] respectively. This is the case with global pooling in architectures like CNN for sentence classification.
Behaviour with default settings:
3d (time series) input with shape [miniBatchSize, vectorSize, timeSeriesLength] -> 2d output [miniBatchSize, vectorSize]
4d (CNN) input with shape [miniBatchSize, channels, height, width] -> 2d output [miniBatchSize, channels]
5d (CNN3D) input with shape [miniBatchSize, channels, depth, height, width] -> 2d output [miniBatchSize, channels]
Alternatively, by setting collapseDimensions = false in the configuration, it is possible to retain the reduced dimensions as 1s: this gives
[miniBatchSize, vectorSize, 1] for RNN output,
[miniBatchSize, channels, 1, 1] for CNN output, and
[miniBatchSize, channels, 1, 1, 1] for CNN3D output.
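To make the time-series case concrete, the following plain-Java sketch performs average pooling over time with a 0/1 mask and collapsed dimensions, so a [miniBatchSize, vectorSize, timeSeriesLength] input becomes a [miniBatchSize, vectorSize] output. The names are hypothetical, and the sketch assumes every example has at least one unmasked time step.

```java
// Hypothetical sketch of masked global average pooling over the time dimension.
final class GlobalPoolingSketch {
    // input: [miniBatchSize][vectorSize][timeSeriesLength]
    // mask:  [miniBatchSize][timeSeriesLength], values 0 or 1 only
    // returns [miniBatchSize][vectorSize]
    static double[][] averageOverTime(double[][][] input, double[][] mask) {
        int miniBatchSize = input.length;
        int vectorSize = input[0].length;
        int timeSeriesLength = input[0][0].length;
        double[][] out = new double[miniBatchSize][vectorSize];
        for (int b = 0; b < miniBatchSize; b++) {
            double steps = 0.0; // number of unmasked steps for this example
            for (int t = 0; t < timeSeriesLength; t++) {
                steps += mask[b][t];
            }
            for (int v = 0; v < vectorSize; v++) {
                double sum = 0.0;
                for (int t = 0; t < timeSeriesLength; t++) {
                    sum += input[b][v][t] * mask[b][t]; // masked steps contribute 0
                }
                out[b][v] = sum / steps;
            }
        }
        return out;
    }
}
```

Dividing by the number of unmasked steps, rather than by timeSeriesLength, keeps padded steps from diluting the average.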
poolingDimensions
Pooling type for global pooling
poolingType
param poolingType Pooling type for global pooling
collapseDimensions
Whether to collapse dimensions when pooling or not. Usually you do want to do this. Default: true. If true:
3d (time series) input with shape [miniBatchSize, vectorSize, timeSeriesLength] -> 2d output [miniBatchSize, vectorSize]
4d (CNN) input with shape [miniBatchSize, channels, height, width] -> 2d output [miniBatchSize, channels]
5d (CNN3D) input with shape [miniBatchSize, channels, depth, height, width] -> 2d output [miniBatchSize, channels]
If false:
3d (time series) input with shape [miniBatchSize, vectorSize, timeSeriesLength] -> 3d output [miniBatchSize, vectorSize, 1]
4d (CNN) input with shape [miniBatchSize, channels, height, width] -> 4d output [miniBatchSize, channels, 1, 1]
5d (CNN3D) input with shape [miniBatchSize, channels, depth, height, width] -> 5d output [miniBatchSize, channels, 1, 1, 1]
param collapseDimensions Whether to collapse the dimensions or not
pnorm
P-norm constant. Only used if using PoolingType.PNORM for the pooling type
param pnorm P-norm constant
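For reference, p-norm pooling reduces a set of values to the p-th root of the sum of their absolute values raised to the power p. The plain-Java sketch below uses hypothetical names and follows that standard definition; it is not taken from the DL4J source.

```java
// Hypothetical sketch of p-norm pooling over one pooled dimension.
final class PNormPoolingSketch {
    // Returns (sum over i of |values[i]|^pnorm)^(1 / pnorm).
    static double pool(double[] values, int pnorm) {
        double sum = 0.0;
        for (double v : values) {
            sum += Math.pow(Math.abs(v), pnorm);
        }
        return Math.pow(sum, 1.0 / pnorm);
    }
}
```

With pnorm = 1 this is the sum of absolute values, and as pnorm grows the result approaches the largest absolute value.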
Local response normalization layer. See section 3.3 of http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
k
LRN scaling constant k. Default: 2
n
Number of adjacent kernel maps to use when doing LRN. default: 5
param n Number of adjacent kernel maps
alpha
LRN scaling constant alpha. Default: 1e-4
param alpha Scaling constant
beta
Scaling constant beta. Default: 0.75
param beta Scaling constant
cudnnAllowFallback
When using CuDNN and an error is encountered, should fallback to the non-CuDNN implementation be allowed? If set to false, an exception in CuDNN will be propagated back to the user. If true, the built-in (non-CuDNN) implementation for this layer will be used instead.
param allowFallback Whether fallback to non-CuDNN implementation should be used
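The roles of k, n, alpha and beta are easiest to see in the normalization formula from section 3.3 of the paper cited above: each activation is divided by (k + alpha * sum of squares over n adjacent channels) raised to the power beta. The plain-Java sketch below applies that formula across the channels at one spatial position. The names are hypothetical, and the DL4J implementation may differ in detail.

```java
// Hypothetical sketch of local response normalization across channels.
final class LrnSketch {
    // a: activations of all channels at one spatial position.
    static double[] normalize(double[] a, double k, int n, double alpha, double beta) {
        int channels = a.length;
        double[] b = new double[channels];
        for (int i = 0; i < channels; i++) {
            // Window of up to n adjacent channels centred on i, clipped at the edges.
            int lo = Math.max(0, i - n / 2);
            int hi = Math.min(channels - 1, i + n / 2);
            double sumSq = 0.0;
            for (int j = lo; j <= hi; j++) {
                sumSq += a[j] * a[j];
            }
            b[i] = a[i] / Math.pow(k + alpha * sumSq, beta);
        }
        return b;
    }
}
```

Calling normalize(a, 2.0, 5, 1e-4, 0.75) uses the defaults listed above.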
SameDiff version of a 1D locally connected layer.
nIn
Number of inputs to the layer (input size)
nOut
param nOut Number of outputs (output size)
activation
param activation Activation function for the layer
kernelSize
param k Kernel size for the layer
stride
param s Stride for the layer
padding
param p Padding for the layer. Not used if ConvolutionMode.Same is set
convolutionMode
param cm Convolution mode for the layer. See ConvolutionMode for details
dilation
param d Dilation for the layer
hasBias
param hasBias If true (default is false) the layer will have a bias
setInputSize
Set input filter size for this locally connected 1D layer
param inputSize height of the input filters
return Builder
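A locally connected layer resembles a convolution whose kernel weights are not shared between output positions, which is why the input size has to be known up front: the number of weight sets depends on the number of output positions. The plain-Java sketch below is a simplified single-channel illustration with hypothetical names. It assumes no dilation, and its output-length formula assumes explicit padding rather than ConvolutionMode.Same.

```java
// Hypothetical single-channel sketch of a 1D locally connected layer.
final class LocallyConnected1DSketch {
    // Number of output positions for a given input size, kernel, stride and padding.
    static int outputLength(int inputSize, int kernel, int stride, int padding) {
        return (inputSize - kernel + 2 * padding) / stride + 1;
    }

    // weights[pos] is the kernel used only at output position pos (no sharing).
    // Assumes the input is already padded and weights.length matches outputLength.
    static double[] forward(double[] input, double[][] weights, int stride) {
        int kernel = weights[0].length;
        double[] out = new double[weights.length];
        for (int pos = 0; pos < weights.length; pos++) {
            int start = pos * stride;
            for (int k = 0; k < kernel; k++) {
                out[pos] += input[start + k] * weights[pos][k];
            }
        }
        return out;
    }
}
```

An ordinary 1D convolution would use one kernel for every position; here the parameter count grows with outputLength.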
SameDiff version of a 2D locally connected layer.
setKernel
Kernel size for the layer. Must be 2 values (height/width)
setStride
param stride Stride for the layer. Must be 2 values (height/width)
setPadding
param padding Padding for the layer. Not used if ConvolutionMode.Same is set. Must be 2 values (height/width)
setDilation
param dilation Dilation for the layer. Must be 2 values (height/width)
nIn
param nIn Number of inputs to the layer (input size)
nOut
param nOut Number of outputs (output size)
activation
param activation Activation function for the layer
kernelSize
param k Kernel size for the layer. Must be 2 values (height/width)
stride