The GRU operation. Gated Recurrent Unit - Cho et al. 2014.
x (NUMERIC) - input [time, bS, nIn]
hLast (NUMERIC) - initial cell output (at time step = 0) [bS, nOut]
Wx (NUMERIC) - input-to-hidden weights, [nIn, 3*nOut]
Wh (NUMERIC) - hidden-to-hidden weights, [nOut, 3*nOut]
biases (NUMERIC) - biases, [3*nOut]
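The shapes above can be made concrete with a small sketch. The following is a minimal pure-Python GRU forward pass over the [time, bS, nIn] input; the gate order r|u|c inside the 3*nOut weight blocks and the "reset gate applied after the recurrent matmul" variant are assumptions, not guaranteed to match this op's internals.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matmul(a, b):
    # a: [m][k], b: [k][n] -> [m][n], plain nested lists
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gru(x, h0, Wx, Wh, b, nOut):
    # x: [time][bS][nIn]; h0: [bS][nOut]
    # Wx: [nIn][3*nOut]; Wh: [nOut][3*nOut]; b: [3*nOut]
    # Gate block order r|u|c is an assumption for illustration.
    h, out = h0, []
    for xt in x:
        zx, zh = matmul(xt, Wx), matmul(h, Wh)
        new_h = []
        for bi in range(len(h)):
            row = []
            for j in range(nOut):
                r = sigmoid(zx[bi][j] + zh[bi][j] + b[j])
                u = sigmoid(zx[bi][nOut + j] + zh[bi][nOut + j] + b[nOut + j])
                # reset gate applied after the recurrent matmul (one common variant)
                c = math.tanh(zx[bi][2*nOut + j] + r * zh[bi][2*nOut + j] + b[2*nOut + j])
                row.append(u * h[bi][j] + (1.0 - u) * c)
            new_h.append(row)
        h = new_h
        out.append(h)
    return out  # [time][bS][nOut]
```

With all-zero weights every gate evaluates to 0.5, so the hidden state halves at each step, which is an easy sanity check on the shapes and wiring.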
The GRU cell. Does a single time step operation.
x (NUMERIC) - Input, with shape [batchSize, inSize]
hLast (NUMERIC) - Output of the previous cell/time step, with shape [batchSize, numUnits]
GRUWeights - see GRUWeights
The LSTM cell. Does a single time step operation.
x (NUMERIC) - Input, with shape [batchSize, inSize]
cLast (NUMERIC) - Previous cell state, with shape [batchSize, numUnits]
yLast (NUMERIC) - Previous cell output, with shape [batchSize, numUnits]
LSTMWeights - see LSTMWeights
LSTMConfiguration - see LSTMConfiguration
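To make the cell's inputs and outputs concrete, here is a minimal pure-Python single-step LSTM sketch. The stacked weight layout [nIn+nOut, 4*nOut] and the gate order i|f|c|o are assumptions for illustration only; `forget_bias` corresponds to the forgetBias configuration value.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_cell(x, c_last, y_last, W, b, n_out, forget_bias=0.0):
    # One time step.
    # x: [bS][nIn]; c_last, y_last: [bS][nOut]
    # W: [nIn+nOut][4*nOut] (input and recurrent weights stacked)
    # b: [4*nOut]; gate block order i|f|c|o is an assumption.
    c_out, y_out = [], []
    for bi in range(len(x)):
        xh = x[bi] + y_last[bi]  # concatenate [x, yLast]
        z = [sum(xh[k] * W[k][j] for k in range(len(xh))) + b[j]
             for j in range(4 * n_out)]
        c_row, y_row = [], []
        for j in range(n_out):
            i = sigmoid(z[j])
            f = sigmoid(z[n_out + j] + forget_bias)
            g = math.tanh(z[2 * n_out + j])
            o = sigmoid(z[3 * n_out + j])
            c = f * c_last[bi][j] + i * g
            c_row.append(c)
            y_row.append(o * math.tanh(c))
        c_out.append(c_row)
        y_out.append(y_row)
    return c_out, y_out  # new cell state, new cell output
```

With zero weights the gates evaluate to 0.5, so the cell state halves and the output is 0.5*tanh(c), a quick shape and wiring check.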
Long Short-Term Memory layer - Hochreiter & Schmidhuber 1997.
Supports the following data formats:
for unidirectional:
TNS: shapes [timeLength, numExamples, inOutSize]
NST: shapes [numExamples, inOutSize, timeLength]
NTS: shapes [numExamples, timeLength, inOutSize]
for bidirectional:
T2NS: shapes [timeLength, 2, numExamples, inOutSize] (for ONNX)
Supports the following direction modes:
FWD: forward
BWD: backward
BIDIR_SUM: bidirectional sum
BIDIR_CONCAT: bidirectional concat
BIDIR_EXTRA_DIM: bidirectional extra output dim (in conjunction with dataFormat = T2NS)
You may use different gate configurations:
specify the alpha/beta values and the activation for each of gate/cell/out, chosen from the activations enum
("RELU","SIGMOID","AFFINE","LEAKY_RELU","THRESHHOLD_RELU","SCALED_TAHN","HARD_SIGMOID","ELU","SOFTSIGN","SOFTPLUS")
Also this layer supports MKLDNN (DNNL) and cuDNN acceleration
x (NUMERIC) - Input, with shape dependent on the data format (in config).
cLast (NUMERIC) - Previous/initial cell state, with shape [batchSize, numUnits]
yLast (NUMERIC) - Previous/initial cell output, with shape [batchSize, numUnits]
maxTSLength (NUMERIC) - Maximum time step (sequence) length for each example, with shape [batchSize]
LSTMLayerWeights - see LSTMLayerWeights
LSTMLayerConfig - see LSTMLayerConfig
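The per-stage alpha/beta values only matter for parameterized activations. A hedged sketch of how two such activations could use them (the exact parameterization each enum value applies is an assumption):

```python
def affine(x, alpha, beta):
    # AFFINE: a linear activation, alpha * x + beta
    return alpha * x + beta

def leaky_relu(x, alpha, beta=0.0):
    # LEAKY_RELU: x for positive inputs, alpha * x otherwise (beta unused)
    return x if x > 0.0 else alpha * x
```

Activations without parameters (e.g. "SIGMOID", "RELU") would simply ignore the corresponding alpha/beta settings.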
The LSTM block.
maxTSLength (NUMERIC) - Maximum time step (sequence) length
x (NUMERIC) - Input, with shape dependent on the data format (in config).
cLast (NUMERIC) - Previous/initial cell state, with shape [batchSize, numUnits]
yLast (NUMERIC) - Previous/initial cell output, with shape [batchSize, numUnits]
LSTMWeights - see LSTMWeights
LSTMConfiguration - see LSTMConfiguration
The SRU layer.
x (NUMERIC) - Input, with shape [batchSize, inSize]
initialC (NUMERIC) - Initial cell state, with shape [batchSize, inSize]
mask (NUMERIC) - An optional dropout mask, with shape [batchSize, inSize]
SRUWeights - see SRUWeights
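A minimal pure-Python sketch of the SRU recurrence, following Lei et al. 2018; the weight names, the highway-style output h = r*tanh(c) + (1-r)*x, and the exact variant this op implements are assumptions. Note that for SRU the cell state has the same size as the input (nIn == nOut).

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def sru(x, c0, W, Wf, bf, Wr, br):
    # x: [time][bS][n]; c0: [bS][n]; W, Wf, Wr: [n][n]; bf, br: [n]
    # Sketch of the Lei et al. 2018 recurrence (variant is an assumption).
    def mm(a, m):  # [bS][n] @ [n][n]
        return [[sum(row[k] * m[k][j] for k in range(len(m)))
                 for j in range(len(m[0]))] for row in a]
    c, hs = c0, []
    for xt in x:
        zx, zf, zr = mm(xt, W), mm(xt, Wf), mm(xt, Wr)
        # forget gate mixes the previous cell state with the transformed input
        c = [[sigmoid(zf[b][j] + bf[j]) * c[b][j]
              + (1.0 - sigmoid(zf[b][j] + bf[j])) * zx[b][j]
              for j in range(len(bf))] for b in range(len(xt))]
        # reset gate mixes the squashed cell state with a highway copy of x
        hs.append([[sigmoid(zr[b][j] + br[j]) * math.tanh(c[b][j])
                    + (1.0 - sigmoid(zr[b][j] + br[j])) * xt[b][j]
                    for j in range(len(br))] for b in range(len(xt))])
    return c, hs  # final cell state, per-step outputs
```

Because the only matrix products involve x, the per-feature recurrence is elementwise; this is what makes SRU cheap to parallelize across the feature dimension.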
The SRU cell. Does a single time step operation.
x (NUMERIC) - Input, with shape [batchSize, inSize]
cLast (NUMERIC) - Previous cell state, with shape [batchSize, inSize]
SRUWeights - see SRUWeights
RnnDataFormat (ENUM) - The data format of the input. Input shape depends on data format (in config):
TNS -> [timeSteps, batchSize, inSize]
NST -> [batchSize, inSize, timeSteps]
NTS -> [batchSize, timeSteps, inSize]
peepHole (BOOL) - Whether to provide peephole connections
forgetBias (NUMERIC) - The bias added to forget gates in order to reduce the scale of forgetting at the beginning of training.
clippingCellValue (NUMERIC) - The value to clip the cell state to.
Used in these ops: lstmCell lstmblock
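The three unidirectional layouts are plain axis permutations of one another. A small pure-Python sketch of the conversions (function names are illustrative, not part of the API):

```python
def tns_to_nts(x):
    # [timeSteps][batchSize][inSize] -> [batchSize][timeSteps][inSize]
    return [[x[t][n] for t in range(len(x))] for n in range(len(x[0]))]

def nts_to_nst(x):
    # [batchSize][timeSteps][inSize] -> [batchSize][inSize][timeSteps]
    return [[[x[n][t][s] for t in range(len(x[0]))]
             for s in range(len(x[0][0]))] for n in range(len(x))]
```

For example, a TNS input with 2 time steps, 1 example and 3 features becomes NTS by swapping the first two axes, then NST by swapping the last two.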
LSTMDataFormat (ENUM) - The data format of the input:
for unidirectional:
TNS: 0 = [timeLength, numExamples, inOutSize] - sometimes referred to as "time major"
NST: 1 = [numExamples, inOutSize, timeLength]
NTS: 2 = [numExamples, timeLength, inOutSize] - TF "time_major=false" layout
for bidirectional:
T2NS: 3 = [timeLength, 2, numExamples, inOutSize] (for ONNX)
LSTMDirectionMode (ENUM) - The direction mode:
FWD: 0 = fwd
BWD: 1 = bwd
BIDIR_SUM: 2 = bidirectional sum
BIDIR_CONCAT: 3 = bidirectional concat
BIDIR_EXTRA_DIM: 4 = bidirectional extra output dim (in conjunction with format dataFormat = 3)
gateAct (ENUM) - Activations
cellAct (ENUM) - Activations
outAct (ENUM) - Activations
retFullSequence (BOOL) - indicates whether to return whole time sequence h {h_0, h_1, ... , h_sL-1} - default = true
retLastH (BOOL) - indicates whether to return output at last time step only,
in this case shape would be [bS, nOut] (exact shape depends on dataFormat argument) - default = false
retLastC (BOOL) - indicates whether to return cell state at last time step only,
in this case shape would be [bS, nOut] (exact shape depends on dataFormat argument) - default = false
cellClip (NUMERIC) - Cell clipping value, if it = 0 then do not apply clipping - default = 0.0
gateAlpha (NUMERIC) - Alpha value for the gate activation (used by parameterized activations such as AFFINE or LEAKY_RELU) - default = 0.0
gateBeta (NUMERIC) - Beta value for the gate activation - default = 0.0
cellAlpha (NUMERIC) - Alpha value for the cell activation - default = 0.0
cellBeta (NUMERIC) - Beta value for the cell activation - default = 0.0
outAlpha (NUMERIC) - Alpha value for the output activation - default = 0.0
outBeta (NUMERIC) - Beta value for the output activation - default = 0.0
Used in these ops: lstmLayer
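How the bidirectional modes merge the forward and backward passes can be sketched as follows; this is a shape-level illustration under the assumption that both passes produce per-step outputs of shape [time, bS, nOut], not the op's actual implementation:

```python
def combine(fwd, bwd, mode):
    # fwd, bwd: per-step outputs [time][bS][nOut]
    if mode == "BIDIR_SUM":
        # same shape as a single pass: elementwise sum
        return [[[f + b for f, b in zip(fr, br)]
                 for fr, br in zip(ft, bt)] for ft, bt in zip(fwd, bwd)]
    if mode == "BIDIR_CONCAT":
        # feature dimension doubles: [time, bS, 2*nOut]
        return [[fr + br for fr, br in zip(ft, bt)] for ft, bt in zip(fwd, bwd)]
    if mode == "BIDIR_EXTRA_DIM":
        # extra direction dimension: [time, 2, bS, nOut] (the T2NS layout)
        return [[ft, bt] for ft, bt in zip(fwd, bwd)]
    raise ValueError(mode)
```

This is why BIDIR_EXTRA_DIM is tied to the T2NS data format: the direction axis becomes the second dimension of the output.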
ruWeight- reset and update gate weights (NUMERIC type)
cWeight- cell (candidate) weights (NUMERIC type)
ruBias- reset and update gate biases (NUMERIC type)
cBias- cell (candidate) biases (NUMERIC type)
Used in these ops: gruCell
weights- SRU weights (NUMERIC type)
bias- SRU biases (NUMERIC type)
Used in these ops: sru sruCell
ruWeight- input and recurrent weights (NUMERIC type)
inputPeepholeWeights- input gate peephole weights (NUMERIC type)
forgetPeepholeWeights- forget gate peephole weights (NUMERIC type)
outputPeepholeWeights- output gate peephole weights (NUMERIC type)
bias- biases (NUMERIC type)
Used in these ops: lstmCell lstmblock
inputWeights- input weights Wx:
1) shape [nIn, 4*nOut] for FWD, BWD
2) shape [2, nIn, 4*nOut] for BIDIR_SUM, BIDIR_CONCAT and BIDIR_EXTRA_DIM (NUMERIC type)
recurrentWeights- recurrent weights Wr:
1) shape [nOut, 4*nOut] for FWD, BWD
2) shape [2, nOut, 4*nOut] for BIDIR_SUM, BIDIR_CONCAT and BIDIR_EXTRA_DIM (NUMERIC type)
biases- biases:
1) shape [4*nOut] for FWD, BWD
2) shape [2, 4*nOut] for BIDIR_SUM, BIDIR_CONCAT and BIDIR_EXTRA_DIM (NUMERIC type)
peepholeWeights- peephole weights Wp:
1) shape [3*nOut] when directionMode < 2
2) shape [2, 3*nOut] when directionMode >= 2 (NUMERIC type)
Used in these ops: lstmLayer
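The shape rules above reduce to "prepend a direction dimension of 2 when the mode is bidirectional". A small helper sketch that derives the expected shapes (a convenience for checking weights before building the op, not part of the API):

```python
def lstm_layer_weight_shapes(n_in, n_out, direction_mode):
    # Expected LSTMLayerWeights shapes, derived from the listing above.
    # direction_mode: 0=FWD, 1=BWD, 2=BIDIR_SUM, 3=BIDIR_CONCAT, 4=BIDIR_EXTRA_DIM
    d = [2] if direction_mode >= 2 else []
    return {
        "inputWeights":     d + [n_in, 4 * n_out],
        "recurrentWeights": d + [n_out, 4 * n_out],
        "biases":           d + [4 * n_out],
        "peepholeWeights":  d + [3 * n_out],
    }
```

For example, nIn=3 and nOut=5 in FWD mode gives inputWeights of shape [3, 20], while any bidirectional mode gives [2, 3, 20].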