# What are activations?

At a simple level, activation functions help decide whether a neuron should be activated, i.e. whether the information the neuron is receiving is relevant for the given input or should be ignored. The activation function is a non-linear transformation applied to the input signal, and the transformed output is sent to the next neuron.
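
For intuition, here is a minimal plain-Java sketch of that idea; the weights, input values, and bias below are made up purely for illustration:

```java
public class ActivationSketch {
    public static void main(String[] args) {
        // Hypothetical weights, inputs and bias for a single neuron
        double weightedSum = 0.7 * 1.2 + 0.3 * (-0.5) + 0.1;
        // Non-linear transformation of the input signal (tanh chosen as an example)
        double activated = Math.tanh(weightedSum);
        // The transformed output is what gets sent on to the next neuron
        System.out.println(activated);
    }
}
```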

# Usage

The recommended method to use activations is to add an activation layer in your neural network, and configure your desired activation:

```java
GraphBuilder graphBuilder = new NeuralNetConfiguration.Builder()
    // add hyperparameters and other layers
    .addLayer("softmax", new ActivationLayer(Activation.SOFTMAX), "previous_input")
    // add more layers and output
    .build();
```
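
Alternatively, most layer builders accept an activation directly. The sketch below shows a per-layer configuration in the same style; it assumes the standard `DenseLayer`, `OutputLayer`, `Activation` and `LossFunctions` classes from a recent Deeplearning4j release, with imports omitted as in the snippet above:

```java
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .list()
    .layer(new DenseLayer.Builder()
        .nIn(784).nOut(128)
        .activation(Activation.RELU)       // activation configured directly on the layer
        .build())
    .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
        .nIn(128).nOut(10)
        .activation(Activation.SOFTMAX)    // softmax on the output layer
        .build())
    .build();
```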

# Available activations

# ActivationRectifiedTanh

[source]

Rectified tanh

Essentially max(0, tanh(x))

Underlying implementation is in native code

# ActivationELU

[source]

f(x) = alpha (exp(x) - 1) for x < 0; f(x) = x for x >= 0

alpha defaults to 1, if not specified
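
For illustration only (not the library implementation), the formula above in plain Java:

```java
public class EluExample {
    // f(x) = alpha (exp(x) - 1) for x < 0; f(x) = x for x >= 0
    static double elu(double x, double alpha) {
        return x >= 0 ? x : alpha * (Math.exp(x) - 1.0);
    }

    public static void main(String[] args) {
        System.out.println(elu(2.0, 1.0));   // 2.0: identity for x >= 0
        System.out.println(elu(-2.0, 1.0));  // ~ -0.8647: saturates towards -alpha
    }
}
```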

# ActivationReLU

[source]

f(x) = max(0, x)

# ActivationRationalTanh

[source]

Rational tanh approximation from https://arxiv.org/pdf/1508.01292v3

f(x) = 1.7159 tanh(2x/3), where tanh is approximated as tanh(y) ≈ sgn(y) (1 - 1/(1 + |y| + y^2 + 1.41645 y^4))

Underlying implementation is in native code
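
A plain-Java sketch of the approximation, purely illustrative given that the real implementation is native:

```java
public class RationalTanhExample {
    // tanh(y) ~ sgn(y) (1 - 1/(1 + |y| + y^2 + 1.41645 y^4))
    static double tanhApprox(double y) {
        double a = Math.abs(y);
        return Math.signum(y) * (1.0 - 1.0 / (1.0 + a + y * y + 1.41645 * Math.pow(y, 4)));
    }

    // f(x) = 1.7159 tanh(2x/3)
    static double rationalTanh(double x) {
        return 1.7159 * tanhApprox(2.0 * x / 3.0);
    }

    public static void main(String[] args) {
        System.out.println(rationalTanh(1.0));              // approximated value
        System.out.println(1.7159 * Math.tanh(2.0 / 3.0));  // exact value for comparison
    }
}
```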

# ActivationThresholdedReLU

[source]

Thresholded ReLU

f(x) = x for x > theta, f(x) = 0 otherwise. theta defaults to 1.0.

# ActivationReLU6

[source]

f(x) = min(max(input, cutoff), 6)

# ActivationHardTanH

[source]

```
         ⎧  1, if x >  1
f(x) =   ⎨ -1, if x < -1
         ⎩  x, otherwise
```

# ActivationSigmoid

[source]

f(x) = 1 / (1 + exp(-x))

# ActivationGELU

[source]

GELU activation function - Gaussian Error Linear Units

# ActivationPReLU

[source]

Parametrized Rectified Linear Unit (PReLU)

f(x) = alpha x for x < 0, f(x) = x for x >= 0

alpha has the same shape as x and is a learned parameter.

# ActivationIdentity

[source]

f(x) = x

# ActivationSoftSign

[source]

f_i(x) = x_i / (1 + |x_i|)

# ActivationHardSigmoid

[source]

f(x) = min(1, max(0, 0.2x + 0.5))

# ActivationSoftmax

[source]

f_i(x) = exp(x_i - shift) / sum_j exp(x_j - shift) where shift = max_i(x_i)
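
The shift by max_i(x_i) keeps exp from overflowing on large inputs without changing the result. A plain-Java sketch of the formula, for illustration only:

```java
public class SoftmaxExample {
    // f_i(x) = exp(x_i - shift) / sum_j exp(x_j - shift), shift = max_i(x_i)
    static double[] softmax(double[] x) {
        double shift = Double.NEGATIVE_INFINITY;
        for (double v : x) shift = Math.max(shift, v);    // shift = max_i(x_i)

        double[] out = new double[x.length];
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            out[i] = Math.exp(x[i] - shift);              // exponent is always <= 0, so no overflow
            sum += out[i];
        }
        for (int i = 0; i < x.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        // A naive exp(1002.0) would overflow to infinity; the shifted form does not
        double[] p = softmax(new double[]{1000.0, 1001.0, 1002.0});
        System.out.println(java.util.Arrays.toString(p));
    }
}
```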

# ActivationCube

[source]

f(x) = x^3

# ActivationRReLU

[source]

f(x) = max(0,x) + alpha min(0, x)

alpha is drawn from uniform(l, u) during training and is set to (l + u)/2 at test time. l and u default to 1/8 and 1/3 respectively.
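
An illustrative plain-Java sketch of that train/test behaviour (in practice alpha is sampled per element on each forward pass; this only shows the scalar rule):

```java
import java.util.Random;

public class RReluExample {
    static final double L = 1.0 / 8.0;   // default lower bound l
    static final double U = 1.0 / 3.0;   // default upper bound u
    static final Random RNG = new Random();

    // f(x) = max(0, x) + alpha min(0, x)
    static double rrelu(double x, boolean training) {
        double alpha = training
                ? L + RNG.nextDouble() * (U - L)   // alpha ~ uniform(l, u) during training
                : (L + U) / 2.0;                   // fixed (l + u)/2 at test time
        return Math.max(0, x) + alpha * Math.min(0, x);
    }

    public static void main(String[] args) {
        System.out.println(rrelu(-1.0, true));   // varies from run to run
        System.out.println(rrelu(-1.0, false));  // always ~ -0.229
    }
}
```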

# ActivationTanH

[source]

f(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))

# ActivationSELU

[source]

Scaled Exponential Linear Unit (SELU), as described in https://arxiv.org/pdf/1706.02515.pdf

# ActivationLReLU

[source]

Leaky ReLU

f(x) = max(0, x) + alpha min(0, x)

alpha defaults to 0.01

# ActivationSwish

[source]

f(x) = x sigmoid(x)

# ActivationSoftPlus

[source]

f(x) = log(1+e^x)