
Activations

Non-linear activation functions applied to a layer's output before it is passed to the next layer.


What are activations?

At a simple level, an activation function decides whether a neuron should be activated, i.e. whether the information the neuron receives is relevant enough to pass on. The activation function is a non-linear transformation applied to the input signal, and the transformed output is sent on to the next layer of neurons.

Usage

The recommended way to use activations is to add an activation layer to your neural network and configure the desired activation function:

import org.deeplearning4j.nn.conf.ComputationGraphConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.ActivationLayer;
import org.nd4j.linalg.activations.Activation;

ComputationGraphConfiguration.GraphBuilder graphBuilder = new NeuralNetConfiguration.Builder()
    // set hyperparameters here
    .graphBuilder()
    // add inputs and any earlier layers here
    .addLayer("softmax", new ActivationLayer(Activation.SOFTMAX), "previous_input");
    // add more layers and the outputs, then call graphBuilder.build()
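
Activations can also be configured directly on an individual layer instead of adding a separate activation layer. A minimal sketch (the layer sizes below are placeholders, not values from this page):

import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.nd4j.linalg.activations.Activation;

// Setting the activation function on a layer via its builder
DenseLayer hidden = new DenseLayer.Builder()
    .nIn(784)    // placeholder input size
    .nOut(256)   // placeholder output size
    .activation(Activation.RELU)
    .build();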

Available activations

ActivationRectifiedTanh

Rectified tanh

Essentially max(0, tanh(x))

Underlying implementation is in native code

ActivationELU

f(x) = alpha * (exp(x) - 1.0) for x < 0; f(x) = x for x >= 0

alpha defaults to 1.0 if not specified
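
To use a non-default alpha, the activation can be passed as an IActivation instance rather than via the Activation enum. A small sketch, assuming the ActivationELU(alpha) constructor and that the ActivationLayer builder accepts an IActivation:

import org.deeplearning4j.nn.conf.layers.ActivationLayer;
import org.nd4j.linalg.activations.impl.ActivationELU;

// ELU layer with alpha = 0.5 instead of the default 1.0
// (assumes ActivationELU(double alpha) and Builder.activation(IActivation))
ActivationLayer elu = new ActivationLayer.Builder()
    .activation(new ActivationELU(0.5))
    .build();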

ActivationReLU

f(x) = max(0, x)

ActivationRationalTanh

f(x) = 1.7159 * tanh(2x/3), where tanh is approximated as tanh(y) ~ sgn(y) * (1 - 1/(1 + |y| + y^2 + 1.41645*y^4))

Rational tanh approximation from https://arxiv.org/pdf/1508.01292v3

Underlying implementation is in native code

ActivationThresholdedReLU

Thresholded RELU

f(x) = x for x > theta, f(x) = 0 otherwise. theta defaults to 1.0

ActivationReLU6

f(x) = min(max(x, cutoff), 6)

ActivationHardTanH

          ⎧  1, if x >  1
 f(x) =   ⎨ -1, if x < -1
          ⎩  x, otherwise

ActivationSigmoid

f(x) = 1 / (1 + exp(-x))

ActivationGELU

GELU activation function - Gaussian Error Linear Units

ActivationPReLU

Parametrized Rectified Linear Unit (PReLU)

f(x) = alpha * x for x < 0, f(x) = x for x >= 0

alpha has the same shape as x and is a learned parameter.
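
Because alpha is learned element-wise, each input element gets its own slope on the negative side. A plain-Java sketch of the forward pass (the alpha array stands in for the layer's learned parameter):

// PReLU forward pass with a per-element learned slope for negative inputs
static double[] prelu(double[] x, double[] alpha) {
    double[] out = new double[x.length];
    for (int i = 0; i < x.length; i++) {
        out[i] = x[i] >= 0 ? x[i] : alpha[i] * x[i];
    }
    return out;
}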

ActivationIdentity

f(x) = x

ActivationSoftSign

f_i(x) = x_i / (1 + |x_i|)

ActivationHardSigmoid

f(x) = min(1, max(0, 0.2x + 0.5))

ActivationSoftmax

f_i(x) = exp(x_i - shift) / sum_j exp(x_j - shift) where shift = max_i(x_i)
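
Subtracting shift = max_i(x_i) leaves the result unchanged, since the common factor exp(-shift) cancels between numerator and denominator, but it prevents overflow in exp for large inputs. A plain-Java sketch of the same computation:

// Numerically stable softmax: exp(x_i - max) / sum_j exp(x_j - max)
static double[] softmax(double[] x) {
    double max = Double.NEGATIVE_INFINITY;
    for (double v : x) max = Math.max(max, v);
    double sum = 0.0;
    double[] out = new double[x.length];
    for (int i = 0; i < x.length; i++) {
        out[i] = Math.exp(x[i] - max);
        sum += out[i];
    }
    for (int i = 0; i < x.length; i++) {
        out[i] /= sum;
    }
    return out;
}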

ActivationCube

f(x) = x^3

ActivationRReLU

f(x) = max(0, x) + alpha * min(0, x)

alpha is drawn from uniform(l, u) during training and set to (l + u) / 2 at test time; l and u default to 1/8 and 1/3 respectively. See Empirical Evaluation of Rectified Activations in Convolutional Network.
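
In other words, a fresh negative-side slope is sampled during training, while the fixed mean slope (l + u) / 2 is used at test time. A plain-Java sketch of that rule for a single value:

import java.util.Random;

// RReLU for one value: random negative-side slope in training, fixed mean slope at test time
static double rrelu(double x, double l, double u, boolean training, Random rng) {
    double alpha = training ? l + rng.nextDouble() * (u - l) : (l + u) / 2.0;
    return Math.max(0.0, x) + alpha * Math.min(0.0, x);
}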

ActivationTanH

f(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))

ActivationSELU

Scaled Exponential Linear Unit (SELU): f(x) = lambda * x for x > 0; f(x) = lambda * alpha * (exp(x) - 1.0) for x <= 0, with fixed constants alpha ~ 1.6733 and lambda ~ 1.0507. See https://arxiv.org/pdf/1706.02515.pdf

ActivationLReLU

Leaky ReLU: f(x) = max(0, x) + alpha * min(0, x). alpha defaults to 0.01

ActivationSwish

f(x) = x * sigmoid(x)

ActivationSoftPlus

f(x) = log(1+e^x)
