Extend DL4J functionality for custom layers.
There are two components to adding a custom layer:
1. Adding the layer configuration class: extends org.deeplearning4j.nn.conf.layers.Layer
2. Adding the layer implementation class: implements org.deeplearning4j.nn.api.Layer
The configuration class ((1) above) handles the layer settings. It's the one you would use when constructing a MultiLayerNetwork or ComputationGraph. You can add custom settings here and use them in your layer.
The implementation class ((2) above) has parameters and handles the network forward pass, backpropagation, etc. It is created from the org.deeplearning4j.nn.conf.layers.Layer.instantiate(...) method. In other words: the instantiate method is how we go from the configuration to the implementation; MultiLayerNetwork or ComputationGraph will call this method when initializing the network.
Examples of these are CustomLayer (the configuration class) and CustomLayerImpl (the implementation class). Both of these classes have extensive comments regarding their methods.
You'll note that in Deeplearning4j there are two DenseLayer classes, two GravesLSTM classes, etc.: the reason is that one is for the configuration and one is for the implementation. We have not followed this "same name" pattern here, to hopefully avoid confusion.
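For orientation, here is a minimal sketch of how the configuration class is used once both classes exist. CustomLayer and its builder settings are hypothetical stand-ins for whatever you implement:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .list()
        // The configuration class is used like any built-in layer; custom
        // settings are exposed through its (hypothetical) builder.
        .layer(new CustomLayer.Builder().nIn(100).nOut(50)
                .activation(Activation.TANH).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(50).nOut(10).activation(Activation.SOFTMAX).build())
        .build();

MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init(); // init() calls CustomLayer.instantiate(...) to create CustomLayerImpl
```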
Once you have added a custom layer, it is necessary to run some tests to ensure it is correct.
These tests should at a minimum include the following:
Tests to ensure that the JSON configuration (to/from JSON) works correctly
This is necessary for networks with your custom layer to function with both
model serialization (saving) and Spark training.
Gradient checks to ensure that the implementation is correct.
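A minimal sketch of both tests, assuming conf is a MultiLayerConfiguration containing your custom layer. The GradientCheckUtil arguments shown are typical values from DL4J's own tests, not prescriptive, and the exact signature may differ across DL4J versions:

```java
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.deeplearning4j.gradientcheck.GradientCheckUtil;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// 1. JSON round-trip: deserializing the serialized configuration
//    must reproduce the original exactly.
String json = conf.toJson();
assertEquals(conf, MultiLayerConfiguration.fromJson(json));

// 2. Gradient check: compare analytic gradients against numerical ones.
//    Gradient checks should be run with double precision data.
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();
INDArray input = Nd4j.rand(10, 100); // hypothetical input shape
INDArray labels = Nd4j.eye(10);      // one-hot labels: 10 examples, 10 classes
assertTrue(GradientCheckUtil.checkGradients(net, 1e-6, 1e-3, 1e-8,
        false, false, input, labels));
```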
A full custom layer example is available in our examples repository.
Importing the functional model.
Let's say you start with defining a simple MLP using Keras' functional API:
In Keras there are several ways to save a model. You can store the whole model (model definition, weights, and training configuration) as an HDF5 file, just the model configuration (as a JSON or YAML file), or just the weights (as an HDF5 file). Here's how you do each:
If you decide to save the full model, you will have access to the training configuration of the model; otherwise you don't. So if you want to further train your model in DL4J after import, keep that in mind and use model.save(...)
to persist your model.
Let's start with the recommended way, loading the full model back into DL4J (we assume it's on your class path):
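A minimal sketch, with a hypothetical file name (for a functional model the import returns a ComputationGraph):

```java
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;

// "full_model.h5" is a placeholder path to the model saved with model.save(...)
ComputationGraph model = KerasModelImport.importKerasModelAndWeights("full_model.h5");
```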
In case you didn't compile your Keras model, it will not come with a training configuration. In that case you need to explicitly tell model import to ignore the training configuration by setting the enforceTrainingConfig
flag to false, like this:
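For example, with the same placeholder path as above:

```java
ComputationGraph model =
        KerasModelImport.importKerasModelAndWeights("full_model.h5", false);
```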
To load just the model configuration from JSON, you use KerasModelImport as follows:
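A sketch for the functional API (file name hypothetical); for Sequential models the counterpart is importKerasSequentialConfiguration:

```java
import org.deeplearning4j.nn.conf.ComputationGraphConfiguration;

ComputationGraphConfiguration config =
        KerasModelImport.importKerasModelConfiguration("model.json");
```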
If additionally you also want to load the model weights with the configuration, here's what you do:
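For example, combining a JSON configuration with HDF5 weights (file names hypothetical):

```java
ComputationGraph model =
        KerasModelImport.importKerasModelAndWeights("model.json", "weights.h5");
```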
In the latter two cases no training configuration will be read.
Imports a PReLU layer from Keras.
KerasPReLU
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getPReLULayer
Get DL4J PReLULayer.
return PReLULayer
setWeights
Set weights for layer.
param weights PReLU layer weights
Imports a ThresholdedReLU layer from Keras.
KerasThresholdedReLU
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getActivationLayer
Get DL4J ActivationLayer.
return ActivationLayer
Imports a LeakyReLU layer from Keras.
KerasLeakyReLU
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getActivationLayer
Get DL4J ActivationLayer.
return ActivationLayer
Imports a Permute layer from Keras.
KerasPermute
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
isInputPreProcessor
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getInputPreprocessor
Gets appropriate DL4J InputPreProcessor for given InputTypes.
param inputType Array of InputTypes
return DL4J InputPreProcessor
throws InvalidKerasConfigurationException Invalid Keras config
see InputPreProcessor
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras Flatten layer as a DL4J {Cnn,Rnn}ToFeedForwardInputPreProcessor.
KerasFlatten
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
isInputPreProcessor
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getInputPreprocessor
Gets appropriate DL4J InputPreProcessor for given InputTypes.
param inputType Array of InputTypes
return DL4J InputPreProcessor
throws InvalidKerasConfigurationException Invalid Keras config
see org.deeplearning4j.nn.conf.InputPreProcessor
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Reshape layer from Keras.
KerasReshape
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
isInputPreProcessor
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getInputPreprocessor
Gets appropriate DL4J InputPreProcessor for given InputTypes.
param inputType Array of InputTypes
return DL4J InputPreProcessor
throws InvalidKerasConfigurationException Invalid Keras config
see org.deeplearning4j.nn.conf.InputPreProcessor
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras Merge layer as a DL4J Merge (graph) vertex.
TODO: handle axes arguments that alter merge behavior (requires changes to DL4J?)
KerasMerge
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
Imports a Dropout layer from Keras.
KerasDropout
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getDropoutLayer
Get DL4J DropoutLayer.
return DropoutLayer
Imports Keras masking layers.
KerasMasking
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getMaskingLayer
Get DL4J MaskZeroLayer.
return MaskZeroLayer
Keras wrapper for the DL4J dropout layer with spatial dropout; works for 1D to 3D inputs.
KerasSpatialDropout
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getSpatialDropoutLayer
Get DL4J DropoutLayer with spatial dropout.
return DropoutLayer
Wraps a DL4J SameDiffLambda into a KerasLayer
KerasLambda
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getSameDiffLayer
Get DL4J SameDiffLayer.
return SameDiffLayer
Imports an Activation layer from Keras.
KerasActivation
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getActivationLayer
Get DL4J ActivationLayer.
return ActivationLayer
Imports a Dense layer from Keras.
KerasDense
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getDenseLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
getNumParams
Returns number of trainable parameters in layer.
return number of trainable parameters (2)
setWeights
Set weights for layer.
param weights Dense layer weights
Imports a Keras RepeatVector layer
KerasRepeatVector
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getRepeatVectorLayer
Get DL4J RepeatVector.
return RepeatVector
Imports a 2D Convolution layer from Keras.
KerasConvolution2D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getConvolution2DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras Cropping 2D layer.
KerasCropping2D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getCropping2DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Keras Upsampling3D layer support
KerasUpsampling3D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras configuration exception
throws UnsupportedKerasConfigurationException Unsupported Keras configuration exception
getUpsampling3DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras configuration exception
throws UnsupportedKerasConfigurationException Unsupported Keras configuration exception
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a 1D Convolution layer from Keras.
KerasConvolution1D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getConvolution1DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
getInputPreprocessor
Gets appropriate DL4J InputPreProcessor for given InputTypes.
param inputType Array of InputTypes
return DL4J InputPreProcessor
throws InvalidKerasConfigurationException Invalid Keras configuration exception
see org.deeplearning4j.nn.conf.InputPreProcessor
setWeights
Set weights for layer.
param weights Map from parameter name to INDArray.
Keras Upsampling1D layer support
KerasUpsampling1D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras configuration exception
throws UnsupportedKerasConfigurationException Unsupported Keras configuration exception
getUpsampling1DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras configuration exception
throws UnsupportedKerasConfigurationException Unsupported Keras configuration exception
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Keras 2D atrous / dilated convolution layer. Note that in Keras 2 this layer has been removed; dilations are now available through the dilation_rate argument in regular Conv2D layers.
author: Max Pumperla
KerasAtrousConvolution2D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getAtrousConvolution2D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Keras 1D atrous / dilated convolution layer. Note that in Keras 2 this layer has been removed; dilations are now available through the dilation_rate argument in regular Conv1D layers.
author: Max Pumperla
KerasAtrousConvolution1D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getAtrousConvolution1D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras Cropping 3D layer.
KerasCropping3D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getCropping3DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras ZeroPadding 2D layer.
KerasZeroPadding2D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getZeroPadding2DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a 3D Convolution layer from Keras.
KerasConvolution3D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getConvolution3DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a 2D Deconvolution layer from Keras.
KerasDeconvolution2D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getDeconvolution2DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras ZeroPadding 3D layer.
KerasZeroPadding3D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getZeroPadding3DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Utility functionality for Keras convolution layers.
getConvolutionModeFromConfig
Get convolution mode from Keras layer configuration.
param layerConfig dictionary containing Keras layer configuration
return DL4J ConvolutionMode
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras ZeroPadding 1D layer.
KerasZeroPadding1D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getZeroPadding1DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras Cropping 1D layer.
KerasCropping1D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getCropping1DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a SpaceToDepth layer.
KerasSpaceToDepth
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras configuration exception
throws UnsupportedKerasConfigurationException Unsupported Keras configuration exception
getSpaceToDepthLayer
Get DL4J SpaceToDepth layer.
return SpaceToDepth layer
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Keras Upsampling2D layer support
KerasUpsampling2D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras configuration exception
throws UnsupportedKerasConfigurationException Unsupported Keras configuration exception
getUpsampling2DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras configuration exception
throws UnsupportedKerasConfigurationException Unsupported Keras configuration exception
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Keras separable convolution 2D layer support
KerasSeparableConvolution2D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras configuration
setWeights
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras configuration
throws UnsupportedKerasConfigurationException Unsupported Keras configuration
getSeparableConvolution2DLayer
Get DL4J SeparableConvolution2D.
return SeparableConvolution2D
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Keras depth-wise convolution 2D layer support
KerasDepthwiseConvolution2D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras configuration
setWeights
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras configuration
throws UnsupportedKerasConfigurationException Unsupported Keras configuration
getDepthwiseConvolution2DLayer
Get DL4J DepthwiseConvolution2D.
return DepthwiseConvolution2D
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Overview of model import.
Keras model import provides routines for importing neural network models originally configured and trained using Keras, a popular Python deep learning library.
Once you have imported your model into DL4J, our full production stack is at your disposal. We support import of all Keras model types, most layers and practically all utility functionality. Please check here for a complete list of supported Keras features.
Note to users: tf.keras models are also supported. Please check here for an overview of what to expect for tf.keras as well as other features. Our documentation has not yet been fully updated to reflect the differences between keras and tf.keras, so keep this in mind as you read the docs below. Migrating from keras to tf.keras mainly involves changing the imports in your Python script; the equivalent changes for model import happened in Deeplearning4j in beta7.
To import a Keras model, you need to create and serialize such a model first. Here's a simple example that you can use. The model is a simple MLP that takes mini-batches of vectors of length 100, has two Dense layers and predicts a total of 10 categories. After defining the model, we serialize it in HDF5 format.
If you put this model file (simple_mlp.h5) into the base of your project's resource folder, you can load the Keras model as a DL4J MultiLayerNetwork as follows:
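A minimal sketch, assuming the file sits on the class path (the exact ClassPathResource package varies between ND4J versions):

```java
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.io.ClassPathResource;

String path = new ClassPathResource("simple_mlp.h5").getFile().getPath();
MultiLayerNetwork model = KerasModelImport.importKerasSequentialModelAndWeights(path);
```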
This shows only how to import a Keras Sequential model. For more details take a look at both Functional Model import and Sequential Model import.
That's it! KerasModelImport
is your main entry point to model import; this class takes care of mapping Keras to DL4J concepts internally. As a user you just have to provide your model file; see our Getting started guide for more details and options for loading Keras models into DL4J.
You can now use your imported model for inference (here with dummy data for simplicity):
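For instance, a dummy mini-batch of 256 vectors of length 100, matching the model's input (a sketch):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray input = Nd4j.create(256, 100); // dummy (all-zero) mini-batch
INDArray output = model.output(input);  // class probabilities, shape [256, 10]
```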
Here's how you do training in DL4J for your imported model:
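Continuing the sketch with the dummy data from above:

```java
// In a real workflow the labels come from your dataset; fitting the model on
// its own predictions here only demonstrates the call.
model.fit(input, output);
```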
The full example just shown can be found in our DL4J examples.
To use Keras model import in your existing project, all you need to do is add the following dependency to your pom.xml.
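A sketch of the Maven coordinates; the version property is a placeholder that should match the DL4J version used elsewhere in your project:

```xml
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-modelimport</artifactId>
    <!-- placeholder: align with your other DL4J dependencies -->
    <version>${dl4j.version}</version>
</dependency>
```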
If you need a project to get started in the first place, consider cloning DL4J examples and follow the instructions in the repository to build the project.
DL4J Keras model import is backend agnostic. No matter which backend you choose (TensorFlow, Theano, CNTK), your models can be imported into DL4J.
We support import for a growing number of applications, check here for a full list of currently covered models. These applications include
Deep convolutional and Wasserstein GANs
UNET
ResNet50
SqueezeNet
MobileNet
Inception
Xception
An IncompatibleKerasConfigurationException message indicates that you are attempting to import a Keras model configuration that is not currently supported in Deeplearning4j (either because model import does not cover it, or because DL4J does not implement the layer or feature).
Once you have imported your model, we recommend our own ModelSerializer
class for further saving and reloading of your model.
You can inquire further by visiting the community forums. You might consider filing a feature request via GitHub so that this missing functionality can be placed on the DL4J development roadmap, or even sending us a pull request with the necessary changes!
Keras is a popular and user-friendly deep learning library written in Python. The intuitive API of Keras makes defining and running your deep learning models in Python easy. Keras allows you to choose which lower-level library it runs on, but provides a unified API for each such backend. Currently, Keras supports Tensorflow, CNTK and Theano backends.
There is often a gap between the production system of a company and the experimental setup of its data scientists. Keras model import allows data scientists to write their models in Python while still seamlessly integrating with the production stack.
Keras model import is targeted at users mainly familiar with writing their models in Python with Keras. With model import you can bring your Python models to production by allowing users to import their models into the DL4J ecosystem for either further training or evaluation purposes.
You should use this module when the experimentation phase of your project is completed and you need to ship your models to production. Konduit offers commercial support for Keras implementations in enterprise.
How to implement custom Keras layers for import in Deeplearning4J.
Many more advanced models will contain custom layers, i.e. layers that aren't included in Keras.
You can import those models too, but you will have to provide an implementation of that layer yourself, as the exported model file only provides us with a name for it.
Usually you will find out that you need to implement a custom layer when you see an exception like the following:
or
There are two ways of implementing a custom layer for Keras import. Which one is the right approach for you, depends on the type of layer you need to implement.
SameDiffLambdaLayer
Use this approach if your layer doesn't have any weights and defines just a computation. It is most useful when you have to define a custom layer because you are using a lambda
in your model definition. This is the approach you should be using when you've gotten the exception about no lambda layer being found.
KerasLayer
Use this approach if your layer needs its own weights. It is most useful when you have to define some complex layer that is more than just a simple computation. This is the approach you should be using when you've gotten the exception about an unsupported layer type.
Using a SameDiffLambdaLayer is pretty easy. You create a new class that extends it, and override the defineLayer and getOutputType methods.
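A minimal sketch, assuming SameDiffLambdaLayer's defineLayer(SameDiff, SDVariable) and getOutputType(int, InputType) signatures; the class name is hypothetical:

```java
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.samediff.SameDiffLambdaLayer;
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;

public class TimesThreeLambda extends SameDiffLambdaLayer {
    @Override
    public SDVariable defineLayer(SameDiff sd, SDVariable input) {
        // Pure computation, no trainable weights.
        return input.mul(3.0);
    }

    @Override
    public InputType getOutputType(int layerIndex, InputType inputType) {
        // Element-wise operation: the output shape equals the input shape.
        return inputType;
    }
}
```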
This simple lambda layer just multiplies its input by 3.
defineLayer
will only be called once to create the SameDiff graph that is used as the definition of this layer. Do not use information about the size of the inputs or other non-static sizes, like batch size, when defining the layer, or it may fail later on.
After defining your layer, you have to register it to make it available on import.
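A sketch of the registration call, using the hypothetical class from above; "lambda_1" is a placeholder for the actual layer name (see below):

```java
import org.deeplearning4j.nn.modelimport.keras.KerasLayer;

KerasLayer.registerLambdaLayer("lambda_1", new TimesThreeLambda());
```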
The correct name for your lambda layer will depend on the model you are importing. As you most likely learned that you need to implement the lambda layer from an exception, that exception should have given you the proper name already.
Implementing a full layer with weights is more complex than defining a lambda layer. You will have to create a new class that extends KerasLayer
and that reads the configuration of that layer and defines it appropriately.
For examples of how this was done, take a look at KerasLRN and KerasPoolHelper, which are custom layers that were needed to import GoogLeNet.
After you've defined your layer, you will have to register it to make it available on import:
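For instance, registering the KerasPoolHelper class mentioned above (a sketch; the registered name must match the layer's class name in the Keras model):

```java
import org.deeplearning4j.nn.modelimport.keras.KerasLayer;

KerasLayer.registerCustomLayer("PoolHelper", KerasPoolHelper.class);
```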
Again, the appropriate name will be apparent from the exception that notified you about needing to implement the custom layer in the first place.
Keras wrapper for DL4J dropout layer with GaussianNoise.
KerasGaussianNoise
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getGaussianNoiseLayer
Get DL4J DropoutLayer with Gaussian noise.
return DropoutLayer
Keras wrapper for DL4J dropout layer with AlphaDropout.
KerasAlphaDropout
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getAlphaDropoutLayer
Get DL4J DropoutLayer with Alpha dropout.
return DropoutLayer
Keras wrapper for DL4J dropout layer with GaussianDropout.
KerasGaussianDropout
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getGaussianDropoutLayer
Get DL4J DropoutLayer with Gaussian dropout.
return DropoutLayer
Imports an Embedding layer from Keras.
KerasEmbedding
Pass-through constructor for unit tests
throws UnsupportedKerasConfigurationException Unsupported Keras config
getEmbeddingLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
getNumParams
Returns number of trainable parameters in layer.
return number of trainable parameters (1)
setWeights
Set weights for layer.
param weights Embedding layer weights
Imports a BatchNormalization layer from Keras.
KerasBatchNormalization
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getBatchNormalizationLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
getNumParams
Returns number of trainable parameters in layer.
return number of trainable parameters (4)
setWeights
Set weights for layer.
param weights Map from parameter name to INDArray.
Imports a 1D locally connected layer from Keras.
KerasLocallyConnected1D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getLocallyConnected1DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
setWeights
Set weights for 1D locally connected layer.
param weights Map from parameter name to INDArray.
Imports a 2D locally connected layer from Keras.
KerasLocallyConnected2D
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getLocallyConnected2DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
setWeights
Set weights for 2D locally connected layer.
param weights Map from parameter name to INDArray.
Imports a Keras SimpleRNN layer as a DL4J SimpleRnn layer.
KerasSimpleRnn
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getSimpleRnnLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
getNumParams
Returns number of trainable parameters in layer.
return number of trainable parameters (12)
getInputPreprocessor
Gets appropriate DL4J InputPreProcessor for given InputTypes.
param inputType Array of InputTypes
return DL4J InputPreProcessor
throws InvalidKerasConfigurationException Invalid Keras configuration exception
see org.deeplearning4j.nn.conf.InputPreProcessor
getUnroll
Get whether SimpleRnn layer should be unrolled (for truncated BPTT).
return whether RNN should be unrolled (boolean)
setWeights
Set weights for layer.
param weights Simple RNN weights
throws InvalidKerasConfigurationException Invalid Keras configuration exception
Utility functions for Keras RNN layers
getUnrollRecurrentLayer
Get unroll parameter to decide whether to unroll RNN with BPTT or not.
param conf KerasLayerConfiguration
param layerConfig dictionary containing Keras layer properties
return boolean unroll parameter
throws InvalidKerasConfigurationException Invalid Keras configuration
getRecurrentDropout
Get recurrent weight dropout from Keras layer configuration. Non-zero dropout rates are currently not supported.
param conf KerasLayerConfiguration
param layerConfig dictionary containing Keras layer properties
return recurrent dropout rate
throws InvalidKerasConfigurationException Invalid Keras configuration
Imports a Keras LSTM layer as a DL4J LSTM layer.
KerasLSTM
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getLSTMLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
getNumParams
Returns number of trainable parameters in layer.
return number of trainable parameters (12)
getInputPreprocessor
Gets appropriate DL4J InputPreProcessor for given InputTypes.
param inputType Array of InputTypes
return DL4J InputPreProcessor
throws InvalidKerasConfigurationException Invalid Keras configuration exception
see org.deeplearning4j.nn.conf.InputPreProcessor
setWeights
Set weights for layer.
param weights LSTM layer weights
getUnroll
Get whether LSTM layer should be unrolled (for truncated BPTT).
return whether to unroll the LSTM
getGateActivationFromConfig
Get LSTM gate activation function from Keras layer configuration.
param layerConfig dictionary containing Keras layer configuration
return LSTM inner activation function
throws InvalidKerasConfigurationException Invalid Keras config
getForgetBiasInitFromConfig
Get LSTM forget gate bias initialization from Keras layer configuration.
param layerConfig dictionary containing Keras layer configuration
return LSTM forget gate bias init
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras 1D Pooling layer as a DL4J Subsampling layer.
KerasPooling1D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getSubsampling1DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Utility functionality for Keras pooling layers.
mapPoolingType
Map Keras pooling layers to DL4J pooling types.
param className name of the Keras pooling class
return DL4J pooling type
throws UnsupportedKerasConfigurationException Unsupported Keras config
Imports a Keras 3D Pooling layer as a DL4J Subsampling3D layer.
KerasPooling3D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getSubsampling3DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras Pooling layer as a DL4J Subsampling layer.
KerasGlobalPooling
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getGlobalPoolingLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getInputPreprocessor
Gets appropriate DL4J InputPreProcessor for given InputTypes.
param inputType Array of InputTypes
return DL4J InputPreProcessor
throws InvalidKerasConfigurationException Invalid Keras config
see org.deeplearning4j.nn.conf.InputPreProcessor
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Imports a Keras 2D Pooling layer as a DL4J Subsampling layer.
KerasPooling2D
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration.
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getSubsampling2DLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
param enforceTrainingConfig whether to enforce training-related configuration options
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
Supported Keras activations.
We support all Keras activation functions, namely:
softmax
elu
selu
softplus
softsign
relu
tanh
sigmoid
hard_sigmoid
linear
The mapping of Keras to DL4J activation functions is defined in KerasActivationUtils
Supported Keras loss functions.
DL4J supports all available Keras losses except for logcosh. The full set of Keras losses is:
mean_squared_error
mean_absolute_error
mean_absolute_percentage_error
mean_squared_logarithmic_error
squared_hinge
hinge
categorical_hinge
logcosh
categorical_crossentropy
sparse_categorical_crossentropy
binary_crossentropy
kullback_leibler_divergence
poisson
cosine_proximity
The mapping of Keras loss functions can be found in KerasLossUtils.
Supported Keras features.
While not every concept in DL4J has an equivalent in Keras and vice versa, many of the key concepts can be matched. Importing Keras models into DL4J is done in our deeplearning4j-modelimport module. Below is a comprehensive list of currently supported features.
Note that we also support importing tf.keras models. The format changed only slightly from keras to tf.keras; we handle this transition from beta7 and above.
Mapping Keras to DL4J layers is done in the layers sub-module of model import. The structure of this project loosely reflects the structure of Keras.
❌ GRU
✅ LSTM
❌ ConvLSTM2D
✅ Add / add
✅ Multiply / multiply
✅ Subtract / subtract
✅ Average / average
✅ Maximum / maximum
✅ Concatenate / concatenate
❌ Dot / dot
✅ PReLU
✅ ELU
❌ TimeDistributed
✅ mean_squared_error
✅ mean_absolute_error
✅ mean_absolute_percentage_error
✅ mean_squared_logarithmic_error
✅ squared_hinge
✅ hinge
✅ categorical_hinge
❌ logcosh
✅ categorical_crossentropy
✅ sparse_categorical_crossentropy
✅ binary_crossentropy
✅ kullback_leibler_divergence
✅ poisson
✅ cosine_proximity
✅ softmax
✅ elu
✅ selu
✅ softplus
✅ softsign
✅ relu
✅ tanh
✅ sigmoid
✅ hard_sigmoid
✅ linear
✅ Zeros
✅ Ones
✅ Constant
✅ RandomNormal
✅ RandomUniform
✅ TruncatedNormal
✅ VarianceScaling
✅ Orthogonal
✅ Identity
✅ lecun_uniform
✅ lecun_normal
✅ glorot_normal
✅ glorot_uniform
✅ he_normal
✅ he_uniform
✅ l1
✅ l2
✅ l1_l2
✅ max_norm
✅ non_neg
✅ unit_norm
✅ min_max_norm
✅ SGD
✅ RMSprop
✅ Adagrad
✅ Adadelta
✅ Adam
✅ Adamax
✅ Nadam
❌ TFOptimizer
Builds a DL4J Bidirectional layer from a Keras Bidirectional layer wrapper
KerasBidirectional
Pass-through constructor from KerasLayer
param kerasVersion major keras version
throws UnsupportedKerasConfigurationException Unsupported Keras config
getUnderlyingRecurrentLayer
Constructor from parsed Keras layer configuration dictionary.
param layerConfig dictionary containing Keras layer configuration
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
getBidirectionalLayer
Get DL4J Bidirectional layer.
return Bidirectional Layer
getOutputType
Get layer output type.
param inputType Array of InputTypes
return output type as InputType
throws InvalidKerasConfigurationException Invalid Keras config
getNumParams
Returns number of trainable parameters in layer.
return number of trainable parameters
getInputPreprocessor
Gets appropriate DL4J InputPreProcessor for given InputTypes.
param inputType Array of InputTypes
return DL4J InputPreProcessor
throws InvalidKerasConfigurationException Invalid Keras configuration exception
see org.deeplearning4j.nn.conf.InputPreProcessor
setWeights
Set weights for Bidirectional layer.
param weights Map of weights
Supported Keras weight initializers.
DL4J supports all available Keras initializers, namely:
Zeros
Ones
Constant
RandomNormal
RandomUniform
TruncatedNormal
VarianceScaling
Orthogonal
Identity
lecun_uniform
lecun_normal
glorot_normal
glorot_uniform
he_normal
he_uniform
The mapping of Keras to DL4J initializers can be found in KerasInitilizationUtils.
Supported Keras constraints.
All Keras constraints are supported:
max_norm
non_neg
unit_norm
min_max_norm
Mapping Keras to DL4J constraints happens in KerasConstraintUtils.
Supported Keras optimizers
All standard Keras optimizers are supported, but importing custom TensorFlow optimizers won't work:
SGD
RMSprop
Adagrad
Adadelta
Adam
Adamax
Nadam
TFOptimizer
Keras model import API
Reads stored Keras configurations and weights from one of two archives: either
a single HDF5 file storing the model and training JSON configurations and weights, or
a separate text file storing the model JSON configuration plus an HDF5 file storing the weights.
importKerasModelAndWeights
Load Keras (Functional API) Model saved using model.save(…).
param modelHdf5Stream InputStream containing HDF5 archive storing Keras Model
param enforceTrainingConfig whether to enforce training configuration options
return ComputationGraph
see ComputationGraph
importKerasModelAndWeights
Load Keras (Functional API) Model saved using model.save(…).
param modelHdf5Stream InputStream containing HDF5 archive storing Keras Model
return ComputationGraph
see ComputationGraph
importKerasSequentialModelAndWeights
Load Keras Sequential model saved using model.save(…).
param modelHdf5Stream InputStream containing HDF5 archive storing Keras Sequential model
param enforceTrainingConfig whether to enforce training configuration options
return MultiLayerNetwork
see MultiLayerNetwork
importKerasSequentialModelAndWeights
Load Keras Sequential model saved using model.save(…).
param modelHdf5Stream InputStream containing HDF5 archive storing Keras Sequential model
return MultiLayerNetwork
see MultiLayerNetwork
importKerasModelAndWeights
Load Keras (Functional API) Model saved using model.save(…).
param modelHdf5Filename path to HDF5 archive storing Keras Model
param inputShape optional input shape for models that come without one (e.g. notop = false models)
param enforceTrainingConfig whether to enforce training configuration options
return ComputationGraph
throws IOException IO exception
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
see ComputationGraph
importKerasModelAndWeights
Load Keras (Functional API) Model saved using model.save(…).
param modelHdf5Filename path to HDF5 archive storing Keras Model
param enforceTrainingConfig whether to enforce training configuration options
return ComputationGraph
throws IOException IO exception
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
see ComputationGraph
importKerasModelAndWeights
Load Keras (Functional API) Model saved using model.save(…).
param modelHdf5Filename path to HDF5 archive storing Keras Model
return ComputationGraph
throws IOException IO exception
throws InvalidKerasConfigurationException Invalid Keras config
throws UnsupportedKerasConfigurationException Unsupported Keras config
see ComputationGraph
importKerasSequentialModelAndWeights
Load Keras Sequential model saved using model.save(…).
param modelHdf5Filename path to HDF5 archive storing Keras Sequential model
param inputShape optional input shape for models that come without such (e.g. notop = false models)
param enforceTrainingConfig whether to enforce training configuration options
return MultiLayerNetwork
throws IOException IO exception
see MultiLayerNetwork
importKerasSequentialModelAndWeights
Load Keras Sequential model saved using model.save(…).
param modelHdf5Filename path to HDF5 archive storing Keras Sequential model
param enforceTrainingConfig whether to enforce training configuration options
return MultiLayerNetwork
throws IOException IO exception
see MultiLayerNetwork
importKerasSequentialModelAndWeights
Load Keras Sequential model saved using model.save(…).
param modelHdf5Filename path to HDF5 archive storing Keras Sequential model
return MultiLayerNetwork
throws IOException IO exception
see MultiLayerNetwork
importKerasModelAndWeights
Load Keras (Functional API) Model for which the configuration and weights were saved separately using calls to model.to_json() and model.save_weights(…).
param modelJsonFilename path to JSON file storing Keras Model configuration
param weightsHdf5Filename path to HDF5 archive storing Keras model weights
param enforceTrainingConfig whether to enforce training configuration options
return ComputationGraph
throws IOException IO exception
see ComputationGraph
importKerasModelAndWeights
Load Keras (Functional API) Model for which the configuration and weights were saved separately using calls to model.to_json() and model.save_weights(…).
param modelJsonFilename path to JSON file storing Keras Model configuration
param weightsHdf5Filename path to HDF5 archive storing Keras model weights
return ComputationGraph
throws IOException IO exception
see ComputationGraph
importKerasSequentialModelAndWeights
Load Keras Sequential model for which the configuration and weights were saved separately using calls to model.to_json() and model.save_weights(…).
param modelJsonFilename path to JSON file storing Keras Sequential model configuration
param weightsHdf5Filename path to HDF5 archive storing Keras model weights
param enforceTrainingConfig whether to enforce training configuration options
return MultiLayerNetwork
throws IOException IO exception
see MultiLayerNetwork
importKerasSequentialModelAndWeights
Load Keras Sequential model for which the configuration and weights were saved separately using calls to model.to_json() and model.save_weights(…).
param modelJsonFilename path to JSON file storing Keras Sequential model configuration
param weightsHdf5Filename path to HDF5 archive storing Keras model weights
return MultiLayerNetwork
throws IOException IO exception
see MultiLayerNetwork
importKerasModelConfiguration
Load Keras (Functional API) Model for which the configuration was saved separately using model.to_json().
param modelJsonFilename path to JSON file storing Keras Model configuration
param enforceTrainingConfig whether to enforce training configuration options
return ComputationGraph
throws IOException IO exception
see ComputationGraph
importKerasModelConfiguration
Load Keras (Functional API) Model for which the configuration was saved separately using model.to_json().
param modelJsonFilename path to JSON file storing Keras Model configuration
return ComputationGraph
throws IOException IO exception
see ComputationGraph
importKerasSequentialConfiguration
Load Keras Sequential model for which the configuration was saved separately using model.to_json().
param modelJsonFilename path to JSON file storing Keras Sequential model configuration
param enforceTrainingConfig whether to enforce training configuration options
return MultiLayerNetwork
throws IOException IO exception
see MultiLayerNetwork
importKerasSequentialConfiguration
Load Keras Sequential model for which the configuration was saved separately using model.to_json().
param modelJsonFilename path to JSON file storing Keras Sequential model configuration
return MultiLayerNetwork
throws IOException IO exception
see MultiLayerNetwork
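To make the method variants above concrete, here is a minimal sketch of calling the import API from Java. The file names are placeholders, not files shipped with DL4J:

```java
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

public class KerasImportExample {
    public static void main(String[] args) throws Exception {
        // Functional API model stored in a single HDF5 archive -> ComputationGraph
        ComputationGraph graph =
                KerasModelImport.importKerasModelAndWeights("full_model.h5", false);

        // Sequential model stored in a single HDF5 archive -> MultiLayerNetwork
        MultiLayerNetwork net =
                KerasModelImport.importKerasSequentialModelAndWeights("sequential_model.h5");

        // Configuration (JSON) and weights (HDF5) saved separately
        MultiLayerNetwork net2 = KerasModelImport
                .importKerasSequentialModelAndWeights("model.json", "weights.h5");
    }
}
```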
How to visualize, monitor and debug neural network learning.
Note: the information here pertains to DL4J versions 1.0.0-beta6 and later.
DL4J provides a user interface to visualize the current network status and progress of training in your browser, in real time. The UI is typically used to help with tuning neural networks - i.e., the selection of hyperparameters (such as the learning rate) to obtain good performance for a network.
Step 1: Add the Deeplearning4j UI dependency to your project.
Step 2: Enable the UI in your project
This is relatively straightforward:
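A minimal sketch of the required code follows (the package locations shown are those used by recent releases and may differ slightly across versions; net is your MultiLayerNetwork or ComputationGraph):

```java
import org.deeplearning4j.ui.api.UIServer;
import org.deeplearning4j.ui.model.stats.StatsListener;
import org.deeplearning4j.ui.model.storage.InMemoryStatsStorage;

// Initialize the user interface backend
UIServer uiServer = UIServer.getInstance();

// Configure where the network information (gradients, score vs. time, etc.) is stored
InMemoryStatsStorage statsStorage = new InMemoryStatsStorage();

// Attach the StatsStorage instance to the UI so its contents can be visualized
uiServer.attach(statsStorage);

// Add a StatsListener to collect this information from the network as it trains
net.setListeners(new StatsListener(statsStorage));
```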
To access the UI, open your browser and go to http://localhost:9000/train/overview
. You can set the port by using the org.deeplearning4j.ui.port
system property: i.e., to use port 9001, pass the following to the JVM on launch: -Dorg.deeplearning4j.ui.port=9001
Information will then be collected and routed to the UI when you call the fit
method on your network.
The overview page (one of 3 available pages) contains the following information:
Top left: score vs iteration chart - this is the value of the loss function on the current minibatch
Top right: model and training information
Bottom left: Ratio of updates to parameters (by layer) for all network weights vs. iteration
Bottom right: Standard deviations (vs. time) of: activations, gradients and updates
Note that the bottom two charts display the logarithm (base 10) of the values. Thus a value of -3 on the update:parameter ratio chart corresponds to a ratio of 10^-3 = 0.001.
The ratio of updates to parameters is specifically the ratio of mean magnitudes of these values (i.e., log10(mean(abs(updates)) / mean(abs(parameters)))).
See the later section of this page on how to use these values in practice.
The model page contains a graph of the neural network layers, which operates as a selection mechanism. Click on a layer to display information for it.
On the right, the following charts are available, after selecting a layer:
Table of layer information
Update to parameter ratio for this layer, as per the overview page. The components of this ratio (the parameter and update mean magnitudes) are also available via tabs.
Layer activations (mean and mean +/- 2 standard deviations) over time
Histograms of parameters and updates, for each parameter type
Learning rate vs. time (note this will be flat, unless learning rate schedules are used)
Note: parameters are labeled as follows: weights (W) and biases (b). For recurrent neural networks, W refers to the weights connecting the layer to the layer below, and RW refers to the recurrent weights (i.e., those between time steps).
The DL4J UI can be used with Spark. However, as of 0.7.0, conflicting dependencies mean that running the UI and Spark in the same JVM can be difficult.
Two alternatives are available:
Collect and save the relevant stats, to be visualized (offline) at a later point
Run the UI on a separate server, and use the remote UI functionality to upload the data from the Spark master to your UI instance
Collecting Stats for Later Offline Use
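To collect the stats to a file instead of memory, attach a file-backed stats storage to the network. A sketch (the file name is a placeholder; package locations may vary by version):

```java
import org.deeplearning4j.ui.model.stats.StatsListener;
import org.deeplearning4j.ui.model.storage.FileStatsStorage;
import java.io.File;

// Stats are written to this file; it can be copied off the cluster for later viewing
FileStatsStorage statsStorage = new FileStatsStorage(new File("myNetworkTrainingStats.dl4j"));
net.setListeners(new StatsListener(statsStorage));
```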
Then, later you can load and display the saved information using:
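For example (a sketch using the same classes as above; statsFile points to the file written during training):

```java
FileStatsStorage statsStorage = new FileStatsStorage(statsFile);  // file saved during training
UIServer.getInstance().attach(statsStorage);
```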
Using the Remote UI Functionality
First, in the JVM running the UI (note this is the server):
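Something along these lines (a sketch; remote posting must be enabled explicitly):

```java
UIServer uiServer = UIServer.getInstance();
uiServer.enableRemoteListener();  // necessary: remote support is not enabled by default
```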
This will require the deeplearning4j-ui
dependency. (Note: this is the server, not the client; the client uses deeplearning4j-ui-model, as described below.)
To avoid dependency conflicts with Spark, you should use the deeplearning4j-ui-model
dependency to get the StatsListener, not the full deeplearning4j-ui
UI dependency.
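The client-side setup is then a sketch like the following (the exact package names for StatsStorageRouter and RemoteUIStatsStorageRouter vary between DL4J versions; sparkNet is your SparkDl4jMultiLayer or SparkComputationGraph):

```java
// Route stats to the remote UI instead of a local StatsStorage
StatsStorageRouter remoteUIRouter = new RemoteUIStatsStorageRouter("http://UI_MACHINE_IP:9000");
sparkNet.setListeners(remoteUIRouter, Collections.singletonList(new StatsListener(null)));
```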
Note: you should replace UI_MACHINE_IP
with the IP address of the machine running the user interface instance.
Tuning neural networks is often more an art than a science. However, here are some ideas that may be useful:
Overview Page - Model Score vs. Iteration Chart
The score vs. iteration should (overall) go down over time.
If the score increases consistently, your learning rate is likely set too high. Try reducing it until scores become more stable.
Increasing scores can also be indicative of other network issues, such as incorrect data normalization
If the score is flat or decreases very slowly (over a few hundred iterations) (a) your learning rate may be too low, or (b) you might be having difficulties with optimization. In the latter case, if you are using the SGD updater, try a different updater such as Nesterovs (momentum), RMSProp or Adagrad.
Note that data that isn't shuffled (i.e., each minibatch contains only one class, for classification) can result in very rough or abnormal-looking score vs. iteration graphs
Some noise in this line chart is expected (i.e., the line will go up and down within a small range). However, if the scores vary significantly between runs, or the variation is very large, this can be a problem
The issues mentioned above (learning rate, normalization, data shuffling) may contribute to this.
Setting the minibatch size to a very small number of examples can also contribute to noisy score vs. iteration graphs, and might lead to optimization difficulties
Overview Page and Model Page - Using the Update: Parameter Ratio Chart
The ratio of mean magnitude of updates to parameters is provided on both the overview and model pages
"Mean magnitude" = the average of the absolute value of the parameters or updates at the current time step
The most important use of this ratio is in selecting a learning rate. As a rule of thumb: this ratio should be around 1:1000 = 0.001. On the (log10) chart, this corresponds to a value of -3 (i.e., 10^-3 = 0.001)
Note that this is a rough guide only, and may not be appropriate for all networks. It's often a good starting point, however.
If the ratio diverges significantly from this (for example, > -2 (i.e., 10^-2 = 0.01) or < -4 (i.e., 10^-4 = 0.0001)), your parameters may be too unstable to learn useful features, or may change too slowly to learn useful features
To change this ratio, adjust your learning rate (or sometimes, parameter initialization). In some networks, you may need to set the learning rate differently for different layers.
Keep an eye out for unusually large spikes in the ratio: this may indicate exploding gradients
Model Page: Layer Activations (vs. Time) Chart
This chart can be used to detect vanishing or exploding activations (due to poor weight initialization, too much regularization, lack of data normalization, or too high a learning rate).
This chart should ideally stabilize over time (usually a few hundred iterations)
A good standard deviation for the activations is on the order of 0.5 to 2.0. Values significantly outside of this range may indicate one of the problems mentioned above.
Model Page: Layer Parameters Histogram
The layer parameters histogram is displayed for the most recent iteration only.
For weights, these histograms should have an approximately Gaussian (normal) distribution, after some time
For biases, these histograms will generally start at 0, and will usually end up being approximately Gaussian
One exception to this is for LSTM recurrent neural network layers: by default, the biases for one gate (the forget gate) are set to 1.0 (though this is configurable), to help in learning dependencies across long time periods. This results in the bias graphs initially having many biases around 0.0, with another set of biases around 1.0
Keep an eye out for parameters that are diverging to +/- infinity: this may be due to too high a learning rate, or insufficient regularization (try adding some L2 regularization to your network).
Keep an eye out for biases that become very large. This can sometimes occur in the output layer for classification, if the distribution of classes is very imbalanced
Model Page: Layer Updates Histogram
The layer update histogram is displayed for the most recent iteration only.
Note that these are the updates - i.e., the gradients after applying learning rate, momentum, regularization etc
As with the parameter graphs, these should have an approximately Gaussian (normal) distribution
Keep an eye out for very large values: this can indicate exploding gradients in your network
Exploding gradients are problematic as they can 'mess up' the parameters of your network
In this case, it may indicate a weight initialization, learning rate or input/labels data normalization issue
Model Page: Parameter Learning Rates Chart
This chart simply shows the learning rates of the parameters of the selected layer, over time.
If you are not using learning rate schedules, the chart will be flat. If you are using learning rate schedules, you can use this chart to track the current value of the learning rate (for each parameter), over time.
The recommended solution (for Maven) is to use the Maven Shade plugin to produce an uber-jar, configured as follows:
Then, create your uber-jar with mvn package
and run via cd target && java -cp dl4j-examples-0.9.1-bin.jar org.deeplearning4j.examples.userInterface.UIExample
. Note the "-bin" suffix for the generated JAR file: this includes all dependencies.
Note also that this Maven Shade approach is configured for DL4J's examples repository.
Understanding common errors like NaNs and tuning hyperparameters.
Neural networks can be difficult to tune. If the network hyperparameters are poorly chosen, the network may learn slowly, or perhaps not at all. This page aims to provide some baseline steps you should take when tuning your network.
Many of these tips have already been discussed in the academic literature. Our purpose is to consolidate them in one site and express them as clearly as possible.
What's the distribution of your data? Are you scaling it properly? As a general rule:
For continuous values: you want these to be in the range of -1 to 1, 0 to 1, or distributed normally with mean 0 and standard deviation 1. This does not have to be exact, but ensuring your inputs are approximately in this range can help during training. Scale down large inputs, and scale up small inputs.
For discrete classes (and, for classification problems, for the output), generally use a one-hot representation. That is, if you have 3 classes, then your data will be represented as [1,0,0], [0,1,0] or [0,0,1] for each of the 3 classes respectively.
Note that it's very important to use the exact same normalization method for both the training data and testing data.
Deeplearning4j supports several different kinds of weight initializations with the weightInit parameter. These are set using the .weightInit(WeightInit) method in your configuration.
You need to make sure your weights are neither too big nor too small. Xavier weight initialization is usually a good choice for this. For networks with rectified linear (relu) or leaky relu activations, RELU weight initialization is a sensible choice.
An epoch is defined as a full pass of the data set.
The learning rate is one of, if not the most important hyperparameter. If this is too large or too small, your network may learn very poorly, very slowly, or not at all. Typical values for the learning rate are in the range of 0.1 to 1e-6, though the optimal learning rate is usually data (and network architecture) specific. Some simple advice is to start by trying three different learning rates – 1e-1, 1e-3, and 1e-6 – to get a rough idea of what it should be, before further tuning this. Ideally, you would run models with different learning rates simultaneously to save time.
For training neural networks in a distributed manner, you may need a different (frequently higher) learning rate compared to training the same network on a single machine.
Note that if you're using multiple GPUs, this will affect your scheduling. For example, if you have 2x GPUs, then you will need to divide the iterations in your schedule by 2, since the throughput of your training process will be double, and the learning rate schedule is only applicable to the local GPU.
There are two aspects to be aware of, with regard to the choice of activation function.
First, the activation function of the hidden (non-output) layers. As a general rule, 'relu' or 'leakyrelu' activations are good choices for this. Some other activation functions (tanh, sigmoid, etc) are more prone to vanishing gradient problems, which can make learning much harder in deep neural networks. However, for LSTM layers, the tanh activation function is still commonly used.
Second, regarding the activation function for the output layer: this is usually application specific. For classification problems, you generally want to use the softmax activation function, combined with the negative log likelihood / MCXENT (multi-class cross entropy). The softmax activation function gives you a probability distribution over classes (i.e., outputs sum to 1.0). For regression problems, the "identity" activation function is frequently a good choice, in conjunction with the MSE (mean squared error) loss function.
Loss functions for each neural network layer can either be used in pretraining, to learn better weights, or in classification (on the output layer) for achieving some result. (In the example above, classification happens in the override section.)
Your net's purpose will determine the loss function you use. For pretraining, choose reconstruction entropy. For classification, use multiclass cross entropy.
Regularization methods can help to avoid overfitting during training. Overfitting occurs when the network predicts the training set very well, but makes poor predictions on data the network has never seen. One way to think about overfitting is that the network memorizes the training data (instead of learning the general relationships in it).
Common types of regularization include:
l1 and l2 regularization penalizes large network weights, and avoids weights becoming too large. Some level of l2 regularization is commonly used in practice. However, note that if the l1 or l2 regularization coefficients are too high, they may over-penalize the network, and stop it from learning. Common values for l2 regularization are 1e-3 to 1e-6.
Dropconnect (conceptually similar to dropout, but used much less frequently)
Restricting the total network size (i.e., limiting the number of layers and the size of each layer)
To use l1/l2/dropout regularization, use .regularization(true) followed by .l1(x), .l2(y), .dropout(z) respectively. Note that z in dropout(z) is the probability of retaining an activation.
A minibatch refers to the number of examples used at a time, when computing gradients and parameter updates. In practice (for all but the smallest data sets), it is standard to break your data set up into a number of minibatches.
The ideal minibatch size will vary. For example, a minibatch size of 10 is frequently too small for GPUs, but can work on CPUs. A minibatch size of 1 will allow a network to train, but will not reap the benefits of parallelism. 32 may be a sensible starting point to try, with minibatches in the range of 16-128 (sometimes smaller or larger, depending on the application and type of network) being common.
In DL4J, the term 'updater' refers to training mechanisms such as momentum, RMSProp, adagrad, and others. Using one of these methods can result in much faster network training compared to 'vanilla' stochastic gradient descent. You can set the updater using the .updater(Updater) configuration option.
The optimization algorithm is how updates are made, given the gradient. The simplest (and most commonly used) method is stochastic gradient descent (SGD), however DL4J also provides SGD with line search, conjugate gradient and LBFGS optimization algorithms. These latter algorithms are more powerful compared to SGD, but considerably more costly per parameter update due to a line search component, and aren't used as much in practice. Note that you can in principle combine any updater with any optimization algorithm.
A good default choice in most cases is to use the stochastic gradient descent optimization algorithm combined with one of the momentum/rmsprop/adagrad updaters, with momentum frequently being used in practice. Note that for momentum, the updater is called NESTEROVS (a reference to the Nesterovs variant of momentum), and the momentum rate can be set by the .momentum(double) option.
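Pulling the preceding knobs together, here is a sketch of a small configuration. The builder methods shown are those of recent DL4J versions, where updater and momentum settings are expressed as updater classes rather than the older .updater(Updater)/.momentum(double) calls mentioned above:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(123)
        .weightInit(WeightInit.XAVIER)        // sensible default for many architectures
        .updater(new Nesterovs(0.01, 0.9))    // SGD with Nesterov momentum
        .l2(1e-4)                             // l2 regularization on the weights
        .list()
        .layer(new DenseLayer.Builder().nIn(4).nOut(10)
                .activation(Activation.RELU)
                .dropOut(0.5)                 // probability of retaining an activation
                .build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(10).nOut(3)
                .activation(Activation.SOFTMAX)
                .build())
        .build();
```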
Q. Why is my Neural Network throwing nan values?
A. Backpropagation involves the multiplication of very small gradients; due to the limited precision of real-number representations, values very close to zero cannot be represented. The term for this issue is arithmetic underflow. If your neural network is throwing NaNs, the solution is to retune your network to avoid the very small gradients. This is more likely to be an issue with deeper neural networks.
You can try using the double data type, but it's usually recommended to retune the net first.
Following the basic tuning tips and monitoring the results is the way to ensure NaN doesn't show up anymore.
Terminate a training session given certain conditions.
When training neural networks, numerous decisions need to be made regarding the settings (hyperparameters) used, in order to obtain good performance. One such hyperparameter is the number of training epochs: that is, how many full passes of the data set (epochs) should be used? If we use too few epochs, we might underfit (i.e., not learn everything we can from the training data); if we use too many epochs, we might overfit (i.e., fit the 'noise' in the training data, and not the signal).
Early stopping attempts to remove the need to manually set this value. It can also be considered a type of regularization method (like L1/L2 weight decay and dropout) in that it can stop the network from overfitting.
The idea behind early stopping is relatively simple:
Split data into training and test sets
At the end of each epoch (or, every N epochs):
evaluate the network performance on the test set
if the network outperforms the previous best model: save a copy of the network at the current epoch
Take as our final model the model that has the best test set performance
This is shown graphically below:
The best model is the one saved at the time of the vertical dotted line - i.e., the model with the best accuracy on the test set.
Using DL4J's early stopping functionality requires you to provide a number of configuration options:
A score calculator, such as DataSetLossCalculator for a MultiLayerNetwork, or DataSetLossCalculatorCG for a ComputationGraph. This is used to calculate the score at every epoch (for example: the loss function value on a test set, or the accuracy on the test set)
How frequently we want to calculate the score function (default: every epoch)
One or more termination conditions, which tell the training process when to stop. There are two classes of termination conditions:
Epoch termination conditions: evaluated every N epochs
Iteration termination conditions: evaluated once per minibatch
A model saver, that defines how models are saved
An example, with an epoch termination condition of maximum of 30 epochs, a maximum of 20 minutes training time, calculating the score every epoch, and saving the intermediate results to disk:
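A sketch of such a configuration (myTestData, myTrainData, myNetworkConfiguration, and the save directory are assumed to be defined):

```java
import java.util.concurrent.TimeUnit;
import org.deeplearning4j.earlystopping.EarlyStoppingConfiguration;
import org.deeplearning4j.earlystopping.EarlyStoppingResult;
import org.deeplearning4j.earlystopping.saver.LocalFileModelSaver;
import org.deeplearning4j.earlystopping.scorecalc.DataSetLossCalculator;
import org.deeplearning4j.earlystopping.termination.MaxEpochsTerminationCondition;
import org.deeplearning4j.earlystopping.termination.MaxTimeIterationTerminationCondition;
import org.deeplearning4j.earlystopping.trainer.EarlyStoppingTrainer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

EarlyStoppingConfiguration<MultiLayerNetwork> esConf =
        new EarlyStoppingConfiguration.Builder<MultiLayerNetwork>()
                .epochTerminationConditions(new MaxEpochsTerminationCondition(30))
                .iterationTerminationConditions(
                        new MaxTimeIterationTerminationCondition(20, TimeUnit.MINUTES))
                .scoreCalculator(new DataSetLossCalculator(myTestData, true)) // score on test set
                .evaluateEveryNEpochs(1)
                .modelSaver(new LocalFileModelSaver(directory))               // save to disk
                .build();

EarlyStoppingTrainer trainer = new EarlyStoppingTrainer(esConf, myNetworkConfiguration, myTrainData);
EarlyStoppingResult<MultiLayerNetwork> result = trainer.fit();
MultiLayerNetwork bestModel = result.getBestModel();  // model from the best-scoring epoch
```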
You can also implement your own iteration and epoch termination conditions.
The early stopping implementation described above will only work with a single device. However, EarlyStoppingParallelTrainer
provides similar functionality to early stopping, and allows you to optimize for either multiple CPUs or GPUs. EarlyStoppingParallelTrainer
wraps your model in a ParallelWrapper
class and performs localized distributed training.
Note that EarlyStoppingParallelTrainer
doesn't support all of the functionality of its single-device counterpart. It is not UI-compatible and may not work with complex iteration listeners. This is due to how the model is distributed and copied in the background.
Supported Keras regularizers.
All Keras regularizers are supported by DL4J model import:
l1
l2
l1_l2
The mapping of Keras to DL4J regularizers is handled by the model import utilities, analogous to the initializer and constraint mappings above.
Tools and classes for evaluating neural network performance
When training or deploying a neural network, it is useful to know the accuracy of your model. In DL4J, the Evaluation class and its variants are available to evaluate your model's performance.
The Evaluation class is used to evaluate the performance of binary and multi-class classifiers (including time series classifiers). This section covers basic usage of the Evaluation class.
Given a dataset in the form of a DataSetIterator, the easiest way to perform evaluation is to use the built-in evaluate methods on MultiLayerNetwork and ComputationGraph:
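For example (a sketch; model is a trained MultiLayerNetwork and testIterator a DataSetIterator over the test set):

```java
Evaluation eval = model.evaluate(testIterator);
System.out.println(eval.stats());
```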
However, evaluation can also be performed on individual minibatches. Here is an example taken from our dataexamples/CSVExample in the project.
The CSV example has CSV data for 3 classes of flowers and builds a simple feed forward neural network to classify the flowers based on 4 measurements.
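The relevant lines look like the following (a reconstruction consistent with the description below; model, testData, and the log instance are assumed to be defined elsewhere in the example):

```java
Evaluation eval = new Evaluation(3);                      // 3 output classes
INDArray output = model.output(testData.getFeatures());  // model predictions for the test set
eval.eval(testData.getLabels(), output);                 // compare predictions to true labels
log.info(eval.stats());                                  // log the evaluation results
```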
The first line creates an Evaluation object with 3 classes. The second line gets the labels from the model for our test dataset. The third line uses the eval method to compare the labels array from the testdata with the labels generated from the model. The fourth line logs the evaluation data to the console.
The output.
By default the .stats() method displays the confusion matrix entries (one per line), Accuracy, Precision, Recall and F1 Score. Additionally the Evaluation Class can also calculate and return the following values:
Confusion Matrix
False Positive/Negative Rate
True Positive/Negative
Class Counts
Display the confusion matrix.
Additionally, the confusion matrix can be accessed directly, and converted to CSV or HTML.
To evaluate a network performing regression, use the RegressionEvaluation class.
As with the Evaluation class, RegressionEvaluation on a DataSetIterator can be performed as follows:
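For example (a sketch, assuming a trained model and a test-set iterator):

```java
RegressionEvaluation eval = model.evaluateRegression(testIterator);
System.out.println(eval.stats());
```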
Here is a code snippet with a single column; in this case, the neural network is predicting the age of shellfish based on measurements.
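A sketch of per-minibatch regression evaluation with one output column (testIterator and model are assumed to be defined):

```java
RegressionEvaluation eval = new RegressionEvaluation(1);         // a single output column
while (testIterator.hasNext()) {
    DataSet ds = testIterator.next();
    INDArray predicted = model.output(ds.getFeatures(), false);  // inference mode
    eval.eval(ds.getLabels(), predicted);
}
```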
Print the statistics for the evaluation with eval.stats(). The columns returned are Mean Squared Error, Mean Absolute Error, Root Mean Squared Error, Relative Squared Error, and the R^2 Coefficient of Determination.
When performing multiple types of evaluations (for example, Evaluation and ROC on the same network and dataset) it is more efficient to do this in one pass of the dataset, as follows:
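For example (a sketch using the varargs doEvaluation method):

```java
Evaluation eval = new Evaluation();
ROC roc = new ROC(0);                         // 0 = exact AUROC calculation
model.doEvaluation(testIterator, eval, roc);  // one pass over the data for both evaluations
```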
For most users, it is simply sufficient to use the MultiLayerNetwork.evaluate(DataSetIterator)
or MultiLayerNetwork.evaluateRegression(DataSetIterator)
and similar methods. These methods will properly handle masking, if mask arrays are present.
The EvaluationBinary class is used for evaluating networks with binary classification outputs - these networks usually have Sigmoid activation functions and XENT loss functions. The typical classification metrics, such as accuracy, precision, recall, F1 score, etc., are calculated for each output.
ROC (Receiver Operating Characteristic) is another commonly used evaluation metric for the evaluation of classifiers. Three ROC variants exist in DL4J:
ROC - for single binary label (as a single column probability, or 2 column 'softmax' probability distribution).
ROCBinary - for multiple binary labels
ROCMultiClass - for evaluation of non-binary classifiers, using a "one vs. all" approach
These classes have the ability to calculate the area under ROC curve (AUROC) and area under Precision-Recall curve (AUPRC), via the calculateAUC()
and calculateAUPRC()
methods. Furthermore, the ROC and Precision-Recall curves can be obtained using getRocCurve()
and getPrecisionRecallCurve()
.
The ROC and Precision-Recall curves can be exported to HTML for viewing using: EvaluationTools.exportRocChartsToHtmlFile(ROC, File)
, which will export an HTML file with both ROC and P-R curves that can be viewed in a browser.
Note that all three support two modes of operation/calculation
Thresholded (approximate AUROC/AUPRC calculation, no memory issues)
Exact (exact AUROC/AUPRC calculation, but can require large amount of memory with very large datasets - i.e., datasets with many millions of examples)
The number of bins can be set using the constructors. Exact mode is used by the default constructor new ROC(), or can be set explicitly using new ROC(0).
Deeplearning4j also has the EvaluationCalibration class, which is designed to analyze the calibration of a classifier. It provides a number of tools for this purpose:
Counts of the number of labels and predictions for each class
Reliability diagram (or reliability curve)
Residual plot (histogram)
Histograms of probabilities, including probabilities for each class separately
Evaluation of a classifier using EvaluationCalibration is performed in a similar manner to the other evaluation classes. The various plots/histograms can be exported to HTML for viewing using EvaluationTools.exportevaluationCalibrationToHtmlFile(EvaluationCalibration, File)
.
SparkDl4jMultiLayer and SparkComputationGraph both have similar methods for evaluation:
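For example (a sketch; sparkNet is a SparkDl4jMultiLayer and testData a JavaRDD<DataSet>):

```java
Evaluation eval = sparkNet.evaluate(testData);
System.out.println(eval.stats());
```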
Evaluation Classes useful for Multi-Task Network
Example:
The full set of UI examples is available in the DL4J examples repository.
Client (both Spark and standalone neural networks using simple deeplearning4j-nn): second, for your neural net (note this example is for Spark, but ComputationGraph and MultiLayerNetwork both have an equivalent setListeners method with the same usage):
Here's an excellent resource about visualizing neural net training. It is worth reading and understanding that page first.
In the case of recurrent neural networks, adding some may help
Too few epochs don't give your network enough time to learn good parameters; too many and you might overfit the training data. One way to choose the number of epochs is to use early stopping. Early stopping can also help to prevent the neural network from overfitting (i.e., can help the net generalize better to unseen data).
The usual approach to selecting an appropriate learning rate is to use the DL4J training UI to visualize the progress of training. You want to pay attention to both the loss over time and the ratio of update magnitudes to parameter magnitudes (a ratio of approximately 1:1000 is a good place to start). For more information on tuning the learning rate, see the visualization section of this document.
You can optionally define a learning rate policy for your neural network. A policy will change the learning rate over time, which can achieve better results since the learning rate can "slow down" to find closer local minima for convergence. A commonly used policy is scheduling; the DL4J examples include learning rate schedules used in practice.
Dropout, a frequently used regularization method, can be very effective. Dropout is most commonly used with a dropout rate of 0.5.
When training a neural network, it can sometimes be helpful to apply gradient normalization, to avoid the gradients being too large (the so-called exploding gradient problem, common in recurrent neural networks) or too small. This can be applied using the .gradientNormalization(GradientNormalization) and .gradientNormalizationThreshold(double) methods. Examples of gradient normalization, along with accompanying test code, are available in the DL4J repositories.
When training recurrent networks with long time series, it is generally advisable to use truncated backpropagation through time. With 'standard' backpropagation through time (the default in DL4J) the cost per parameter update can become prohibitive. For more details, see the documentation on recurrent networks.
F-beta, G-measure, Matthews Correlation Coefficient and more are also available.
Time series evaluation is very similar to the above evaluation approaches. Evaluation in DL4J is performed on all (non-masked) time steps separately - for example, a time series of length 10 will contribute 10 predictions/labels to an Evaluation object. One difference with time series is the (optional) presence of mask arrays, which are used to mark some time steps as missing or not present. See the documentation on masking for more details.
EvaluationBinary (described above) is used to evaluate binary classifiers.
A multi-task network is a network that is trained to produce multiple outputs. For example, a network given audio samples can be trained to both predict the language spoken and the gender of the speaker. Multi-task configuration is briefly described elsewhere in this documentation.
The DL4J transfer learning API enables users to:
Modify the architecture of an existing model
Fine tune learning configurations of an existing model.
Hold parameters of a specified layer constant during training, also referred to as “frozen”
Holding certain layers frozen on a network and training is effectively the same as training on a transformed version of the input, the transformed version being the intermediate outputs at the boundary of the frozen layers. This is the process of “feature extraction” from the input data and will be referred to as “featurizing” in this document.
The forward pass to “featurize” the input data on large, pretrained networks can be time consuming. DL4J also provides a TransferLearningHelper class with the following capabilities:
Featurize an input dataset to save for future use
Fit the model with frozen layers with a featurized dataset
Output from the model with frozen layers given a featurized input.
When running multiple epochs users will save on computation time since the expensive forward pass on the frozen layers/vertices will only have to be conducted once.
This example will use VGG16 to classify images belonging to five categories of flowers. The dataset will be downloaded automatically from http://download.tensorflow.org/example_images/flower_photos.tgz
Deeplearning4j has a new native model zoo. Read about the deeplearning4j-zoo module for more information on using pretrained models. Here, we load a pretrained VGG-16 model initialized with weights trained on ImageNet:
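A sketch of loading the pretrained model from the zoo (builder-style construction as used by recent zoo versions):

```java
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.zoo.ZooModel;
import org.deeplearning4j.zoo.model.VGG16;

ZooModel zooModel = VGG16.builder().build();
ComputationGraph vgg16 = (ComputationGraph) zooModel.initPretrained();  // ImageNet weights
```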
The final layer of VGG16 does a softmax regression on the 1000 classes in ImageNet. We modify the very last layer to give predictions for five classes, keeping the other layers frozen.
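A sketch along the lines of the flower-classification example (fineTuneConf is assumed to be a FineTuneConfiguration defined elsewhere; "fc2" and "predictions" are the relevant VGG16 layer names):

```java
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.transferlearning.TransferLearning;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

ComputationGraph vgg16Transfer = new TransferLearning.GraphBuilder(vgg16)
        .fineTuneConfiguration(fineTuneConf)
        .setFeatureExtractor("fc2")                   // freeze "fc2" and everything below it
        .removeVertexKeepConnections("predictions")   // drop the 1000-way ImageNet output
        .addLayer("predictions",
                new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nIn(4096).nOut(5)            // five flower classes
                        .weightInit(WeightInit.XAVIER)
                        .activation(Activation.SOFTMAX)
                        .build(),
                "fc2")
        .build();
```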
After a mere thirty iterations, which in this case is exposure to 450 images, the model attains an accuracy > 75% on the test dataset. This is rather remarkable considering the complexity of training an image classifier from scratch.
Here we hold all but the last three dense layers frozen and attach new dense layers onto them. Note that the primary intent here is to demonstrate the use of the API; whether this setup gives better results is secondary.
Say we have saved off our model from (B) and now want to allow “block_5” layers to train.
We use the transfer learning helper API. Note this freezes the layers of the model passed in.
Here is how you obtain the featurized version of the dataset at the specified layer “fc2”.
Here is how you can fit the model with a featurized dataset. vgg16Transfer is the model set up in (A) of section III.
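A combined sketch of the helper workflow described above (trainIter and featurizedDataSet are assumed to be defined):

```java
import org.deeplearning4j.nn.transferlearning.TransferLearningHelper;
import org.nd4j.linalg.dataset.DataSet;

// Freeze everything up to and including "fc2", in place, on the model passed in
TransferLearningHelper helper = new TransferLearningHelper(vgg16Transfer, "fc2");

// Featurize the input data once, so the expensive frozen forward pass is not repeated
while (trainIter.hasNext()) {
    DataSet featurized = helper.featurize(trainIter.next());
    // featurized can be saved to disk here for reuse in later epochs
}

// Fit only the unfrozen part of the network with a featurized DataSet
helper.fitFeaturized(featurizedDataSet);
```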
The TransferLearning builder returns a new instance of a dl4j model.
Keep in mind this is a second model that leaves the original one untouched. For large pretrained networks, take memory requirements into consideration and adjust your JVM heap space accordingly.
The trained model helper imports models from Keras without enforcing a training configuration.
Consequently, the last layer (as seen when printing the summary) is a dense layer and not an output layer with a loss function. To modify nOut of an output layer, we delete the layer vertex, keeping its connections, and add back a new output layer with the same name, a different nOut, a suitable loss function, etc.
Changing nOuts at a layer/vertex will modify nIn of the layers/vertices it fans into.
When changing nOut users can specify a weight initialization scheme or a distribution for the layer as well as a separate weight initialization scheme or distribution for the layers it fans out to.
Frozen layer configurations are not saved when writing the model to disk.
In other words, a model with frozen layers when serialized and read back in will not have any frozen layers. To continue training holding specific layers constant the user is expected to go through the transfer learning helper or the transfer learning API. There are two ways to “freeze” layers in a dl4j model.
On a copy: With the transfer learning API which will return a new model with the relevant frozen layers
In place: With the transfer learning helper API which will apply the frozen layers to the given model.
FineTune configurations will selectively update learning parameters.
For example, if a learning rate is specified, this learning rate will apply to all unfrozen/trainable layers in the model. However, newly added layers can override this learning rate by specifying their own learning rates in the layer builder.