1 of 2

Model Zoo

Prebuilt model architectures and weights for out-of-the-box application.

Deeplearning4j has native model zoo that can be accessed and instantiated directly from DL4J. The model zoo also includes pretrained weights for different datasets that are downloaded automatically and checked for integrity using a checksum mechanism.

If you want to use the new model zoo, you will need to add it as a dependency. A Maven POM would add the following:

<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-zoo</artifactId>
    <version>1.0.0-M1.1</version>
</dependency>

Getting started

Once you've successfully added the zoo dependency to your project, you can start to import and use models. Each model extends the ZooModel abstract class and uses the InstantiableModel interface. These classes provide methods that help you initialize either an empty, fresh network or a pretrained network.

Initializing fresh configurations

You can instantly instantiate a model from the zoo using the .init() method. For example, if you want to instantiate a fresh, untrained network of AlexNet you can use the following code:

import org.deeplearning4j.zoo.model.AlexNet
import org.deeplearning4j.zoo.*;

...

int numberOfClassesInYourData = 1000;
int randomSeed = 123;

ZooModel zooModel = AlexNet.builder()
                .numClasses(numberOfClassesInYourData)
                .seed(randomSeed)
                .build();
Model net = zooModel.init();

If you want to tune parameters or change the optimization algorithm, you can obtain a reference to the underlying network configuration:

ZooModel zooModel = AlexNet.builder()
                .numClasses(numberOfClassesInYourData)
                .seed(randomSeed)
                .build();
MultiLayerConfiguration net = ((AlexNet) zooModel).conf();

Initializing pretrained weights

Some models have pretrained weights available, and a small number of models are pretrained across different datasets. PretrainedType is an enumerator that outlines different weight types, which includes IMAGENET, MNIST, CIFAR10, and VGGFACE.

For example, you can initialize a VGG-16 model with ImageNet weights like so:

import org.deeplearning4j.zoo.model.VGG16;
import org.deeplearning4j.zoo.*;

...

ZooModel zooModel = VGG16.builder().build();;
Model net = zooModel.initPretrained(PretrainedType.IMAGENET);

And initialize another VGG16 model with weights trained on VGGFace:

ZooModel zooModel = VGG16.builder().build();
Model net = zooModel.initPretrained(PretrainedType.VGGFACE);

If you're not sure whether a model contains pretrained weights, you can use the .pretrainedAvailable() method which returns a boolean. Simply pass a PretrainedType enum to this method, which returns true if weights are available.

Note that for convolutional models, input shape information follows the NCHW convention. So if a model's input shape default is new int[]{3, 224, 224}, this means the model has 3 channels and height/width of 224.

What's in the zoo?

The model zoo comes with well-known image recognition configurations in the deep learning community. The zoo also includes an LSTM for text generation, and a simple CNN for general image recognition.

You can find a complete list of models using this deeplearning4j-zoo Github link.

This includes ImageNet models such as VGG-16, ResNet-50, AlexNet, Inception-ResNet-v1, LeNet, and more.

Advanced usage

The zoo comes with a couple additional features if you're looking to use the models for different use cases.

Changing Inputs

Aside from passing certain configuration information to the constructor of a zoo model, you can also change its input shape using .setInputShape().

NOTE: this applies to fresh configurations only, and will not affect pretrained models:

int numberOfClassesInYourData = 10;
int randomSeed = 123;

ZooModel zooModel = ResNet50.builder()
        .numClasses(numberOfClassesInYourData)
        .seed(randomSeed)
        .build();
zooModel.setInputShape(new int[][]{{3, 28, 28}});

Transfer Learning

Pretrained models are perfect for transfer learning! You can read more about transfer learning using DL4J here.

Workspaces

Initialization methods often have an additional parameter named workspaceMode. For the majority of users you will not need to use this; however, if you have a large machine that has "beefy" specifications, you can pass WorkspaceMode.SINGLE for models such as VGG-19 that have many millions of parameters. To learn more about workspaces, please see this section.

Zoo Models

Available models

AlexNet

Model is built in dl4j based on available functionality and notes indicate where there are gaps waiting for enhancements.

Bias initialization in the paper is 1 in certain layers but 0.1 in the imagenetExample code Weight distribution uses 0.1 std for all layers in the paper but 0.005 in the dense layers in the imagenetExample code

Darknet19

FaceNetNN4Small2

InceptionResNetV1

LeNet

LeNet was an early promising achiever on the ImageNet dataset. References:

NASNet

Implementation of NASNet-A in Deeplearning4j. NASNet refers to Neural Architecture Search Network, a family of models that were designed automatically by learning the model architectures directly on the dataset of interest.

This implementation uses 1056 penultimate filters and an input shape of (3, 224, 224). You can change this.

ResNet50

Residual networks for deep learning.

SimpleCNN

SqueezeNet

U-Net

An implementation of SqueezeNet. Touts similar accuracy to AlexNet with a fraction of the parameters.

TextGenerationLSTM

LSTM designed for text generation. Can be trained on a corpus of text. For this model, numClasses is

TinyYOLO

String filename = “tiny-yolo-voc.h5”; ComputationGraph graph = KerasModelImport.importKerasModelAndWeights(filename, false); INDArray priors = Nd4j.create(priorBoxes);

FineTuneConfiguration fineTuneConf = new FineTuneConfiguration.Builder() .seed(seed) .iterations(iterations) .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT) .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer) .gradientNormalizationThreshold(1.0) .updater(new Adam.Builder().learningRate(1e-3).build()) .l2(0.00001) .activation(Activation.IDENTITY) .trainingWorkspaceMode(workspaceMode) .inferenceWorkspaceMode(workspaceMode) .build();

ComputationGraph model = new TransferLearning.GraphBuilder(graph) .fineTuneConfiguration(fineTuneConf) .addLayer(“outputs”, new Yolo2OutputLayer.Builder() .boundingBoxPriors(priors) .build(), “conv2d_9”) .setOutputs(“outputs”) .build();

System.out.println(model.summary(InputType.convolutional(416, 416, 3)));

ModelSerializer.writeModel(model, “tiny-yolo-voc_dl4j_inference.v1.zip”, false); }</pre>

The channels of the 416x416 input images need to be in RGB order (not BGR), with values normalized within [0, 1].

UNet

U-Net

An implementation of U-Net, a deep learning network for image segmentation in Deeplearning4j. The u-net is convolutional network architecture for fast and precise segmentation of images. Up to now it has outperformed the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.

VGG16

VGG19

Xception

U-Net

An implementation of Xception in Deeplearning4j. A novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions.

YOLO2

The channels of the 608x608 input images need to be in RGB order (not BGR), with values normalized within [0, 1].

pretrainedUrl

Default prior boxes for the model

Model Zoo

Prebuilt model architectures and weights for out-of-the-box application.

If you want to use the new model zoo, you will need to add it as a dependency. A Maven POM would add the following:

<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-zoo</artifactId>
    <version>1.0.0-M1.1</version>
</dependency>

Getting started

Initializing fresh configurations

You can instantly instantiate a model from the zoo using the .init() method. For example, if you want to instantiate a fresh, untrained network of AlexNet you can use the following code:

import org.deeplearning4j.zoo.model.AlexNet
import org.deeplearning4j.zoo.*;

...

int numberOfClassesInYourData = 1000;
int randomSeed = 123;

ZooModel zooModel = AlexNet.builder()
                .numClasses(numberOfClassesInYourData)
                .seed(randomSeed)
                .build();
Model net = zooModel.init();

If you want to tune parameters or change the optimization algorithm, you can obtain a reference to the underlying network configuration:

ZooModel zooModel = AlexNet.builder()
                .numClasses(numberOfClassesInYourData)
                .seed(randomSeed)
                .build();
MultiLayerConfiguration net = ((AlexNet) zooModel).conf();

Initializing pretrained weights

For example, you can initialize a VGG-16 model with ImageNet weights like so:

import org.deeplearning4j.zoo.model.VGG16;
import org.deeplearning4j.zoo.*;

...

ZooModel zooModel = VGG16.builder().build();;
Model net = zooModel.initPretrained(PretrainedType.IMAGENET);

And initialize another VGG16 model with weights trained on VGGFace:

ZooModel zooModel = VGG16.builder().build();
Model net = zooModel.initPretrained(PretrainedType.VGGFACE);

What's in the zoo?

The model zoo comes with well-known image recognition configurations in the deep learning community. The zoo also includes an LSTM for text generation, and a simple CNN for general image recognition.

You can find a complete list of models using this deeplearning4j-zoo Github link.

This includes ImageNet models such as VGG-16, ResNet-50, AlexNet, Inception-ResNet-v1, LeNet, and more.

Advanced usage

The zoo comes with a couple additional features if you're looking to use the models for different use cases.

Changing Inputs

Aside from passing certain configuration information to the constructor of a zoo model, you can also change its input shape using .setInputShape().

NOTE: this applies to fresh configurations only, and will not affect pretrained models:

int numberOfClassesInYourData = 10;
int randomSeed = 123;

ZooModel zooModel = ResNet50.builder()
        .numClasses(numberOfClassesInYourData)
        .seed(randomSeed)
        .build();
zooModel.setInputShape(new int[][]{{3, 28, 28}});

Transfer Learning

Pretrained models are perfect for transfer learning! You can read more about transfer learning using DL4J here.

Workspaces

Zoo Models

Available models

AlexNet

Dl4j’s AlexNet model interpretation based on the original paper ImageNet Classification with Deep Convolutional Neural Networks and the imagenetExample code referenced. References:

Model is built in dl4j based on available functionality and notes indicate where there are gaps waiting for enhancements.

Darknet19

Darknet19 Reference: ImageNet weights for this model are available and have been converted from using .

There are 2 pretrained models, one for 224x224 images and one fine-tuned for 448x448 images. Call setInputShape() with either {3, 224, 224} or {3, 448, 448} before initialization. The channels of the input images need to be in RGB order (not BGR), with values normalized within [0, 1]. The output labels are as per .

FaceNetNN4Small2

A variant of the original FaceNet model that relies on embeddings and triplet loss. Reference: Also based on the OpenFace implementation:

InceptionResNetV1

A variant of the original FaceNet model that relies on embeddings and triplet loss. Reference: Also based on the OpenFace implementation:

LeNet

LeNet was an early promising achiever on the ImageNet dataset. References:

MNIST weights for this model are available and have been converted from .

NASNet

This implementation uses 1056 penultimate filters and an input shape of (3, 224, 224). You can change this.

Paper: ImageNet weights for this model are available and have been converted from .

ResNet50

Residual networks for deep learning.

Paper: ImageNet weights for this model are available and have been converted from ;.

SimpleCNN

A simple convolutional network for generic image classification. Reference:

SqueezeNet

U-Net

An implementation of SqueezeNet. Touts similar accuracy to AlexNet with a fraction of the parameters.

Paper: ImageNet weights for this model are available and have been converted from .

TextGenerationLSTM

LSTM designed for text generation. Can be trained on a corpus of text. For this model, numClasses is

Architecture follows this implementation:

Walt Whitman weights are available for generating text from his works, adapted from .

TinyYOLO

Tiny YOLO Reference:

ImageNet+VOC weights for this model are available and have been converted from using and the following code.

String filename = “tiny-yolo-voc.h5”; ComputationGraph graph = KerasModelImport.importKerasModelAndWeights(filename, false); INDArray priors = Nd4j.create(priorBoxes);

System.out.println(model.summary(InputType.convolutional(416, 416, 3)));

ModelSerializer.writeModel(model, “tiny-yolo-voc_dl4j_inference.v1.zip”, false); }</pre>

The channels of the 416x416 input images need to be in RGB order (not BGR), with values normalized within [0, 1].

UNet

U-Net

Paper: Weights are available for image segmentation trained on a synthetic dataset

VGG16

VGG-16, from Very Deep Convolutional Networks for Large-Scale Image Recognition

Deep Face Recognition

ImageNet weights for this model are available and have been converted from . CIFAR-10 weights for this model are available and have been converted using “approach 2” from . VGGFace weights for this model are available and have been converted from .

VGG19

VGG-19, from Very Deep Convolutional Networks for Large-Scale Image Recognition ImageNet weights for this model are available and have been converted from .

Xception

U-Net

Paper: ImageNet weights for this model are available and have been converted from .

YOLO2

YOLOv2 Reference:

ImageNet+COCO weights for this model are available and have been converted from using and the following code.

String filename = “yolo.h5”; 
KerasLayer.registerCustomLayer(“Lambda”, KerasSpaceToDepth.class); 
ComputationGraph graph = KerasModelImport.importKerasModelAndWeights(filename, false);
INDArray priors = Nd4j.create(priorBoxes);
FineTuneConfiguration fineTuneConf = new FineTuneConfiguration.Builder()
 .seed(seed)
 .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
 .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer)
 .gradientNormalizationThreshold(1.0)
 .updater(new Adam.Builder().learningRate(1e-3).build())
 .l2(0.00001)
 .activation(Activation.IDENTITY)
 .trainingWorkspaceMode(workspaceMode)
 .inferenceWorkspaceMode(workspaceMode)
 .build();
ComputationGraph model = new TransferLearning.GraphBuilder(graph)
 .fineTuneConfiguration(fineTuneConf) 
 .addLayer(“outputs”, new Yolo2OutputLayer.Builder() 
                      .boundingBoxPriors(priors)
                      .build(), “conv2d_23”)
 .setOutputs(“outputs”)
 .build();
System.out.println(model.summary(InputType.convolutional(608, 608, 3)));
ModelSerializer.writeModel(model, “yolo2_dl4j_inference.v1.zip”, false); }

The channels of the 608x608 input images need to be in RGB order (not BGR), with values normalized within [0, 1].

pretrainedUrl

public String pretrainedUrl(PretrainedType pretrainedType)

Default prior boxes for the model