DL4J's interpretation of the AlexNet model, based on the original paper, ImageNet Classification with Deep Convolutional Neural Networks, and the referenced imagenetExample code.
References:
The model is built in DL4J using the functionality currently available; notes indicate where gaps remain pending enhancements.
Bias initialization in the paper is 1 in certain layers but 0.1 in the imagenetExample code.
Weight distribution uses 0.01 std for all layers in the paper but 0.005 in the dense layers in the imagenetExample code.
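As a rough illustration of what these initialization settings mean, the sketch below draws weights from a zero-mean Gaussian with a configurable standard deviation and fills biases with a constant. It uses the imagenetExample values quoted above (0.005 std for a dense layer, 0.1 bias). The class `InitSketch`, its method names, the layer size of 4096, and the seed are hypothetical; this is plain Java for illustration, not DL4J API.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration of Gaussian weight and constant bias initialization.
// Not part of DL4J.
class InitSketch {

    // Draw n weights from a zero-mean Gaussian with the given standard deviation.
    static double[] gaussianWeights(int n, double std, long seed) {
        Random rng = new Random(seed);
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = rng.nextGaussian() * std;
        }
        return w;
    }

    // Fill n biases with a single constant value.
    static double[] constantBias(int n, double value) {
        double[] b = new double[n];
        Arrays.fill(b, value);
        return b;
    }

    public static void main(String[] args) {
        // Values quoted for the imagenetExample code: 0.005 std (dense layers), 0.1 bias.
        double[] w = gaussianWeights(4096, 0.005, 42L);
        double[] b = constantBias(4096, 0.1);
        System.out.println("first weight = " + w[0] + ", first bias = " + b[0]);
    }
}
```

Swapping in a different `std` or bias constant is enough to compare the two sets of settings side by side.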
Darknet19
Reference:
ImageNet weights for this model are available and have been converted from using .
There are 2 pretrained models, one for 224x224 images and one fine-tuned for 448x448 images. Call setInputShape() with either {3, 224, 224} or {3, 448, 448} before initialization. The channels of the input images need to be in RGB order (not BGR), with values normalized within [0, 1]. The output labels are as per .
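The input requirement above (RGB channel order, values in [0, 1]) can be sketched in plain Java. The helper below is a minimal illustration, assuming the pixels arrive as packed 0xAARRGGBB integers in row-major order, such as those returned by `BufferedImage.getRGB`. The class `RgbInputSketch` and the method `toChwRgb` are hypothetical names, not DL4J API, and resizing to 224x224 or 448x448 is left out.

```java
// Hypothetical helper showing the expected input layout; not part of DL4J.
class RgbInputSketch {

    // Convert packed 0xAARRGGBB pixels (row-major) into a [3][height][width]
    // array in R, G, B channel order, with each value scaled to [0, 1].
    static float[][][] toChwRgb(int[] argb, int height, int width) {
        if (argb.length != height * width) {
            throw new IllegalArgumentException("pixel count does not match height * width");
        }
        float[][][] out = new float[3][height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int p = argb[y * width + x];
                out[0][y][x] = ((p >> 16) & 0xFF) / 255f; // red
                out[1][y][x] = ((p >> 8) & 0xFF) / 255f;  // green
                out[2][y][x] = (p & 0xFF) / 255f;         // blue
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A 1x2 image: one orange pixel and one blue pixel.
        int[] pixels = {0xFFFF8000, 0xFF0000FF};
        float[][][] chw = toChwRgb(pixels, 1, 2);
        System.out.println("R=" + chw[0][0][0] + " G=" + chw[1][0][0] + " B=" + chw[2][0][0]);
    }
}
```

If the source data is in BGR order instead, the red and blue assignments would need to be swapped before the array is passed to the model.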
FaceNetNN4Small2
A variant of the original FaceNet model that relies on embeddings and triplet loss.
Reference:
Also based on the OpenFace implementation:
InceptionResNetV1
A variant of the original FaceNet model that relies on embeddings and triplet loss.
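For readers unfamiliar with the term, triplet loss in its usual form compares an anchor embedding with a positive (same identity) and a negative (different identity) embedding: max(0, ||a - p||^2 - ||a - n||^2 + margin). The sketch below is a minimal plain-Java illustration of that formula. It is not taken from the DL4J model code; the class name `TripletLossSketch`, the two-dimensional embeddings, and the margin of 0.2 are assumptions chosen for the example.

```java
// Hypothetical illustration of the standard triplet loss formula; not DL4J API.
class TripletLossSketch {

    // Squared Euclidean distance between two embeddings of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Loss is zero once the negative is farther from the anchor than the
    // positive by at least the margin.
    static double tripletLoss(double[] anchor, double[] positive,
                              double[] negative, double margin) {
        double dPos = squaredDistance(anchor, positive);
        double dNeg = squaredDistance(anchor, negative);
        return Math.max(0.0, dPos - dNeg + margin);
    }

    public static void main(String[] args) {
        double[] anchor = {1.0, 0.0};
        double[] positive = {0.9, 0.1};
        double[] negative = {0.0, 1.0};
        System.out.println("loss = " + tripletLoss(anchor, positive, negative, 0.2));
    }
}
```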
Reference:
Also based on the OpenFace implementation:
LeNet
LeNet is an early convolutional neural network, best known for handwritten digit recognition.
References:
MNIST weights for this model are available and have been converted from .
NASNet
Implementation of NASNet-A in Deeplearning4j. NASNet refers to Neural Architecture Search Network, a family of models that were designed automatically by learning the model architectures directly on the dataset of interest.
This implementation uses 1056 penultimate filters and an input shape of (3, 224, 224). Both settings can be changed.
Paper:
ImageNet weights for this model are available and have been converted from .
ResNet50
Residual networks for deep learning.
Paper:
ImageNet weights for this model are available and have been converted from .
SimpleCNN
A simple convolutional network for generic image classification. Reference:
SqueezeNet
An implementation of SqueezeNet, which reports accuracy similar to AlexNet with a fraction of the parameters.
Paper:
ImageNet weights for this model are available and have been converted from .
TextGenerationLSTM
LSTM designed for text generation. Can be trained on a corpus of text. For this model, numClasses is
Architecture follows this implementation:
Walt Whitman weights are available for generating text from his works, adapted from .
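Generating text from a trained model of this kind typically means repeatedly sampling the next symbol from the network's output distribution. The sketch below shows only that sampling step in plain Java, as a minimal illustration. The class `SamplingSketch`, the three-symbol vocabulary, and the probabilities are hypothetical; the model itself and the way its output is obtained are not shown.

```java
import java.util.Random;

// Hypothetical illustration of sampling from an output distribution; not DL4J API.
class SamplingSketch {

    // Sample an index from a discrete probability distribution whose
    // entries are non-negative and sum to 1.
    static int sample(double[] probs, Random rng) {
        double r = rng.nextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cumulative += probs[i];
            if (r < cumulative) {
                return i;
            }
        }
        return probs.length - 1; // guard against floating-point rounding
    }

    public static void main(String[] args) {
        char[] vocabulary = {'a', 'b', 'c'};   // assumed vocabulary
        double[] probs = {0.2, 0.5, 0.3};      // assumed model output
        Random rng = new Random(12345L);
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            text.append(vocabulary[sample(probs, rng)]);
        }
        System.out.println(text);
    }
}
```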
TinyYOLO
Tiny YOLO
Reference:
ImageNet+VOC weights for this model are available and have been converted from using and the following code.
    ComputationGraph model = new TransferLearning.GraphBuilder(graph)
            .fineTuneConfiguration(fineTuneConf)
            .addLayer("outputs", new Yolo2OutputLayer.Builder()
                    .boundingBoxPriors(priors)
                    .build(), "conv2d_9")
            .setOutputs("outputs")
            .build();
The channels of the 416x416 input images need to be in RGB order (not BGR), with values normalized within [0, 1].
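To relate the input size to the detector's output, the sketch below computes the output grid size and converts a bounding box prior from grid-cell units to pixels. It assumes a total downsampling factor of 32 for this architecture (so a 416x416 input gives a 13x13 grid) and assumes priors are expressed in grid-cell units. The class `YoloGridSketch`, its methods, and the prior values are hypothetical and not DL4J API.

```java
// Hypothetical illustration of YOLO grid arithmetic; not part of DL4J.
class YoloGridSketch {

    // Number of grid cells along one side for a square input.
    // The stride (total downsampling factor) is an assumption, typically 32.
    static int gridSize(int inputSize, int stride) {
        return inputSize / stride;
    }

    // Convert a [width, height] prior given in grid-cell units to pixels.
    static double[] priorToPixels(double[] priorInCells, int stride) {
        return new double[] {priorInCells[0] * stride, priorInCells[1] * stride};
    }

    public static void main(String[] args) {
        int stride = 32;                 // assumed downsampling factor
        double[] prior = {2.0, 3.5};     // made-up prior, in grid cells
        double[] px = priorToPixels(prior, stride);
        System.out.println("grid = " + gridSize(416, stride)
                + ", prior in pixels = " + px[0] + " x " + px[1]);
    }
}
```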
UNet
U-Net
An implementation of U-Net, a deep learning network for image segmentation, in Deeplearning4j. U-Net is a convolutional network architecture for fast and precise segmentation of images. It has outperformed the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Paper:
Weights are available for image segmentation trained on a synthetic dataset.
VGG16
VGG-16, from Very Deep Convolutional Networks for Large-Scale Image Recognition
Deep Face Recognition
ImageNet weights for this model are available and have been converted from . CIFAR-10 weights for this model are available and have been converted using “approach 2” from . VGGFace weights for this model are available and have been converted from .
VGG19
VGG-19, from Very Deep Convolutional Networks for Large-Scale Image Recognition
ImageNet weights for this model are available and have been converted from .
Xception
An implementation of Xception in Deeplearning4j. Xception is a deep convolutional neural network architecture inspired by Inception, in which the Inception modules are replaced with depthwise separable convolutions.
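A quick way to see why depthwise separable convolutions are attractive is to compare weight counts. A standard k x k convolution needs k*k*Cin*Cout weights, while a depthwise separable one needs k*k*Cin (depthwise) plus Cin*Cout (1x1 pointwise). The sketch below works through that arithmetic; the class `SeparableConvSketch` and the layer sizes are hypothetical, and biases are ignored.

```java
// Hypothetical illustration of parameter counts; not part of DL4J.
class SeparableConvSketch {

    // Weights in a standard k x k convolution (biases ignored).
    static long standardParams(int k, int cin, int cout) {
        return (long) k * k * cin * cout;
    }

    // Weights in a depthwise separable convolution: one k x k filter per
    // input channel, followed by a 1x1 pointwise convolution (biases ignored).
    static long separableParams(int k, int cin, int cout) {
        return (long) k * k * cin + (long) cin * cout;
    }

    public static void main(String[] args) {
        int k = 3, cin = 128, cout = 256; // made-up layer sizes
        System.out.println("standard  = " + standardParams(k, cin, cout));
        System.out.println("separable = " + separableParams(k, cin, cout));
    }
}
```

For these example sizes the separable form uses 33,920 weights against 294,912 for the standard convolution, roughly an 8.7x reduction.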
Paper:
ImageNet weights for this model are available and have been converted from .
YOLO2
YOLOv2
Reference:
ImageNet+COCO weights for this model are available and have been converted from using and the following code.
The channels of the 608x608 input images need to be in RGB order (not BGR), with values normalized within [0, 1].