Quickstart with MNIST
Deeplearning4j - also known as “DL4J” - is a high-performance domain-specific language for configuring deep neural networks, which are made of multiple layers. Deeplearning4j is open source, written in C++, Java, Scala, and Kotlin, and maintained by the Eclipse Foundation and community contributors.
If you are having difficulty, we recommend you join our community forums. There you can request help and give feedback, but please use this guide before asking questions we’ve already answered below. If you are new to deep learning, we’ve included a road map for beginners with links to courses, readings, and other resources. For a longer and more detailed version of this guide, please visit the Deeplearning4j Getting Started guide.
In this quickstart, you will create a deep neural network using Deeplearning4j and train a model capable of classifying handwritten digits. While handwriting recognition has been attempted with different machine learning algorithms over the years, deep learning performs remarkably well, achieving over 99.7% accuracy on the MNIST dataset. For this tutorial, we will classify digits in EMNIST, the “next generation” of MNIST and a larger dataset. Along the way, you will learn how to:
Load a dataset for a neural network.
Format EMNIST for image recognition.
Create a deep neural network.
Train a model.
Evaluate the performance of your model.
As in most programming languages, you need to explicitly import the classes you want to use into scope. Below, we will import common Deeplearning4j classes that will help us configure and train a neural network. The code below is written in Scala.
Note that we import methods from Scala’s JavaConversions, because this allows us to use native Scala collections while maintaining compatibility with Deeplearning4j’s Java collections.
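The imports below are a minimal sketch of what this tutorial needs. The package paths reflect the 1.0.0-beta releases of DL4J and ND4J and may differ slightly in other versions.

```scala
import org.deeplearning4j.datasets.iterator.impl.EmnistDataSetIterator
import org.deeplearning4j.nn.conf.NeuralNetConfiguration
import org.deeplearning4j.nn.conf.layers.{DenseLayer, OutputLayer}
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork
import org.deeplearning4j.nn.weights.WeightInit
import org.deeplearning4j.optimize.listeners.ScoreIterationListener
import org.nd4j.evaluation.classification.{Evaluation, ROCMultiClass}
import org.nd4j.linalg.activations.Activation
import org.nd4j.linalg.learning.config.Adam
import org.nd4j.linalg.lossfunctions.LossFunctions

// lets us pass native Scala collections where DL4J expects Java ones
import scala.collection.JavaConversions._
```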
Dataset iterators are important pieces of code that help batch and iterate across your dataset for training and inferring with neural networks. Deeplearning4j comes with a built-in implementation of BaseDatasetIterator for EMNIST known as EmnistDataSetIterator. This particular iterator is a convenience utility that handles the downloading and preparation of data.
Note that we create two different iterators below: one for training data, and the other for evaluating the accuracy of our model after training. The last boolean parameter in the constructor indicates whether we are instantiating the iterator for the training set (true) or the test set (false).
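A minimal sketch, assuming the BALANCED split of EMNIST and an arbitrary batch size of 16:

```scala
val batchSize = 16 // number of examples to fetch per training step

// EMNIST ships in several splits; BALANCED has 47 classes with an
// equal number of samples per class
val emnistSet = EmnistDataSetIterator.Set.BALANCED

// the final boolean selects the training (true) or test (false) set
val emnistTrain = new EmnistDataSetIterator(emnistSet, batchSize, true)
val emnistTest = new EmnistDataSetIterator(emnistSet, batchSize, false)
```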
While you won’t need it for this tutorial, you can learn more about loading data for neural networks in this ETL user guide. DL4J comes with many record readers that can load data from CSVs, images, video, audio, and sequences and convert it into ND-Arrays.
For any neural network you build in Deeplearning4j, the foundation is the NeuralNetConfiguration class. This is where you configure hyperparameters: the quantities that define the architecture of the network and how the algorithm learns. Intuitively, each hyperparameter is like one ingredient in a meal: the meal can go very right or very wrong. Luckily, you can adjust hyperparameters if they don’t produce the right results.
The list() method specifies the number of layers in the net; this function replicates your configuration n times and builds a layerwise configuration.
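Here is one way such a configuration might look for EMNIST: a single hidden DenseLayer feeding a softmax OutputLayer. The hidden-layer size, updater, and regularization values are illustrative choices rather than prescribed ones, and EmnistDataSetIterator.numLabels is used to look up the class count for the chosen split.

```scala
val outputNum = EmnistDataSetIterator.numLabels(emnistSet) // number of classes
val rngSeed = 123   // fixed random seed for reproducibility
val numRows = 28    // height of an EMNIST image in pixels
val numColumns = 28 // width of an EMNIST image in pixels

val conf = new NeuralNetConfiguration.Builder()
  .seed(rngSeed)
  .updater(new Adam()) // optimizer used to update the weights
  .l2(1e-4)            // L2 regularization to discourage large weights
  .list()
  .layer(0, new DenseLayer.Builder()
    .nIn(numRows * numColumns) // one input per pixel
    .nOut(1000)                // size of the hidden layer
    .activation(Activation.RELU)
    .weightInit(WeightInit.XAVIER)
    .build())
  .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
    .nIn(1000)
    .nOut(outputNum)           // one output per class
    .activation(Activation.SOFTMAX)
    .weightInit(WeightInit.XAVIER)
    .build())
  .build()
```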
Each node in a hidden layer represents a feature of a handwritten digit in the MNIST dataset. For example, imagine you are looking at the number 6. One node may represent rounded edges, another node may represent the intersection of curly lines, and so on. Such features are weighted by importance by the model’s coefficients and recombined in each hidden layer to help predict whether the handwritten number is indeed a 6. The more layers of nodes you have, the more complexity and nuance they can capture to make better predictions.
You could think of a layer as “hidden” because, while you can see the input entering a net, and the decisions coming out, it’s difficult for humans to decipher how and why a neural net processes data on the inside. The parameters of a neural net model are simply long vectors of numbers, readable by machines.
Now that we’ve built a NeuralNetConfiguration, we can use the configuration to instantiate a MultiLayerNetwork. When we call the init() method on the network, it applies the chosen weight initialization across the network and allows us to pass data to train on. If we want to see the loss score during training, we can also pass a listener to the network.
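For example, building on the conf from above (ScoreIterationListener is DL4J’s built-in listener for printing the loss score):

```scala
// create and initialize the network from the configuration above
val network = new MultiLayerNetwork(conf)
network.init()

// print the loss score every 10 parameter updates
network.setListeners(new ScoreIterationListener(10))
```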
An instantiated model has a fit() method that accepts a dataset iterator (an iterator that extends BaseDatasetIterator), a single DataSet, or an ND-Array (an implementation of INDArray). Since our EMNIST iterator already extends the iterator base class, we can pass it directly to fit(). If we want to train for multiple epochs, we can pass the total number of epochs as the second argument to fit().
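A sketch of the training call, using the two-argument overload of fit() available in recent DL4J versions; the epoch count here is arbitrary:

```scala
// train the network, making two complete passes over the training set
network.fit(emnistTrain, 2)
```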
Deeplearning4j exposes several tools to evaluate the performance of a model. You can perform basic evaluation and get metrics such as precision and accuracy, or use a Receiver Operating Characteristic (ROC) curve. Note that the general ROC class works for binary classifiers, whereas ROCMultiClass is meant for multi-class classifiers such as the model we are building here.
A MultiLayerNetwork conveniently has a few methods built in to help us perform evaluation. You can pass a dataset iterator with your testing/validation data to an evaluate() method.
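A sketch of both kinds of evaluation, building on the network and emnistTest from above. The method signatures here reflect recent 1.0.0-beta releases; older versions return the same evaluation objects through slightly different signatures.

```scala
// basic evaluation: accuracy, precision, and recall
val eval: Evaluation = network.evaluate(emnistTest)
println(eval.accuracy())
println(eval.precision())
println(eval.recall())

// ROC evaluation for a multi-class model
// (0 threshold steps requests an exact, non-binned AUC calculation)
val roc: ROCMultiClass = network.evaluateROCMultiClass(emnistTest, 0)
println(roc.calculateAverageAUC()) // AUC averaged over all classes
println(roc.calculateAUC(0))       // AUC for class index 0 only
```

The Evaluation object also has a stats() method that prints a full summary, including the confusion matrix.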
Now that you’ve learned how to get started and train your first model, head to the Deeplearning4j website to see all the other tutorials that await you. Learn how to build dataset iterators, train a facial recognition network like FaceNet, and more.