Deep learning is the de facto standard for face recognition. In 2015, Google researchers published FaceNet: A Unified Embedding for Face Recognition and Clustering, which set a new accuracy record of 99.63% on the LFW dataset. An important aspect of FaceNet is that it made face recognition more practical: the network learns a mapping from face images to a compact Euclidean embedding space (you input an image and get back a small 1D array from the network). FaceNet was an adapted version of an Inception-style network.
Around the same time FaceNet was being developed, other research groups were making significant advances in facial recognition. DeepID3, for example, achieved impressive results, and Oxford's Visual Geometry Group published Deep Face Recognition. Note that the Deep Face Recognition paper includes a comparison of previous papers, which highlights one key factor behind FaceNet's performance: the number of images used to train the network, 200 million.
Introducing center loss
FaceNet is difficult to train, partially because of how it uses triplet loss. This required exotic architectures that either set up three models in tandem, or stacked examples and unstacked them with additional nodes to calculate a loss based on Euclidean similarity (see the formula below). A Discriminative Feature Learning Approach for Deep Face Recognition introduced center loss, a promising technique that adds an intraclass component to the training loss function.
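For context, the triplet loss used by FaceNet pulls an anchor embedding f(x^a) toward a positive example f(x^p) of the same identity while pushing it away from a negative example f(x^n) of a different identity by at least a margin alpha, which is why every training step needs a coordinated triplet of examples:

L = \sum_{i=1}^{N} \left[ \, \lVert f(x_i^a) - f(x_i^p) \rVert_2^2 - \lVert f(x_i^a) - f(x_i^n) \rVert_2^2 + \alpha \, \right]_+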
The advantage of training embeddings with center loss is that an exotic architecture is no longer required. In addition, because hardware is better utilized, training the embeddings takes much less time. One important distinction between center loss and a triplet loss architecture is that a center loss layer stores its own parameters, which are used to compute the intraclass "center" of all examples for each label (see the formulation below).
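Concretely, the center loss paper defines the training objective as the usual softmax loss L_S plus a weighted intraclass term L_C, where c_{y_i} is the stored center for the label y_i of example x_i and m is the mini-batch size:

L = L_S + \lambda L_C, \qquad L_C = \frac{1}{2} \sum_{i=1}^{m} \lVert x_i - c_{y_i} \rVert_2^2

The \lambda weight and the center update rate correspond to the .lambda(1e-4) and .alpha(0.9) settings on the CenterLossOutputLayer in the configuration further down.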
What are we going to learn in this tutorial?
Using Deeplearning4j, you will learn how to train embeddings for facial recognition and transfer the parameters to a new network that uses the embeddings for feed-forward inference. The network will be built using the ComputationGraph API (Inception-type networks require multiple nodes) via the OpenFace NN4.Small2 variant, which is a hand-tuned, parameter-minimized model of FaceNet.
Because Inception networks are large, we will use the Deeplearning4j model zoo to help build our network.
We are using a minified version of the full FaceNet network to reduce the hardware requirements. Below, we use the FaceNetHelper class for some of the Inception blocks, whose parameters are unchanged from the larger version.
val batchSize = 48 // depending on your hardware, you will want to increase or decrease
val numExamples = LFWLoader.NUM_IMAGES
val outputNum = LFWLoader.NUM_LABELS // number of "identities" in the dataset
val splitTrainTest = 1.0
val randomSeed = 123
val iterations = 1 // this is almost always 1
val transferFunction = Activation.RELU
val inputShape = Array[Int](3, 96, 96)

// val zooModel = new FaceNetNN4Small2(outputNum, randomSeed, iterations)
// val net = zooModel.init().asInstanceOf[ComputationGraph]

def graphConf(): ComputationGraphConfiguration = {
  val embeddingSize = 128

  val graph = new NeuralNetConfiguration.Builder()
    .seed(randomSeed)
    .activation(Activation.IDENTITY)
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
    .updater(new Adam(0.1, 0.9, 0.999, 0.01))
    .weightInit(WeightInit.RELU)
    .l2(5e-5)
    .convolutionMode(ConvolutionMode.Same)
    .inferenceWorkspaceMode(WorkspaceMode.SEPARATE)
    .trainingWorkspaceMode(WorkspaceMode.SEPARATE)
    .graphBuilder

  graph
    .addInputs("input1")
    .addLayer("stem-cnn1", new ConvolutionLayer.Builder(Array[Int](7, 7), Array[Int](2, 2), Array[Int](3, 3)).nIn(inputShape(0)).nOut(64).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "input1")
    .addLayer("stem-batch1", new BatchNormalization.Builder(false).nIn(64).nOut(64).build, "stem-cnn1")
    .addLayer("stem-activation1", new ActivationLayer.Builder().activation(Activation.RELU).build, "stem-batch1")
    .addLayer("stem-pool1", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX, Array[Int](3, 3), Array[Int](2, 2), Array[Int](1, 1)).build, "stem-activation1")
    .addLayer("stem-lrn1", new LocalResponseNormalization.Builder(1, 5, 1e-4, 0.75).build, "stem-pool1")
    .addLayer("inception-2-cnn1", new ConvolutionLayer.Builder(1, 1).nIn(64).nOut(64).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "stem-lrn1")
    .addLayer("inception-2-batch1", new BatchNormalization.Builder(false).nIn(64).nOut(64).build, "inception-2-cnn1")
    .addLayer("inception-2-activation1", new ActivationLayer.Builder().activation(Activation.RELU).build, "inception-2-batch1")
    .addLayer("inception-2-cnn2", new ConvolutionLayer.Builder(Array[Int](3, 3), Array[Int](1, 1), Array[Int](1, 1)).nIn(64).nOut(192).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-2-activation1")
    .addLayer("inception-2-batch2", new BatchNormalization.Builder(false).nIn(192).nOut(192).build, "inception-2-cnn2")
    .addLayer("inception-2-activation2", new ActivationLayer.Builder().activation(Activation.RELU).build, "inception-2-batch2")
    .addLayer("inception-2-lrn1", new LocalResponseNormalization.Builder(1, 5, 1e-4, 0.75).build, "inception-2-activation2")
    .addLayer("inception-2-pool1", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX, Array[Int](3, 3), Array[Int](2, 2), Array[Int](1, 1)).build, "inception-2-lrn1")

  // Inception 3a
  FaceNetHelper.appendGraph(graph, "3a", 192, Array[Int](3, 5), Array[Int](1, 1), Array[Int](128, 32), Array[Int](96, 16, 32, 64), SubsamplingLayer.PoolingType.MAX, transferFunction, "inception-2-pool1")

  // Inception 3b
  FaceNetHelper.appendGraph(graph, "3b", 256, Array[Int](3, 5), Array[Int](1, 1), Array[Int](128, 64), Array[Int](96, 32, 64, 64), SubsamplingLayer.PoolingType.PNORM, 2, transferFunction, "inception-3a")

  // Inception 3c
  graph
    .addLayer("3c-1x1", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(320).nOut(128).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-3b")
    .addLayer("3c-1x1-norm", FaceNetHelper.batchNorm(128, 128), "3c-1x1")
    .addLayer("3c-transfer1", new ActivationLayer.Builder().activation(transferFunction).build, "3c-1x1-norm")
    .addLayer("3c-3x3", new ConvolutionLayer.Builder(Array[Int](3, 3), Array[Int](2, 2)).nIn(128).nOut(256).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "3c-transfer1")
    .addLayer("3c-3x3-norm", FaceNetHelper.batchNorm(256, 256), "3c-3x3")
    .addLayer("3c-transfer2", new ActivationLayer.Builder().activation(transferFunction).build, "3c-3x3-norm")
    .addLayer("3c-2-1x1", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(320).nOut(32).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-3b")
    .addLayer("3c-2-1x1-norm", FaceNetHelper.batchNorm(32, 32), "3c-2-1x1")
    .addLayer("3c-2-transfer3", new ActivationLayer.Builder().activation(transferFunction).build, "3c-2-1x1-norm")
    .addLayer("3c-2-5x5", new ConvolutionLayer.Builder(Array[Int](3, 3), Array[Int](2, 2)).nIn(32).nOut(64).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "3c-2-transfer3")
    .addLayer("3c-2-5x5-norm", FaceNetHelper.batchNorm(64, 64), "3c-2-5x5")
    .addLayer("3c-2-transfer4", new ActivationLayer.Builder().activation(transferFunction).build, "3c-2-5x5-norm")
    .addLayer("3c-pool", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX, Array[Int](3, 3), Array[Int](2, 2), Array[Int](1, 1)).build, "inception-3b")
    .addVertex("inception-3c", new MergeVertex, "3c-transfer2", "3c-2-transfer4", "3c-pool")

  // Inception 4a
  FaceNetHelper.appendGraph(graph, "4a", 640, Array[Int](3, 5), Array[Int](1, 1), Array[Int](192, 64), Array[Int](96, 32, 128, 256), SubsamplingLayer.PoolingType.PNORM, 2, transferFunction, "inception-3c")

  // Inception 4e
  graph
    .addLayer("4e-1x1", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(640).nOut(160).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-4a")
    .addLayer("4e-1x1-norm", FaceNetHelper.batchNorm(160, 160), "4e-1x1")
    .addLayer("4e-transfer1", new ActivationLayer.Builder().activation(transferFunction).build, "4e-1x1-norm")
    .addLayer("4e-3x3", new ConvolutionLayer.Builder(Array[Int](3, 3), Array[Int](2, 2)).nIn(160).nOut(256).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "4e-transfer1")
    .addLayer("4e-3x3-norm", FaceNetHelper.batchNorm(256, 256), "4e-3x3")
    .addLayer("4e-transfer2", new ActivationLayer.Builder().activation(transferFunction).build, "4e-3x3-norm")
    .addLayer("4e-2-1x1", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(640).nOut(64).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-4a")
    .addLayer("4e-2-1x1-norm", FaceNetHelper.batchNorm(64, 64), "4e-2-1x1")
    .addLayer("4e-2-transfer3", new ActivationLayer.Builder().activation(transferFunction).build, "4e-2-1x1-norm")
    .addLayer("4e-2-5x5", new ConvolutionLayer.Builder(Array[Int](3, 3), Array[Int](2, 2)).nIn(64).nOut(128).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "4e-2-transfer3")
    .addLayer("4e-2-5x5-norm", FaceNetHelper.batchNorm(128, 128), "4e-2-5x5")
    .addLayer("4e-2-transfer4", new ActivationLayer.Builder().activation(transferFunction).build, "4e-2-5x5-norm")
    .addLayer("4e-pool", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX, Array[Int](3, 3), Array[Int](2, 2), Array[Int](1, 1)).build, "inception-4a")
    .addVertex("inception-4e", new MergeVertex, "4e-transfer2", "4e-2-transfer4", "4e-pool")

  // Inception 5a
  graph
    .addLayer("5a-1x1", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(1024).nOut(256).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-4e")
    .addLayer("5a-1x1-norm", FaceNetHelper.batchNorm(256, 256), "5a-1x1")
    .addLayer("5a-transfer1", new ActivationLayer.Builder().activation(transferFunction).build, "5a-1x1-norm")
    .addLayer("5a-2-1x1", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(1024).nOut(96).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-4e")
    .addLayer("5a-2-1x1-norm", FaceNetHelper.batchNorm(96, 96), "5a-2-1x1")
    .addLayer("5a-2-transfer2", new ActivationLayer.Builder().activation(transferFunction).build, "5a-2-1x1-norm")
    .addLayer("5a-2-3x3", new ConvolutionLayer.Builder(Array[Int](3, 3), Array[Int](1, 1)).nIn(96).nOut(384).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "5a-2-transfer2")
    .addLayer("5a-2-3x3-norm", FaceNetHelper.batchNorm(384, 384), "5a-2-3x3")
    .addLayer("5a-transfer3", new ActivationLayer.Builder().activation(transferFunction).build, "5a-2-3x3-norm")
    .addLayer("5a-3-pool", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.PNORM, Array[Int](3, 3), Array[Int](1, 1)).pnorm(2).build, "inception-4e")
    .addLayer("5a-3-1x1reduce", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(1024).nOut(96).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "5a-3-pool")
    .addLayer("5a-3-1x1reduce-norm", FaceNetHelper.batchNorm(96, 96), "5a-3-1x1reduce")
    .addLayer("5a-3-transfer4", new ActivationLayer.Builder().activation(Activation.RELU).build, "5a-3-1x1reduce-norm")
    .addVertex("inception-5a", new MergeVertex, "5a-transfer1", "5a-transfer3", "5a-3-transfer4")

  // Inception 5b
  graph
    .addLayer("5b-1x1", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(736).nOut(256).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-5a")
    .addLayer("5b-1x1-norm", FaceNetHelper.batchNorm(256, 256), "5b-1x1")
    .addLayer("5b-transfer1", new ActivationLayer.Builder().activation(transferFunction).build, "5b-1x1-norm")
    .addLayer("5b-2-1x1", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(736).nOut(96).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "inception-5a")
    .addLayer("5b-2-1x1-norm", FaceNetHelper.batchNorm(96, 96), "5b-2-1x1")
    .addLayer("5b-2-transfer2", new ActivationLayer.Builder().activation(transferFunction).build, "5b-2-1x1-norm")
    .addLayer("5b-2-3x3", new ConvolutionLayer.Builder(Array[Int](3, 3), Array[Int](1, 1)).nIn(96).nOut(384).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "5b-2-transfer2")
    .addLayer("5b-2-3x3-norm", FaceNetHelper.batchNorm(384, 384), "5b-2-3x3")
    .addLayer("5b-2-transfer3", new ActivationLayer.Builder().activation(transferFunction).build, "5b-2-3x3-norm")
    .addLayer("5b-3-pool", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX, Array[Int](3, 3), Array[Int](1, 1), Array[Int](1, 1)).build, "inception-5a")
    .addLayer("5b-3-1x1reduce", new ConvolutionLayer.Builder(Array[Int](1, 1), Array[Int](1, 1)).nIn(736).nOut(96).cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE).build, "5b-3-pool")
    .addLayer("5b-3-1x1reduce-norm", FaceNetHelper.batchNorm(96, 96), "5b-3-1x1reduce")
    .addLayer("5b-3-transfer4", new ActivationLayer.Builder().activation(transferFunction).build, "5b-3-1x1reduce-norm")
    .addVertex("inception-5b", new MergeVertex, "5b-transfer1", "5b-2-transfer3", "5b-3-transfer4")

  // output
  graph
    .addLayer("avgpool", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.AVG, Array[Int](3, 3), Array[Int](3, 3)).build, "inception-5b")
    .addLayer("bottleneck", new DenseLayer.Builder().nIn(736).nOut(embeddingSize).activation(Activation.IDENTITY).build, "avgpool")
    .addVertex("embeddings", new L2NormalizeVertex(Array[Int](), 1e-6), "bottleneck")
    .addLayer("lossLayer", new CenterLossOutputLayer.Builder()
      .lossFunction(LossFunctions.LossFunction.SQUARED_LOSS)
      .activation(Activation.SOFTMAX)
      .nIn(128).nOut(outputNum)
      .lambda(1e-4).alpha(0.9)
      .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer)
      .build, "embeddings")
    .setOutputs("lossLayer")
    .setInputTypes(InputType.convolutional(inputShape(2), inputShape(1), inputShape(0)))

  graph.build
}

val net = new ComputationGraph(graphConf())
net.init()
net.setListeners(new ScoreIterationListener(1))
Print the configuration
To confirm that center loss is already part of the model configuration, you can print a string table of all layers in the network. Use the summary() method to get a complete summary of all layers and parameters. You'll see that our network has over 5 million parameters; that is still quite low compared to advanced ImageNet configurations, but it will still be taxing on your hardware.
println(net.summary())
Using the LFW iterator
The LFWDataSetIterator, like most of the Deeplearning4j built-in iterators, implements the DataSetIterator interface. This API allows for the simple instantiation of datasets and automatic downloading of data in the background. If you are unfamiliar with using DL4J's built-in iterators, there's a tutorial describing their usage.
val inputWHC = Array[Int](inputShape(2), inputShape(1), inputShape(0))
val iter = new LFWDataSetIterator(batchSize, numExamples, inputWHC, outputNum, false, true, splitTrainTest, new Random(randomSeed))
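Note that with splitTrainTest = 1.0 every example goes into the training set. If you want the held-out test iterator referenced by the commented-out evaluation code in the training loop below, a minimal sketch is the following, assuming the two boolean flags in this constructor overload are useSubset and train (matching the call above); the 0.8 split is an arbitrary illustrative value:

// hold out 20% of LFW for evaluation (0.8 is an illustrative split, not a tuned one)
val split = 0.8
val trainIter = new LFWDataSetIterator(batchSize, numExamples, inputWHC, outputNum, false, true, split, new Random(randomSeed))
val testIter = new LFWDataSetIterator(batchSize, numExamples, inputWHC, outputNum, false, false, split, new Random(randomSeed))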
Classifier training
With the network configuration set up and instantiated along with the LFW train/test iterators, training takes just a few lines of code. Since we have a labelled dataset and are using center loss, this is considered "classifier training" and is a supervised learning process. Earlier we attached a ScoreIterationListener to the model by using the setListeners() method. Its output is printed to the console since the internals of Deeplearning4j use SLF4J for logging.
After each epoch, we will evaluate how well the network is learning by using the evaluate() method. Although in this example we only use accuracy() and precision(), it is strongly recommended that you perform more advanced evaluation with ROC curves and examine the output of a confusion matrix (a sketch of this follows the training loop below).
val nEpochs = 30

(1 to nEpochs).foreach { epoch =>
  // training
  net.fit(iter)
  println("Epoch " + epoch + " complete")

  // here you will want to pass an iterator that contains your test set
  // val eval = net.evaluate[Evaluation](testIter)
  // println(s"""Accuracy: ${eval.accuracy()} | Precision: ${eval.precision()} | Recall: ${eval.recall()}""")
}
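As a sketch of the more advanced evaluation mentioned above, DL4J can compute per-class ROC statistics and print the raw confusion matrix. This assumes a testIter like the one sketched earlier; confirm evaluateROCMultiClass and confusionToString against your DL4J version's API:

// per-class ROC with macro-averaged AUC
val roc = net.evaluateROCMultiClass(testIter)
println("Average AUC: " + roc.calculateAverageAUC())

// full confusion matrix, to see which identities get confused with each other
testIter.reset()
val eval = net.evaluate[Evaluation](testIter)
println(eval.confusionToString())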
Transferring the parameters
Now that the network has been trained, using the embeddings requires removing the center loss output layer. Deeplearning4j has a native transfer learning API to assist.
// use the GraphBuilder when your network is a ComputationGraph
val snipped = new TransferLearning.GraphBuilder(net)
  .setFeatureExtractor("embeddings") // the L2Normalize vertex and layers below are frozen
  .removeVertexAndConnections("lossLayer")
  .setOutputs("embeddings")
  .build()

// grab a single example to test feed forward
val ds = iter.next()

// when you forward a batch of examples ("faces") through the graph, you'll get a compressed representation as a result
val embedding = snipped.feedForward(ds.getFeatures(), false)
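With the loss layer removed, a typical use of the embeddings is face verification: vectors for the same person should sit close together in the normalized Euclidean space. A minimal sketch using ND4J's Transforms helpers (any decision threshold on the distance would need to be calibrated on your own validation data):

import org.nd4j.linalg.api.ndarray.INDArray
import org.nd4j.linalg.ops.transforms.Transforms

// feedForward returns a map of activations keyed by layer/vertex name;
// the "embeddings" entry holds the L2-normalized 128-dimensional vectors
val embeddings: INDArray = embedding.get("embeddings")

// compare the first two faces in the batch by Euclidean distance:
// a smaller distance suggests the two images show the same identity
val a = embeddings.getRow(0)
val b = embeddings.getRow(1)
println("Distance: " + Transforms.euclideanDistance(a, b))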