Deeplearning4j
Community ForumND4J JavadocDL4J Javadoc
EN 1.0.0-beta7
EN 1.0.0-beta7
  • Eclipse DeepLearning4J
  • Getting Started
    • Quickstart
      • Untitled
    • Tutorials
      • Quickstart with MNIST
      • MultiLayerNetwork And ComputationGraph
      • Logistic Regression
      • Built-in Data Iterators
      • Feed Forward Networks
      • Basic Autoencoder
      • Advanced Autoencoder
      • Convolutional Networks
      • Recurrent Networks
      • Early Stopping
      • Layers and Preprocessors
      • Hyperparameter Optimization
      • Using Multiple GPUs
      • Clinical Time Series LSTM
      • Sea Temperature Convolutional LSTM
      • Sea Temperature Convolutional LSTM 2
      • Instacart Multitask Example
      • Instacart Single Task Example
      • Cloud Detection Example
    • Core Concepts
    • Cheat Sheet
    • Examples Tour
    • Deep Learning Beginners
    • Build from Source
    • Contribute
      • Eclipse Contributors
    • Benchmark Guide
    • About
    • Release Notes
  • Configuration
    • Backends
      • CPU and AVX
      • cuDNN
      • Performance Issues
    • Memory Management
      • Memory Workspaces
    • Snapshots
    • Maven
    • SBT, Gradle, & Others
  • Models
    • Autoencoders
    • Multilayer Network
    • Computation Graph
    • Convolutional Neural Network
    • Recurrent Neural Network
    • Layers
    • Vertices
    • Iterators
    • Listeners
    • Custom Layers
    • Model Persistence
    • Activations
    • Updaters
  • Model Zoo
    • Overview
    • Zoo Models
  • ND4J
    • Overview
    • Quickstart
    • Basics
    • Elementwise Operations
    • Matrix Manipulation
    • Syntax
    • Tensors
  • SAMEDIFF
    • Importing TensorFlow models
    • Variables
    • Ops
    • Adding Ops
  • ND4J & SameDiff Ops
    • Overview
    • Bitwise
    • Linalg
    • Math
    • Random
    • BaseOps
    • CNN
    • Image
    • Loss
    • NN
    • RNN
  • Tuning & Training
    • Evaluation
    • Visualization
    • Trouble Shooting
    • Early Stopping
    • t-SNE Visualization
    • Transfer Learning
  • Keras Import
    • Overview
    • Get Started
    • Supported Features
      • Activations
      • Losses
      • Regularizers
      • Initializers
      • Constraints
      • Optimizers
    • Functional Model
    • Sequential Model
    • Custom Layers
    • API Reference
      • Core Layers
      • Convolutional Layers
      • Embedding Layers
      • Local Layers
      • Noise Layers
      • Normalization Layers
      • Pooling Layers
      • Recurrent Layers
      • Wrapper Layers
      • Advanced Activations
  • DISTRIBUTED DEEP LEARNING
    • Introduction/Getting Started
    • Technical Explanation
    • Spark Guide
    • Spark Data Pipelines Guide
    • API Reference
    • Parameter Server
  • Arbiter
    • Overview
    • Layer Spaces
    • Parameter Spaces
  • Datavec
    • Overview
    • Records
    • Reductions
    • Schema
    • Serialization
    • Transforms
    • Analysis
    • Readers
    • Conditions
    • Executors
    • Filters
    • Operations
    • Normalization
    • Visualization
  • Language Processing
    • Overview
    • Word2Vec
    • Doc2Vec
    • Sentence Iteration
    • Tokenization
    • Vocabulary Cache
  • Mobile (Android)
    • Setup
    • Tutorial: First Steps
    • Tutorial: Classifier
    • Tutorial: Image Classifier
    • FAQ
    • Press
    • Support
    • Why Deep Learning?
Powered by GitBook
On this page
  • Imports
  • Download Data
  • DataSetIterators
  • Neural Network
  • Model Training
  • Model Evaluation

Was this helpful?

Edit on Git
Export as PDF
  1. Getting Started
  2. Tutorials

Instacart Single Task Example

PreviousInstacart Multitask ExampleNextCloud Detection Example

Last updated 5 years ago

Was this helpful?

This tutorial will be similar to the . The only difference is that we will not use multitasking to train our neural network. Recall the data originially comes from a Kaggle challenge (). We removed users that only made 1 order using the instacart app and then took 5000 users out of the remaining to be part of the data for this tutorial.

For each order, we have information on the product the user purchased. For example, there is information on the product name, what aisle it is found in, and the department it falls under. To construct features, we extracted indicators representing whether or not a user purchased a product in the given aisles for each order. In total there are 134 aisles. The targets were whether or not a user will buy a product in the breakfast department in the next order. As mentioned, we will not use any auxiliary targets.

Because of temporal dependencies within the data, we used a LSTM network for our model.

Imports

import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.Updater;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction;
import org.deeplearning4j.nn.conf.GradientNormalization;
import org.deeplearning4j.eval.ROC;
import org.datavec.api.records.reader.impl.csv.CSVSequenceRecordReader;
import org.datavec.api.records.reader.SequenceRecordReader;
import org.datavec.api.split.NumberedFileInputSplit;
import org.deeplearning4j.datasets.datavec.SequenceRecordReaderDataSetIterator;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.learning.config.Adam;

import java.io.File;
import java.net.URL;
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import org.apache.commons.io.FilenameUtils;
import org.apache.commons.io.FileUtils;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;

Download Data

To download the data, we will create a temporary directory that will store the data files, extract the tar.gz file from the url, and place it in the specified directory.

val DATA_URL = "https://dl4jdata.blob.core.windows.net/training/tutorials/instacart.tar.gz"
val DATA_PATH = FilenameUtils.concat(System.getProperty("java.io.tmpdir"), "dl4j_instacart/")
val directory = new File(DATA_PATH)
directory.mkdir() 

val archizePath = DATA_PATH + "instacart.tar.gz"
val archiveFile = new File(archizePath)
val extractedPath = DATA_PATH + "instacart" 
val extractedFile = new File(extractedPath)

FileUtils.copyURLToFile(new URL(DATA_URL), archiveFile) 

We will then extract the data from the tar.gz file, recreate directories within the tar.gz file into our temporary directories, and copy the files from the tar.gz file.

var fileCount = 0
var dirCount = 0
val BUFFER_SIZE = 4096
val tais = new TarArchiveInputStream(new GzipCompressorInputStream( new BufferedInputStream( new FileInputStream(archizePath))))

var entry = tais.getNextEntry().asInstanceOf[TarArchiveEntry]

while(entry != null){
    if (entry.isDirectory()) {
        new File(DATA_PATH + entry.getName()).mkdirs()
        dirCount = dirCount + 1
        fileCount = 0
    }
    else {
        
        val data = new Array[scala.Byte](4 * BUFFER_SIZE)

        val fos = new FileOutputStream(DATA_PATH + entry.getName());
        val dest = new BufferedOutputStream(fos, BUFFER_SIZE);
        var count = tais.read(data, 0, BUFFER_SIZE)
        
        while (count != -1) {
            dest.write(data, 0, count)
            count = tais.read(data, 0, BUFFER_SIZE)
        }
        
        dest.close()
        fileCount = fileCount + 1
    }
    if(fileCount % 1000 == 0){
        print(".")
    }
    
    entry = tais.getNextEntry().asInstanceOf[TarArchiveEntry]
}

DataSetIterators

Next we will convert the raw data (csv files) into DataSetIterators, which will be fed into a neural network. Our training data will have 4000 examples which will be represented by a single DataSetIterator, and the testing data will have 1000 examples which will be represented by a separate DataSetIterator.

val path = FilenameUtils.concat(DATA_PATH, "instacart/") // set parent directory

val featureBaseDir = FilenameUtils.concat(path, "features") // set feature directory
val targetsBaseDir = FilenameUtils.concat(path, "breakfast") // set label directory

We first initialize CSVSequenceRecordReaders, which will parse the raw data into record-like format. Then the SequenceRecordReaderDataSetIterators can be created using the RecordReaders. Since each example has sequences of different lengths, an alignment mode of align end is needed.

val trainFeatures = new CSVSequenceRecordReader(1, ",");
trainFeatures.initialize( new NumberedFileInputSplit(featureBaseDir + "/%d.csv", 1, 4000));
val trainLabels = new CSVSequenceRecordReader(1, " ");
trainLabels.initialize(new NumberedFileInputSplit(targetsBaseDir + "/%d.csv", 1, 4000));

val train = new SequenceRecordReaderDataSetIterator(trainFeatures, trainLabels, 32,
    2, false, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);

val testFeatures = new CSVSequenceRecordReader(1, ",");
testFeatures.initialize( new NumberedFileInputSplit(featureBaseDir + "/%d.csv", 4001, 5000));
val testLabels = new CSVSequenceRecordReader(1, " ");
testLabels.initialize(new NumberedFileInputSplit(targetsBaseDir + "/%d.csv", 4001, 5000));

val test = new SequenceRecordReaderDataSetIterator(testFeatures, testLabels, 32,
    2, false, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);;

Neural Network

The next task is to set up the neural network configuration. We will use a MultiLayerNetwork and the configuration will be similar to the multitask model from before. Again we use one GravesLSTM layer but this time only one RnnOutputLayer.

val conf = new NeuralNetConfiguration.Builder()
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
    .seed(12345)
    .dropOut(0.25)
    .weightInit(WeightInit.XAVIER)
    .updater(new Adam())
    .list()
    .layer(0, new LSTM.Builder()
        .activation(Activation.TANH)
        .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
        .gradientNormalizationThreshold(10)
        .nIn(134)
        .nOut(150)
        .build())
    .layer(1, new RnnOutputLayer.Builder(LossFunction.XENT)
        .activation(Activation.SOFTMAX)
        .nIn(150)
        .nOut(2)
        .build())
.build();

We must then initialize the neural network.

val net = new MultiLayerNetwork(conf);
net.init();

Model Training

To train the model, we use 5 epochs with a simple call to the fit method of the MultiLayerNetwork.

net.fit( train , 5);

Model Evaluation

We will now evaluate our trained model. Note that we will use the area under the curve (AUC) metric of the ROC curve.

// Evaluate the model

val roc = new ROC(100);

while(test.hasNext()){
    val next = test.next();
    val features = next.getFeatures();
    val output = net.output(features);
    roc.evalTimeSeries(next.getLabels(), output);
}
println(roc.calculateAUC());

We achieve a AUC of 0.64!

Instacart Multitask tutorial
kaggle.com/c/instacart-market-basket-analysis