In this tutorial, we will learn how to apply a long short-term memory (LSTM) neural network to a medical time series problem. The data comes from 4000 intensive care unit (ICU) patients, and the goal is to predict patient mortality using 6 general descriptor features, such as age, gender, and weight, along with 37 sequential features, such as cholesterol level, temperature, pH, and glucose level. Each patient has multiple measurements of the sequential features, and the number of measurements varies from patient to patient. The time between measurements also differs between patients.
An LSTM is well suited for this type of problem due to the sequential nature of the data. In addition, LSTM networks mitigate the vanishing gradient problem that affects plain recurrent networks and can capture long-term dependencies effectively thanks to their cell state, a feature not present in typical recurrent networks.
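To make the role of the cell state concrete, here is a minimal sketch of a single LSTM time step for one scalar unit, written in plain Scala. It is not how DL4J implements the layer: the names `LstmCellSketch`, `Gates`, and `step` are hypothetical, and the gate values are passed in directly instead of being computed from learned weights.

```scala
// Illustrative only: one LSTM time step for a single scalar unit.
// In a real network the gate activations come from learned weights;
// here they are supplied directly (an assumption made for clarity).
object LstmCellSketch {
  // forget, input and output are gate activations in [0, 1];
  // candidate is the proposed new content in [-1, 1].
  final case class Gates(forget: Double, input: Double, output: Double, candidate: Double)

  // Returns (new cell state, new hidden state).
  def step(cPrev: Double, g: Gates): (Double, Double) = {
    val c = g.forget * cPrev + g.input * g.candidate // additive cell-state update
    val h = g.output * math.tanh(c)                  // hidden state passed on to the next step
    (c, h)
  }
}
```

With a forget gate near 1 and an input gate near 0, `step` carries the cell state through almost unchanged, which is how information from early measurements can survive across many time steps.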
Now that we have imported everything needed to run this tutorial, we will start with obtaining the data and then converting the data into a format a neural network can understand.
Data Source
The data is contained in a compressed tar.gz file. We download it from the URL below and then extract the CSV files containing the ICU data. Each patient has one CSV file for the features and one for the label. Feature files are stored in a directory called sequence and label files in a directory called mortality. In a feature file, the columns represent the features and the rows represent time steps. A label file contains a single value: 0 indicating death and 1 indicating survival.
import org.apache.commons.io.FilenameUtils

val DATA_URL = "https://dl4jdata.blob.core.windows.net/training/physionet2012/physionet2012.tar.gz"
val DATA_PATH = FilenameUtils.concat(System.getProperty("java.io.tmpdir"), "dl4j_physionet/")
Download Data
To download the data, we create a temporary directory to hold the data files and copy the tar.gz archive from the URL into it.
import java.io.File
import java.net.URL
import org.apache.commons.io.FileUtils

val directory = new File(DATA_PATH)
directory.mkdir() // create new directory at specified path

val archivePath = DATA_PATH + "physionet2012.tar.gz" // set path for tar.gz file
val archiveFile = new File(archivePath) // file handle for the archive (nothing is written yet)
val extractedPath = DATA_PATH + "physionet2012"
val extractedFile = new File(extractedPath)

FileUtils.copyURLToFile(new URL(DATA_URL), archiveFile) // copy data from URL to file
Next, we extract the data from the tar.gz file: we recreate the archive's directory structure inside our temporary directory and copy the files into it.
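The extraction step could look roughly like the sketch below. It assumes Apache Commons Compress is on the classpath; `TarGzSketch` and `extract` are hypothetical names chosen for illustration, and the archive is assumed to be trusted (entry names are not checked for paths that escape the target directory).

```scala
import java.io.{BufferedInputStream, File, FileInputStream}
import java.nio.file.{Files, StandardCopyOption}
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream

object TarGzSketch {
  // Extracts every entry of a .tar.gz archive below destDir.
  // Returns the number of regular files written.
  def extract(archive: File, destDir: File): Int = {
    val in = new TarArchiveInputStream(
      new GzipCompressorInputStream(new BufferedInputStream(new FileInputStream(archive))))
    var files = 0
    try {
      // getNextTarEntry() is deprecated in recent Commons Compress releases but still available
      var entry = in.getNextTarEntry()
      while (entry != null) {
        val target = new File(destDir, entry.getName)
        if (entry.isDirectory) {
          target.mkdirs() // recreate the directory from the archive
        } else {
          Option(target.getParentFile).foreach(_.mkdirs())
          // reads until the end of the current entry; does not close the archive stream
          Files.copy(in, target.toPath, StandardCopyOption.REPLACE_EXISTING)
          files += 1
        }
        entry = in.getNextTarEntry()
      }
    } finally in.close()
    files
  }
}

// Possible usage with the values defined earlier in this tutorial:
// TarGzSketch.extract(archiveFile, new File(DATA_PATH))
```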
Our next goal is to convert the raw data (CSV files) into a DataSetIterator, which can then be fed into a neural network for training. Our training data will have 3200 examples, represented by a single DataSetIterator, and the testing data will have 800 examples, represented by a separate DataSetIterator.
val NB_TRAIN_EXAMPLES = 3200 // number of training examples
val NB_TEST_EXAMPLES = 800 // number of testing examples
In order to obtain DataSetIterators, we must first initialize CSVSequenceRecordReaders, which will parse the raw data into record-like format. We will first set the directories for the features and labels and initialize the CSVSequenceRecordReaders.
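A sketch of the reader setup is shown below. It assumes the DataVec classes are on the classpath and that the files in each directory are named `0.csv`, `1.csv`, and so on; that naming scheme, the number of header lines to skip, and the helper name `ReaderSketch.open` are assumptions for illustration and should be checked against the extracted data.

```scala
import org.datavec.api.records.reader.impl.csv.CSVSequenceRecordReader
import org.datavec.api.split.NumberedFileInputSplit

object ReaderSketch {
  // Opens one reader over <baseDir>/<firstIdx>.csv ... <baseDir>/<lastIdx>.csv (both inclusive).
  // skipLines is the number of header lines to skip at the top of each file.
  def open(baseDir: String, firstIdx: Int, lastIdx: Int, skipLines: Int): CSVSequenceRecordReader = {
    val reader = new CSVSequenceRecordReader(skipLines, ",")
    reader.initialize(new NumberedFileInputSplit(baseDir + "/%d.csv", firstIdx, lastIdx))
    reader
  }
}

// Possible usage (directory layout and header handling are assumptions):
// val trainFeatures = ReaderSketch.open(DATA_PATH + "physionet2012/sequence", 0, NB_TRAIN_EXAMPLES - 1, skipLines = 1)
// val trainLabels   = ReaderSketch.open(DATA_PATH + "physionet2012/mortality", 0, NB_TRAIN_EXAMPLES - 1, skipLines = 0)
```

Under the same naming assumption, the test readers would cover the indices from NB_TRAIN_EXAMPLES up to NB_TRAIN_EXAMPLES + NB_TEST_EXAMPLES - 1.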
Next, we can initialize the SequenceRecordReaderDataSetIterator using the previously created CSVSequenceRecordReaders. We use an alignment mode of ALIGN_END because the number of time steps differs between patients. Since the mortality label is always at the end of the sequence, all sequences must be aligned so that the time step carrying the mortality label is the last time step for every patient. For a more in-depth explanation of alignment modes, see Recurrent Networks in DL4J.
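The iterator construction might be sketched as follows. The helper name `IteratorSketch.build` and its `batchSize` parameter are hypothetical; the tutorial text does not state a batch size, so it is left to the caller.

```scala
import org.datavec.api.records.reader.SequenceRecordReader
import org.deeplearning4j.datasets.datavec.SequenceRecordReaderDataSetIterator
import org.deeplearning4j.datasets.datavec.SequenceRecordReaderDataSetIterator.AlignmentMode

object IteratorSketch {
  // Combines a feature reader and a label reader into one iterator.
  // With ALIGN_END, shorter sequences are padded at the start and mask arrays
  // mark which time steps hold real data, so the single label sits at the last step.
  def build(features: SequenceRecordReader, labels: SequenceRecordReader,
            batchSize: Int, numClasses: Int): SequenceRecordReaderDataSetIterator =
    new SequenceRecordReaderDataSetIterator(
      features, labels, batchSize, numClasses,
      false, // regression = false: labels are class indices, converted to one-hot vectors
      AlignmentMode.ALIGN_END)
}

// Possible usage, given initialized readers and a batch size of your choice:
// val trainData = IteratorSketch.build(trainFeatures, trainLabels, batchSize, NUM_LABEL_CLASSES)
```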
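import org.deeplearning4j.nn.api.OptimizationAlgorithm
import org.deeplearning4j.nn.conf.NeuralNetConfiguration
import org.deeplearning4j.nn.conf.layers.{GravesLSTM, RnnOutputLayer}
import org.deeplearning4j.nn.graph.ComputationGraph
import org.deeplearning4j.nn.weights.WeightInit
import org.nd4j.linalg.activations.Activation
import org.nd4j.linalg.learning.config.Adam
import org.nd4j.linalg.lossfunctions.LossFunctions

// RANDOM_SEED, LEARNING_RATE, NB_INPUTS, LSTM_LAYER_SIZE and NUM_LABEL_CLASSES
// must be defined before this block.
val conf = new NeuralNetConfiguration.Builder()
    .seed(RANDOM_SEED)
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
    .updater(new Adam(LEARNING_RATE))
    .weightInit(WeightInit.XAVIER)
    .dropOut(0.25)
    .graphBuilder()
    .addInputs("trainFeatures")
    .setOutputs("predictMortality")
    .addLayer("L1", new GravesLSTM.Builder()
        .nIn(NB_INPUTS)
        .nOut(LSTM_LAYER_SIZE)
        .forgetGateBiasInit(1)
        .activation(Activation.TANH)
        .build(), "trainFeatures")
    // MCXENT (multi-class cross-entropy) is the loss that pairs with a softmax output
    .addLayer("predictMortality", new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
        .activation(Activation.SOFTMAX)
        .nIn(LSTM_LAYER_SIZE).nOut(NUM_LABEL_CLASSES).build(), "L1")
    .build()

val model = new ComputationGraph(conf)
model.init() // allocate and initialize the parameters before training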
Training
To train the neural network, we simply call the fit method of the ComputationGraph on the trainData DataSetIterator and also pass how many epochs it should train for.
model.fit(trainData, 2)
Model Evaluation
Finally, we evaluate the model on the test split using the area under the ROC curve (AUC). A randomly guessing model has an AUC close to 0.50, while a perfect model achieves an AUC of 1.00.
import org.deeplearning4j.eval.ROC // in newer DL4J releases: org.nd4j.evaluation.classification.ROC

val roc = new ROC(100) // ROC curve approximated with 100 threshold steps
while (testData.hasNext()) {
    val batch = testData.next()
    val output = model.output(batch.getFeatures()) // one array per network output
    roc.evalTimeSeries(batch.getLabels(), output(0))
}
println("FINAL TEST AUC: " + roc.calculateAUC())
We see that this model achieves an AUC on the test set of 0.69!
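For intuition about what the AUC measures, the following plain-Scala sketch computes it as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. It is an illustration only, not the method used by the ROC class above (which approximates the curve with a fixed number of thresholds); the names `AucSketch` and `auc` are hypothetical, and the quadratic pairwise loop is only suitable for small inputs.

```scala
object AucSketch {
  // scores: predicted probability of the positive class; labels: true = positive example.
  def auc(scores: Seq[Double], labels: Seq[Boolean]): Double = {
    require(scores.length == labels.length, "scores and labels must have the same length")
    val pos = scores.zip(labels).collect { case (s, true) => s }
    val neg = scores.zip(labels).collect { case (s, false) => s }
    require(pos.nonEmpty && neg.nonEmpty, "need at least one positive and one negative example")
    // Compare every positive score with every negative score.
    val wins = for (p <- pos; n <- neg) yield
      if (p > n) 1.0 else if (p == n) 0.5 else 0.0
    wins.sum / (pos.length.toLong * neg.length)
  }
}
```

A model that ranks every positive above every negative scores 1.0 under this definition, and a model that gives all examples the same score lands at 0.5.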