Highlights - 1.0.0-beta2 Release

  • ND4J/Deeplearning4j: Added support for CUDA 9.2. Dropped support for CUDA 9.1. (1.0.0-beta2 release has CUDA 8.0, 9.0 and 9.2 support)
  • Deeplearning4j: New SameDiff layers with training support - Link Link
  • Deeplearning4j resource (datasets, pretrained models) storage directory can now be configured via DL4JResources.setBaseDirectory method or org.deeplearning4j.resources.directory system property
  • ND4J: all indexing is now done with longs instead of ints to allow for arrays with dimensions and lengths greater than Integer.MAX_VALUE (approx. 2.1 billion)
  • ND4J: nd4j-native-platform will now use Intel MKL-DNN as the default/bundled BLAS implementation (replacing OpenBLAS as the previous default)
  • Deeplearning4j: Added Out-of-memory (OOM) crash dump reporting functionality. Provides a dump with memory use and configuration if training/inference OOMs (to assist with debugging and tuning memory configuration).
  • Deeplearning4j - new layers: Locally connected 1d Link, Locally connected 2d Link


Deeplearning4J: New Features

  • Added new SameDiff layers (automatic differentiation - only single class, forward pass definition required) to DL4J with full training support - SameDiffLayer, SameDiffVertex, SameDiffOutputLayer, SameDiffLambdaLayer, SameDiffLambdaVertex - note that these are CPU-only execution for now Link Link Link
  • Resource (datasets, pretrained models) storage directory can now be configured via DL4JResources.setBaseDirectory method or org.deeplearning4j.resources.directory system property. Note that it is also possible to set a different base location for downloads (for local mirrors of DL4J resources) Link
  • Added Out-of-memory (OOM) crash dump reporting functionality. Provides a dump with memory use and configuration if training/inference OOMs. The same information is available (without a crash) via the MultiLayerNetwork/ComputationGraph.memoryInfo methods. Can be disabled (or output directory set) using system properties - Link
  • Added Composite[Multi]DataSetPreProcessor to enable multiple [Multi]DataSetPreProcessors to be applied in a single iterator Link
  • Added ComputationGraph evaluate methods for multi-output networks: evaluate(DataSetIterator, Map<Integer,IEvaluation[]>) and evaluate(MultiDataSetIterator, Map<Integer,IEvaluation[]>) Link
  • Added JointMultiDataSetIterator - utility iterator used to create MultiDataSetIterator from multiple DataSetIterators Link
  • GraphVertices may now have trainable parameters directly (not just enclose layers with trainable parameters) Link
  • Added MultiLayerNetwork/ComputationGraph getLearningRate methods Link
  • Added RandomDataSetIterator and RandomMultiDataSetIterator (mainly for testing/debugging) Link Link
  • Added cyclical "1cycle" schedule for learning rate schedules etc - Link
  • RDD repartitioning for Spark training is more configurable (adds Repartitioner interface) Link
  • Added ComputationGraph.getIterationCount() and .getEpochCount() for consistency with MultiLayerNetwork Link
  • Added locally connected 1d layer Link Link
  • Added "data loader" API (mainly for Spark) Link Link Link
  • Spark evaluation: added evaluation method overloads that allow specifying the number of evaluation workers (less than number of Spark threads) Link
  • CnnSentenceDataSetIterator now has a Format argument, and supports outputting data for RNNs and 1D CNNs Link
  • Added ComputationGraph/MultiLayerNetwork.pretrain((Multi)DataSetIterator, int epochs) method overloads Link
  • MultiLayerNetwork and ComputationGraph now have output method overloads where the network output can be placed in the user-specified workspace, instead of being detached Link Link. This can be used to avoid creating INDArrays that need to be garbage collected before native memory can be freed.
  • EmbeddingSequenceLayer now supports [minibatch,1,seqLength] format sequence data in addition to [minibatch,seqLength] format data Link
  • CuDNN batch norm implementation will now be used for rank 2 input, not just rank 4 input Link
  • Environment variables and system properties for DL4J have been centralized into DL4JResources and DL4JEnvironmentVars classes, with proper descriptions Link Link
  • MultiLayerNetwork and ComputationGraph output/feedForward/fit methods are now thread-safe via synchronization. Note that concurrent use is not recommended due to performance (instead: use ParallelInference); however the now-synchronized methods should avoid obscure errors due to concurrent modifications Link
  • BarnesHutTSNE now throws a useful exception in the case where the distance metric is undefined (for example, all zeros plus cosine similarity) Link
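
The resource directory system property mentioned above can be set like any other JVM property, either with -Dorg.deeplearning4j.resources.directory=... on the command line or programmatically. The following is a minimal sketch using only the standard library. The property name comes from the note above; the helper method, the fallback directory and the example path are hypothetical, and the sketch does not show how DL4J itself resolves the property. It is assumed here that the property needs to be set before DL4J first reads it.

```java
import java.io.File;

final class ResourceDirectoryExample {
    // Property name taken from the release note above.
    static final String RESOURCE_DIR_PROPERTY = "org.deeplearning4j.resources.directory";

    // Hypothetical helper: returns the configured directory,
    // or the supplied fallback when the property is unset or empty.
    static File resolveBaseDirectory(File fallback) {
        String configured = System.getProperty(RESOURCE_DIR_PROPERTY);
        return (configured == null || configured.isEmpty()) ? fallback : new File(configured);
    }

    public static void main(String[] args) {
        // Hypothetical fallback location, for illustration only.
        File fallback = new File(System.getProperty("user.home"), "dl4j-resources-fallback");

        // Hypothetical target path; equivalent to passing
        // -Dorg.deeplearning4j.resources.directory=/data/dl4j-resources to the JVM.
        System.setProperty(RESOURCE_DIR_PROPERTY, "/data/dl4j-resources");

        System.out.println(resolveBaseDirectory(fallback));
    }
}
```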

Deeplearning4J: Bug Fixes and Optimizations

  • ComputationGraph.addListeners was not working correctly if listeners were already present Link, Link
  • TinyImageNetDataSetIterator did not validate/correctly use input shape configuration Link, Link
  • BatchNormalization layer now correctly asserts that nOut is set if required (instead of unfriendly shape errors later) Link
  • Fixed issue where OutputLayer may not initialize parameter constraints correctly Link
  • Fixed performance issue with Nesterov updater using CPU-only op for CUDA execution Link
  • Removed TerminationCondition for DL4J optimizers - was not used in practice, and had minor overhead Link
  • Fixed issue where EvaluativeListener could hit a workspace validation exception when workspaces are enabled Link
  • Fixed issue where TrainingListener.onEpochStart/onEpochEnd were not being called correctly for ComputationGraph Link
  • Fixed workspace issue with TensorFlowCnnToFeedForwardPreProcessor Link
  • Performance optimization for BatchNormalization when using CuDNN Link
  • Performance optimization: Dropout will be applied in-place when safe to do so, avoiding a copy Link
  • Added CuDNN implementation of Dropout Link
  • Reduced memory use for CuDNN: CuDNN working memory is now shared and reused between layers within a network Link
  • Fixed issue where CuDNN batch normalization implementation would fail with FP16 datatype Link
  • Fixed issue where Bidirectional LSTM may incorrectly use workspaces, causing an exception Link
  • Fixed issue with early stopping where scores to be maximized (accuracy, f1, etc) were not properly triggering termination conditions Link
  • Fixed issue where label mask counter could be incorrectly incremented in ComputationGraph.computeGradientAndScore() Link
  • ComputationGraph was not setting lastEtlTime field during training Link
  • Fixed issue with AutoEncoder layer when workspaces are enabled Link
  • Fixed issue with EmbeddingSequenceLayer use of mask arrays Link
  • Lombok now has provided scope everywhere, so it is no longer on the user classpath when using DL4J Link
  • Fixed issue with WordVectorSerializer.readParagraphVectors(File) initialization of label source Link
  • Spark training (gradient sharing) now properly handles empty partition edge case when encountered during training Link
  • Errors are propagated better/more consistently for Spark gradient sharing training Link
  • Fixed issue with 1D CNN layers with mask arrays and stride > 1 (masks not being correctly downsized) Link
  • DL4J Batch norm implementation was not correctly adding epsilon value during inference, only during training (CuDNN unaffected) Link
  • CuDNN subsampling layers with max pooling and ConvolutionMode.SAME may have taken padding value (0) as the maximum for border values when all non-padding values are less than 0 Link
  • Spark training with gradient sharing now passes listeners to workers correctly Link
  • Fixed rare (and non-terminal) concurrent modification issue with UI and FileStatsStorage Link
  • CuDNN convolution layer now supports dilation > 2 (previously: used DL4J conv layer implementation as a fallback) Link
  • Yolo2OutputLayer now implements computeScoreForExamples() Link
  • SequenceRecordReaderDataSetIterator now handles the "no labels" case correctly Link
  • Fixed issue where BarnesHutTSNE could hit a workspace validation exception Link
  • EMNIST iterator could produce incorrect data in some cases after a reset Link
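
The max pooling issue with ConvolutionMode.SAME described above can be illustrated with a simplified 1D pooling routine. This is a standard-library sketch of the general effect, not the CuDNN or DL4J implementation; the class and method names are hypothetical. When every real input value is negative, a padding value of 0 wins the maximum at the borders, whereas treating padding as negative infinity leaves the result unaffected.

```java
import java.util.Arrays;

final class SamePaddingMaxPoolExample {
    // 1D max pooling with stride 1 and "same" output length.
    // Positions outside the input are read as padValue.
    static double[] maxPool1d(double[] in, int kernel, double padValue) {
        int padLeft = (kernel - 1) / 2;
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            double max = Double.NEGATIVE_INFINITY;
            for (int k = 0; k < kernel; k++) {
                int idx = i - padLeft + k;
                double v = (idx < 0 || idx >= in.length) ? padValue : in[idx];
                max = Math.max(max, v);
            }
            out[i] = max;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] input = {-3.0, -1.0, -2.0};

        // Zero padding leaks into the border outputs: [0.0, -1.0, 0.0]
        System.out.println(Arrays.toString(maxPool1d(input, 3, 0.0)));

        // Padding that can never be the maximum: [-1.0, -1.0, -1.0]
        System.out.println(Arrays.toString(maxPool1d(input, 3, Double.NEGATIVE_INFINITY)));
    }
}
```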

Deeplearning4J: API Changes (Transition Guide): 1.0.0-beta to 1.0.0-beta2

  • GravesLSTM has been deprecated in favor of LSTM: GravesLSTM lacks CuDNN support, while accuracy is otherwise similar in practice. Use the LSTM class instead.
  • deeplearning4j-modelexport-solr: now uses Lucene/Solr version 7.4.0 (was 7.3.0) Link
  • Mask arrays for CNN2d layers must be in broadcastable 4d format: [minibatch,depth or 1, height or 1, width or 1] - previously they were 2d with shape [minibatch,height] or [minibatch,width]. This prevents ambiguity in later cases (pooling layers), and allows for more complex masking scenarios (such as masking for different image sizes in same minibatch). Link
  • Some older/deprecated Model and Layer methods have been removed. (validateInput(), initParams()). Some custom layers may need to be updated as a result Link
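
For the CNN2d mask format change above, the shape mapping from the old 2d masks to the new broadcastable 4d format can be sketched as follows. This works on plain shape arrays only and uses no ND4J calls; the class and method names are hypothetical. In real code the mask array itself would need to be reshaped accordingly.

```java
import java.util.Arrays;

final class CnnMaskShapeExample {
    // Maps an old 2d mask shape [minibatch, height] or [minibatch, width]
    // to the 4d format [minibatch, 1, height, 1] or [minibatch, 1, 1, width].
    static long[] toBroadcastable4d(long[] shape2d, boolean alongHeight) {
        if (shape2d.length != 2) {
            throw new IllegalArgumentException("Expected a rank 2 mask shape");
        }
        long minibatch = shape2d[0];
        long length = shape2d[1];
        return alongHeight
                ? new long[]{minibatch, 1, length, 1}
                : new long[]{minibatch, 1, 1, length};
    }

    // Checks a mask shape against activations [minibatch, depth, height, width]:
    // minibatch must match, every other dimension must match or be 1.
    static boolean isBroadcastable(long[] mask, long[] activations) {
        if (mask.length != 4 || activations.length != 4) return false;
        if (mask[0] != activations[0]) return false;
        for (int i = 1; i < 4; i++) {
            if (mask[i] != 1 && mask[i] != activations[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        long[] activations = {32, 3, 28, 28};
        long[] mask = toBroadcastable4d(new long[]{32, 28}, true);
        System.out.println(Arrays.toString(mask) + " broadcastable: "
                + isBroadcastable(mask, activations));
    }
}
```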

Deeplearning4J: 1.0.0-beta2 Known Issues

  • Windows users are unable to load the HDF5 files used in SvhnLabelProvider (used in HouseNumberDetection example). Linux/Mac users are unaffected. A workaround for windows users is to add the sonatype snapshot dependency org.bytedeco.javacpp-presets:hdf5-platform:jar:1.10.2-1.4.3-SNAPSHOT Link

Deeplearning4J: Keras Import

  • Keras model import now imports every Keras application
  • Supports GlobalPooling3D layer import
  • Supports RepeatVector layer import
  • Supports LocallyConnected1D and LocallyConnected2D layers
  • Keras Lambda layers can now be imported by registering custom SameDiff layers
  • All Keras optimizers are now supported
  • All advanced activation functions can now be imported.
  • Many minor bugs have been fixed, including proper weight setting for all configurations of BatchNormalization, improvements to Reshape and SeparableConvolution2D, and full support of Bidirectional layers.


ND4J: New Features

  • ND4J: all indexing is now done with longs instead of ints to allow for arrays with dimensions and lengths greater than Integer.MAX_VALUE (approx. 2.1 billion)
  • Added the ability to write Numpy .npy format using Nd4j.writeAsNumpy(INDArray,File) and convert an INDArray to a Numpy struct in memory using Nd4j.convertToNumpy(INDArray) Link
  • nd4j-common ClassPathResource: added ClassPathResource.copyDirectory(File) Link
  • SameDiff: A significant number of new ops, and backprop implementations for existing ops
  • Added Nd4j.randomBernoulli/Binomial/Exponential convenience methods Link
  • Added way to disable/suppress ND4J initialization logging via org.nd4j.log.initialization system property Link
  • SameDiff class - most op/constructor methods now have complete/useful javadoc Link
  • Workspaces can now be disabled globally, ignoring workspace configuration. This is mainly used for debugging; to enable it, use Nd4j.getWorkspaceManager().setDebugMode(DebugMode.DISABLED) or Nd4j.getWorkspaceManager().setDebugMode(DebugMode.SPILL_EVERYTHING). Link Link
  • Added EnvironmentalAction API for environment variable processing Link
  • ND4J environment variables and system properties have been centralized in ND4JEnvironmentVars and ND4JSystemProperties classes Link and Link

ND4J: Bug Fixes and Optimizations

  • SameDiff: a significant number of bug fixes for execution and individual ops
  • Fixed issue with INDArray.toDoubleArray() for true scalars (rank 0 arrays) Link
  • Fixed issue with DataSet.sample() not working for rank 3+ features Link
  • IActivation implementations now validate/enforce same shape for activations and gradients Link
  • Fixed issue with muliColumnVector where vector is 1d Link
  • ImagePreProcessingScaler now supports serialization via NormalizerSerializerStrategy and ModelSerializer Link
  • Performance optimization for threshold encoding used in DL4J's Spark gradient sharing distributed training implementation Link
  • SameDiff: Fixed issue where memory wasn't always released after execution Link
  • DataSet.save() and MultiDataSet.save() methods now save example metadata when present Link
  • Fixed issue with KFoldIterator when the dataset does not divide evenly into folds (i.e., there is a remainder) Link
  • Fixed issue where version check functionality could fail to load resources if resources are on a path with spaces Link
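
The KFoldIterator case above concerns datasets whose size is not a multiple of the number of folds. One common convention for handling the remainder is to give the first few folds one extra example each. The sketch below shows that convention only; it is an assumption for illustration and not necessarily how KFoldIterator distributes examples. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class FoldSizeExample {
    // Splits n examples into k folds; the first (n % k) folds get one extra example.
    static int[] foldSizes(int n, int k) {
        if (k <= 0 || n < k) {
            throw new IllegalArgumentException("Need 0 < k <= n");
        }
        int base = n / k;
        int remainder = n % k;
        int[] sizes = new int[k];
        for (int i = 0; i < k; i++) {
            sizes[i] = base + (i < remainder ? 1 : 0);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // 10 examples in 3 folds: [4, 3, 3]
        System.out.println(Arrays.toString(foldSizes(10, 3)));
    }
}
```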

ND4J: Known Issues

ND4J: API Changes (Transition Guide): 1.0.0-beta to 1.0.0-beta2

  • CUDA 9.1 support has been removed. CUDA 8.0, 9.0 and 9.2 support is available
  • Due to long indexing changes, long/long[] should be used in place of int/int[] in some places (such as INDArray.size(int), INDArray.shape())
  • Simplified DataSetIterator API: totalExamples(), cursor() and numExamples() have been removed. These were unsupported on most DataSetIterator implementations, and not used in practice for training. Custom iterators should also remove these methods Link
  • Long-deprecated DataSet.getFeatureMatrix() has been removed. Use DataSet.getFeatures() instead. Link
  • Unused and not properly tested/maintained utility class BigDecimalMath has been removed. Users should find an alternative library for this functionality, if required.
  • Not properly maintained complex number support classes (IComplexNumber, IComplexNDArray) have been removed entirely Link
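
The motivation for the int to long change above can be seen with plain Java arithmetic. The sketch below uses no ND4J calls and the class and method names are hypothetical. It computes the total length of a shape in both 32-bit and 64-bit arithmetic: once the product exceeds Integer.MAX_VALUE, the int version silently wraps around to a negative number.

```java
final class LongIndexingExample {
    // Total element count for a shape, computed in 64-bit arithmetic.
    // Math.multiplyExact throws ArithmeticException if even a long overflows.
    static long length(long[] shape) {
        long n = 1;
        for (long d : shape) {
            n = Math.multiplyExact(n, d);
        }
        return n;
    }

    // The same product in 32-bit arithmetic, which silently wraps around.
    static int lengthAsInt(int[] shape) {
        int n = 1;
        for (int d : shape) {
            n *= d;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(length(new long[]{50_000, 50_000}));     // 2500000000
        System.out.println(lengthAsInt(new int[]{50_000, 50_000})); // -1794967296
    }
}
```

User code that stores sizes or shapes in int or int[] variables would need the corresponding variables changed to long or long[], or an explicit and checked narrowing conversion such as Math.toIntExact where an int is genuinely required.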


DataVec: New Features

  • Added AnalyzeLocal class to mirror functionality of AnalyzeSpark (but without Spark dependency) Link
  • Added JacksonLineSequenceRecordReader: RecordReader used for multi-example JSON/XML where each line in a file is an independent example Link
  • Added RecordConvert.toRecord(Schema, List<Object>) Link
  • Added missing FloatColumnCondition Link
  • Added CSVLineSequenceRecordReader for "each line in CSV is a sequence, and sequence is single-valued/univariate" Link
  • Added CSVMultiSequenceRecordReader for "multiple multi-valued sequences in a single CSV" data Link
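
The data layout described for CSVLineSequenceRecordReader (each CSV line is one univariate sequence) can be sketched with the standard library alone. This illustrates the layout and is not the DataVec implementation: the class and method names are hypothetical, and plain Double values stand in for whatever value type the reader actually produces.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvLineSequenceExample {
    // Converts one CSV line into a sequence: one time step per value,
    // with each time step holding a single (univariate) value.
    static List<List<Double>> lineToSequence(String line) {
        List<List<Double>> sequence = new ArrayList<>();
        for (String token : line.split(",")) {
            List<Double> timeStep = new ArrayList<>();
            timeStep.add(Double.parseDouble(token.trim()));
            sequence.add(timeStep);
        }
        return sequence;
    }

    public static void main(String[] args) {
        // Two lines give two independent sequences, of lengths 3 and 2.
        System.out.println(lineToSequence("1, 2.5, 3"));
        System.out.println(lineToSequence("10, 20"));
    }
}
```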

DataVec: Optimizations and Bug Fixes

  • Fixed issue with NativeImageLoader on Android Link
  • Fixed issue with ExcelRecordReader Link
  • Fixed issue where bad args for CSVRecordReader.next(int) could cause an unnecessarily large list to be generated Link

DataVec: API Changes (Transition Guide): 1.0.0-beta to 1.0.0-beta2


Arbiter: New Features

  • Added DataSource interface. Unlike old DataProvider, this does not require JSON serializability (only a no-arg constructor) Link
  • Added numerous enhancements and missing configuration options (constraints, dilation, etc) Link Link

Arbiter: Fixes

  • DataProvider has been deprecated. Use DataSource instead.


RL4J

  • stepCounter, epochCounter and historyProcessor can now be set Link
  • Random seed is now loaded when ACPolicy is loaded Link