Highlights - 1.0.0-beta6 Release

  • Added support for CUDA 10.2. 1.0.0-beta6 released with CUDA 9.2, 10.0, 10.1 and 10.2 support
  • SameDiff optimizations - memory use for inference and training significantly reduced, along with some performance improvements
  • Deeplearning4j UI - Play framework replaced with Vertx; the deeplearning4j-ui dependency no longer has a Scala dependency or Scala version suffix Link
    • Note: No API changes, only artifact ID change: replace deeplearning4j-ui_2.1x with deeplearning4j-ui
  • ND4J namespace operation methods: operations are available through the Nd4j.math, Nd4j.random, Nd4j.bitwise and Nd4j.nn (neural network) namespaces, for example Nd4j.math.abs(INDArray), Nd4j.random.logNormal, etc. Link
    • Note that the ND4J namespaces API will have additions (new namespaces and methods), and may have some API changes, in the next release
  • OpenMP replaced with a thread pool C++ parallelism framework; this enables C++ parallelism for operations on platforms without C++-level threading


Deeplearning4j: Features and Enhancements

  • DNNL (MKL-DNN) upgraded to version 1.1
  • Added causal convolution mode for Convolution1D layer (ConvolutionMode.Causal) and added causal conv1d support for Keras import Link
  • Keras import now supports scaled identity weight initialization Link
  • Added Mish activation function Link, Link
  • BertIterator now has a BertIterator.featurizeSentences(List<String>) method for inference Link, Link
  • BertIterator now supports sentence pairs for supervised training Link
  • Added Sparse multi-class cross entropy for both Deeplearning4j and Keras import Link, Link
  • Deeplearning4j UI: migrated from Play to Vertx for web serving backend, also removing dependency on Scala libraries; no API changes, only artifact ID change - replace deeplearning4j-ui_2.1x with deeplearning4j-ui Link, Link
  • Added TimeDistributed wrapper layer Link
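
Two of the additions above, the Mish activation and sparse multi-class cross entropy, are small enough to illustrate in plain Java. The sketch below shows only the underlying math and is not the Deeplearning4j implementation; the class and method names are hypothetical. Mish is x * tanh(softplus(x)), and the sparse loss takes an integer class index instead of a one-hot label vector.

```java
// Illustrative sketch only: class and method names are hypothetical,
// and this is not how Deeplearning4j implements these functions.
final class ActivationAndLossSketch {

    /** Mish activation: x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). */
    static double mish(double x) {
        // For large x, softplus(x) is approximately x; this avoids overflow in Math.exp.
        double softplus = x > 20.0 ? x : Math.log1p(Math.exp(x));
        return x * Math.tanh(softplus);
    }

    /**
     * Sparse multi-class cross entropy for a single example: the label is a
     * class index, so the loss is simply -ln(p[labelIndex]).
     */
    static double sparseCrossEntropy(double[] probabilities, int labelIndex) {
        if (labelIndex < 0 || labelIndex >= probabilities.length) {
            throw new IllegalArgumentException("Label index out of range: " + labelIndex);
        }
        return -Math.log(probabilities[labelIndex]);
    }

    /** Dense (one-hot) cross entropy, included only for comparison. */
    static double denseCrossEntropy(double[] probabilities, double[] oneHot) {
        double loss = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            // Skip zero entries so that 0 * ln(0) never produces NaN.
            if (oneHot[i] != 0.0) {
                loss -= oneHot[i] * Math.log(probabilities[i]);
            }
        }
        return loss;
    }

    public static void main(String[] args) {
        double[] p = {0.1, 0.7, 0.2};
        System.out.println("mish(1.0)   = " + mish(1.0));
        System.out.println("sparse loss = " + sparseCrossEntropy(p, 1));
        System.out.println("dense loss  = " + denseCrossEntropy(p, new double[]{0, 1, 0}));
    }
}
```

Both losses give the same value for the same example; the sparse form only avoids materializing the one-hot vector, which matters when the number of classes is large.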

Deeplearning4j: Bug Fixes and Optimizations

  • KDTree implementation optimized Link
  • Deeplearning4j zoo models and datasets hosting location updated Link
  • Fixed nIn validation for Deconv2D layer Link
  • Fixed an issue with incorrect Deconvolution2d results for Keras import models Link
  • Added DNNL/MKLDNN support for batch normalization layer Link, Link
  • Fixed various integer casts to avoid overflows for very large arrays (with dimensions or length > Integer.MAX_VALUE) Link
  • Fixed an issue with UNet non-pretrained model architecture (last layer kernel size) Link
  • Deeplearning4j SameDiff layers now use DL4J workspaces for better performance and reduced memory consumption Link
  • Updated broken links in a few error messages Link
  • Cleaned up a few unused dependencies in various modules Link
  • Cleaned up duplicate SamplingDataSetIterator class Link
  • Fixed an issue where ComputationGraph instances with a single input going into multiple embedding layers could throw a NPE Link
  • Fixed an issue where loss function weights were not automatically cast to network datatype, resulting in an exception if not already correct type Link
  • Shaded Jackson version upgraded from 2.9.9 to 2.10.1 Link
  • Fixed an issue with KNN where getMostPopulatedClusters actually returned the least populated clusters Link
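
The integer-cast fix listed above addresses a general Java pitfall that is easy to reproduce. The following is a minimal sketch with hypothetical names, not code from Deeplearning4j: multiplying array dimensions in 32-bit arithmetic wraps around silently once the total length exceeds Integer.MAX_VALUE, whereas accumulating in long with Math.multiplyExact either gives the right answer or fails loudly.

```java
// Illustrative sketch only: names are hypothetical, not taken from Deeplearning4j.
final class ArrayLengthSketch {

    /** Problematic pattern: the product is computed as int and wraps silently. */
    static int lengthAsInt(int[] shape) {
        int length = 1;
        for (int dim : shape) {
            length *= dim; // overflows without any exception
        }
        return length;
    }

    /** Safer pattern: accumulate in long and throw ArithmeticException on overflow. */
    static long lengthAsLong(long[] shape) {
        long length = 1L;
        for (long dim : shape) {
            length = Math.multiplyExact(length, dim);
        }
        return length;
    }

    public static void main(String[] args) {
        // A 50000 x 50000 array has 2.5 billion elements, more than Integer.MAX_VALUE.
        System.out.println("int product:  " + lengthAsInt(new int[]{50000, 50000}));
        System.out.println("long product: " + lengthAsLong(new long[]{50000L, 50000L}));
    }
}
```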

Deeplearning4j: Transition Guide, 1.0.0-beta5 to 1.0.0-beta6

  • Deeplearning4j UI artifact ID has changed: replace deeplearning4j-ui_2.1x (beta5 and earlier) with deeplearning4j-ui

ND4J and SameDiff

ND4J/SameDiff: Features and Enhancements

  • Added support for CUDA 10.2 Link
  • DNNL (MKL-DNN) upgraded to version 1.1 Link
  • Added ND4j namespaces to match SameDiff: Nd4j.math, Nd4j.random, Nd4j.bitwise, Nd4j.nn (neural network) Link
  • Added SameDiff.calculateGradientsAndOutputs method Link, Link
  • Additional SameDiff single batch .output method overloads for DataSet/MultiDataSet added Link
  • TensorFlow import ops coverage enhanced (significant number of additional ops supported) Link, Link, Link, Link, Link
  • PRelu op added Link
  • adjust_contrast, igamma and igammac ops added Link
  • ND4J/SameDiff: BitCast, CompareAndBitpack, DivideNoNan, DrawBoundingBoxes, FakeQuantWithMinMaxVarsPerChannel ops added Link
  • non_max_suppression_overlaps op added Link
  • ImagePreProcessingScaler now supports segmentation use cases Link
  • concat operation now supports the concatenation axis being specified via the last input array Link
  • Added Gamma and Poisson RNG distributions Link
  • SameDiff’s use of DeviceLocal for variables/constants etc is now configurable Link
  • Uniform distribution op now supports random integer generation, not just random floating point generation Link
  • SameDiff: Added simple OpBenchmarkListener for benchmarking purposes Link
  • Added the ability to disable platform helpers (DNNL/MKLDNN etc) via Nd4jCPU.Environment.getInstance().allowHelpers(false); and Nd4jCuda.Environment.getInstance().allowHelpers(false); Link
  • Added draw_bounding_boxes operation Link
  • Added resize_bicubic operation Link
  • Added causal padding mode to conv1d operation Link
  • DNNL (MKLDNN) is included and enabled by default for non-AVX builds Link
  • Added SameDiff ArraySavingListener for debugging purposes Link
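
The causal padding mode mentioned above can be illustrated without any library. The sketch below is a plain-Java illustration under stated assumptions (single channel, stride 1, cross-correlation convention), not the ND4J conv1d implementation, and all names are hypothetical. The idea is that output step t only sees inputs at steps t and earlier, which is equivalent to left-padding the sequence with (kernelSize - 1) * dilation zeros.

```java
// Illustrative sketch only: single channel, stride 1, hypothetical names.
final class CausalConv1dSketch {

    /**
     * Causal 1D convolution: output[t] depends only on input[0..t].
     * The output has the same length as the input.
     */
    static double[] causalConv1d(double[] input, double[] kernel, int dilation) {
        if (dilation < 1) {
            throw new IllegalArgumentException("dilation must be >= 1");
        }
        int k = kernel.length;
        double[] output = new double[input.length];
        for (int t = 0; t < input.length; t++) {
            double sum = 0.0;
            for (int j = 0; j < k; j++) {
                // Kernel tap j looks (k - 1 - j) * dilation steps into the past.
                int source = t - (k - 1 - j) * dilation;
                // Indices before the start of the sequence act as implicit zero padding.
                if (source >= 0) {
                    sum += kernel[j] * input[source];
                }
            }
            output[t] = sum;
        }
        return output;
    }

    public static void main(String[] args) {
        double[] out = causalConv1d(new double[]{1, 2, 3, 4}, new double[]{1, 1}, 1);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

Because no tap ever reads an index greater than t, changing a later input value leaves all earlier outputs unchanged, which is the property sequence models rely on.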

ND4J/SameDiff: Bug Fixes and Optimizations

  • OpenMP replaced with ThreadPool abstraction, enabling parallelism for platforms without OpenMP support Link
  • SameDiff memory management overhauled for (in some cases significantly) reduced memory consumption and improved performance Link, Link
  • Switched to Clang instead of gcc for OSX compilation to avoid compiler-related issues Link
  • Removed SameDiff.outputs() “best guess” output inference due to being unreliable, in favor of explicit SameDiff.setOutputs(String...) call Link
  • Fixed an issue with Nd4j.hstack on 1D arrays Link
  • SameDiff no longer allows empty arrays for variables Link
  • Fixed an issue with Nadam updater LR schedules not being cloned Link
  • Cleaned up IActivation interface Link
  • Added new LSTM op implementation with DNNL/MKLDNN support (forward pass only so far) Link
  • SameDiff API cleaned up; deprecated methods removed Link
  • Switched SameDiff variable initialization to non-lazy, to avoid unexpected behaviour when mixing execution and ND4J RNG seed setting Link
  • and .one methods now create constants, not variables Link
  • Moved CUDA build version and device logging from C++ stdout to Java logging, so that it can be disabled (via ND4J config or slf4j config) Link
  • Added DNNL/MKLDNN support for batch normalization Link
  • SameDiff: Fixed an issue where listeners weren’t being called for gradient calculation Link
  • Added DNNL/MKLDNN support for deconv2d/3d operations Link
  • Fixed an issue with biasadd_bp operation and NHWC data format Link
  • Fixed an issue with certain strided slice backprop configurations Link, Link
  • Fixed an issue with LogSumExp reduction operation backprop for along dimension case Link, Link
  • INDArray.toString() now has correct brackets for rank 1+ scalars to avoid ambiguity Link
  • Fixed an issue where some ND4J methods could fail when the library is compiled on Java 9+ but run on Java 8 Link
  • Fixed empty array input case for is_strictly_increasing, non_decreasing and non_max_suppression ops Link, Link
  • Fixed empty input arrays for legacy ops (transform, scalar, pairwise, broadcast) Link
  • CUDA compute capability 3.0 is supported again Link
  • Improved performance for Scatter operations (1D case) + index validation Link
  • Fixed an issue where SameDiff TrainingConfig serialization would fail if evaluation instances are set Link, Link
  • SameDiff execution will now throw an exception when assertion operations in the graph fail Link
  • PolyGamma function now returns NaNs when passed double for args requiring integer values Link
  • Fixed some issues for pad and mirror_pad ops to ensure they conform with Tensorflow for imported networks Link
  • Updated and fixed some issues for TensorFlow graph runner Link
  • Improved performance for Reverse operation Link
  • Removed/cleaned up unused ND4J list functionality Link
  • Fixed reduce bool operation results (such as any, all, IsInf, etc) for empty array inputs Link
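
For readers unfamiliar with the LogSumExp reduction whose backprop fix is listed above, the following plain-Java sketch shows the math for a single non-empty 1D vector of finite values. It is not the ND4J implementation (which also handles reduction along chosen dimensions), and the names are hypothetical. The forward pass subtracts the maximum for numerical stability; the gradient with respect to each input is the softmax of the inputs.

```java
// Illustrative sketch only: 1D case, assumes a non-empty array of finite values.
final class LogSumExpSketch {

    /** Numerically stable log(sum(exp(x))). */
    static double logSumExp(double[] x) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : x) {
            max = Math.max(max, v);
        }
        double sum = 0.0;
        for (double v : x) {
            // Shifting by the maximum keeps every exponent <= 0, so exp cannot overflow.
            sum += Math.exp(v - max);
        }
        return max + Math.log(sum);
    }

    /** Gradient of logSumExp with respect to each input: exp(x[i] - logSumExp(x)). */
    static double[] gradient(double[] x) {
        double lse = logSumExp(x);
        double[] grad = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            grad[i] = Math.exp(x[i] - lse);
        }
        return grad;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        System.out.println("logSumExp = " + logSumExp(x));
        System.out.println("gradient  = " + java.util.Arrays.toString(gradient(x)));
    }
}
```

A quick sanity check for any backprop implementation of this reduction is that the gradient entries are non-negative and sum to 1.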

ND4J: Transition Guide, 1.0.0-beta5 to 1.0.0-beta6

  • SameDiff.outputs() now requires user to call SameDiff.setOutputs(String...) first; previous “best guess” output inference was unreliable Link
  • and .one methods now create constants, not variables Link


DataVec: Bug Fixes and Optimizations

  • NativeImageLoader now checks for empty input streams and throws an exception instead of crashing Link
  • NDArrayScalarOpTransform now supports modulus operator Link


RL4J: Features and Enhancements

  • Added AsyncTrainingListener Link
  • Replaced multiple uses of java.util.Random with ND4J Random Link
  • Added Observable and LegacyMDPWrapper Link

RL4J: Bug Fixes and Optimizations

  • Refactored RL4J video recording to separate VideoRecorder class Link
  • Fixed an issue with target for DQN Link, Link
  • Refactoring for DQN and double DQN for improved maintainability Link
  • Internal refactoring and various bug fixes Link


PyDataVec: Features and Enhancements

  • PyDataVec TransformProcess now supports non-inplace operations Link

PyDataVec: Bug Fixes and Optimizations

  • Fixed various issues with PyDataVec Link
  • Fixed an issue with data locality that could cause incorrect results under some circumstances when running on CUDA Link