Highlights - 1.0.0-beta6 Release

    Added support for CUDA 10.2. 1.0.0-beta6 released with CUDA 9.2, 10.0, 10.1 and 10.2 support
    SameDiff optimizations - memory use for inference and training significantly reduced, along with some performance improvements
    Deeplearning4j UI - Play framework replaced with Vertx; the deeplearning4j-ui dependency no longer has a Scala dependency or Scala version suffix Link
      Note: No API changes, only artifact ID change: replace deeplearning4j-ui_2.1x with deeplearning4j-ui
    ND4J namespace operation methods: operations are now available through the Nd4j.math, Nd4j.random, Nd4j.bitwise and Nd4j.nn (neural network) namespaces, for example Nd4j.math.abs(INDArray), Nd4j.random.logNormal etc. Link
      Note that the ND4J namespaces API will gain additions (new namespaces and methods), and may have some API changes, in the next release
    OpenMP replaced with a thread pool C++ parallelism framework; this enables C++ parallelism for operations on platforms without C++-level threading


Deeplearning4j: Features and Enhancements

    DNNL (MKL-DNN) upgraded to version 1.1
    Added causal convolution mode for Convolution1D layer (ConvolutionMode.Causal) and added causal conv1d support for Keras import Link
    Keras import now supports scaled identity weight initialization Link
    Added Mish activation function Link, Link
    BertIterator now has a BertIterator.featurizeSentences(List<String>) method for inference Link, Link
    BertIterator now supports sentence pairs for supervised training Link
    Added Sparse multi-class cross entropy for both Deeplearning4j and Keras import Link, Link
    Deeplearning4j UI: migrated from Play to Vertx for web serving backend, also removing dependency on Scala libraries; no API changes, only artifact ID change - replace deeplearning4j-ui_2.1x with deeplearning4j-ui Link, Link
    Added TimeDistributed wrapper layer Link

Deeplearning4j: Bug Fixes and Optimizations

    KDTree implementation optimized Link
    Deeplearning4j zoo models and datasets hosting location updated Link
    Fixed nIn validation for Deconv2D layer Link
    Fixed an issue with incorrect Deconvolution2d results for Keras import models Link
    Added DNNL/MKLDNN support for batch normalization layer Link, Link
    Fixed various integer casts to avoid overflows for very large arrays (with dimensions or length > Integer.MAX_VALUE) Link
    Fixed an issue with UNet non-pretrained model architecture (last layer kernel size) Link
    Deeplearning4j SameDiff layers now use DL4J workspaces for better performance and reduced memory consumption Link
    Updated broken links in a few error messages Link
    Cleaned up a few unused dependencies in various modules Link
    Cleaned up duplicate SamplingDataSetIterator class Link
    Fixed an issue where ComputationGraph instances with a single input going into multiple embedding layers could throw a NPE Link
    Fixed an issue where loss function weights were not automatically cast to network datatype, resulting in an exception if not already correct type Link
    Shaded Jackson version upgraded from 2.9.9/ to 2.10.1 Link
    Fixed an issue with KNN where getMostPopulatedClusters actually returned the least populated clusters Link
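
The integer-cast fix listed above concerns arrays whose length exceeds Integer.MAX_VALUE. The following is a minimal, illustrative sketch of that class of bug in plain Java; it is not DL4J code, and the class and method names (ShapeLengthOverflow, lengthAsInt, lengthAsLong) are hypothetical. It shows how an element count computed in int arithmetic silently wraps around, while the same product in long arithmetic stays correct.

```java
public class ShapeLengthOverflow {

    // Hypothetical helper: product of dimensions computed in int arithmetic.
    // The narrowing cast and the int multiplication wrap around silently.
    static int lengthAsInt(long[] shape) {
        int length = 1;
        for (long d : shape) {
            length *= (int) d;
        }
        return length;
    }

    // Hypothetical helper: same product in long arithmetic.
    // Math.multiplyExact throws ArithmeticException if even long overflows.
    static long lengthAsLong(long[] shape) {
        long length = 1L;
        for (long d : shape) {
            length = Math.multiplyExact(length, d);
        }
        return length;
    }

    public static void main(String[] args) {
        // 50,000 x 50,000 = 2.5e9 elements, which is more than Integer.MAX_VALUE
        long[] shape = {50_000, 50_000};
        System.out.println(lengthAsInt(shape));  // prints -1794967296
        System.out.println(lengthAsLong(shape)); // prints 2500000000
    }
}
```

A negative or truncated length like the one above is the kind of symptom that replacing int casts with long arithmetic is meant to avoid.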

Deeplearning4j: Transition Guide, 1.0.0-beta5 to 1.0.0-beta6

    Deeplearning4j UI artifact ID has changed: replace deeplearning4j-ui_2.1x (beta5 and earlier) with deeplearning4j-ui
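
For Maven users, the artifact ID change might look like the following sketch. It assumes the standard org.deeplearning4j group ID and uses the _2.11 suffix only as an example of the old deeplearning4j-ui_2.1x form; adapt it to whichever suffix and build tool your project actually uses.

```xml
<!-- Before (beta5 and earlier), with a Scala version suffix, e.g.:
     org.deeplearning4j : deeplearning4j-ui_2.11 : 1.0.0-beta5 -->

<!-- After (beta6): no Scala version suffix -->
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-ui</artifactId>
    <version>1.0.0-beta6</version>
</dependency>
```

Since the notes state there are no API changes, no source changes should be needed beyond the dependency declaration.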

ND4J and SameDiff

ND4J/SameDiff: Features and Enhancements

    Added support for CUDA 10.2 Link
    DNNL (MKL-DNN) upgraded to version 1.1 Link
    Added ND4J namespaces to match SameDiff: Nd4j.math, Nd4j.random, Nd4j.bitwise, Nd4j.nn (neural network) Link
    Added SameDiff.calculateGradientsAndOutputs method Link Link
    Additional SameDiff single batch .output method overloads for DataSet/MultiDataSet added Link
    TensorFlow import ops coverage enhanced (significant number of additional ops supported) Link, Link, Link, Link, Link
    PRelu op added Link
    adjust_contrast, igamma and igammac ops added Link
    ND4J/SameDiff: BitCast, CompareAndBitpack, DivideNoNan, DrawBoundingBoxes, FakeQuantWithMinMaxVarsPerChannel ops added Link
    non_max_suppression_overlaps op added Link
    ImagePreProcessingScaler now supports segmentation use cases Link
    concat operation now supports the concatenation axis being specified via the last input array Link
    Added Gamma and Poisson RNG distributions Link
    SameDiff’s use of DeviceLocal for variables/constants etc is now configurable Link
    Uniform distribution op now supports random integer generation, not just random floating point generation Link
    SameDiff: Added simple OpBenchmarkListener for benchmarking purposes Link
    Added the ability to disable platform helpers (DNNL/MKLDNN etc) via Nd4jCpu.Environment.getInstance().allowHelpers(false); and Nd4jCuda.Environment.getInstance().allowHelpers(false); Link
    Added draw_bounding_boxes operation Link
    Added resize_bicubic operation Link
    Added causal padding mode to conv1d operation Link
    DNNL (MKLDNN) is included and enabled by default for non-AVX builds Link
    Added SameDiff ArraySavingListener for debugging purposes Link

ND4J/SameDiff: Bug Fixes and Optimizations

    OpenMP replaced with a ThreadPool abstraction, which enables parallelism for platforms without OpenMP support Link
    SameDiff memory management overhauled for (in some cases significantly) reduced memory consumption and improved performance Link, Link
    Switched to Clang instead of gcc for OSX compilation to avoid compiler-related issues Link
    Removed SameDiff.outputs() “best guess” output inference due to being unreliable, in favor of explicit SameDiff.setOutputs(String...) call Link
    Fixed an issue with Nd4j.hstack on 1D arrays Link
    SameDiff no longer allows empty arrays for variables Link
    Fixed an issue with Nadam updater LR schedules not being cloned Link
    Cleaned up IActivation interface Link
    Added new LSTM op implementation with DNNL/MKLDNN support (forward pass only so far) Link
    SameDiff API cleaned up; deprecated methods removed Link
    Switched SameDiff variable initialization to non-lazy, to avoid unexpected behaviour when mixing execution and ND4J RNG seed setting Link
    SameDiff.zero and .one methods now create constants, not variables Link
    Moved CUDA build version and device logging from C++ stdout to Java logging, so that it can be disabled (via ND4J config or slf4j config) Link
    Added DNNL/MKLDNN support for batch normalization Link
    SameDiff: Fixed an issue where listeners weren’t being called for gradient calculation Link
    Added DNNL/MKLDNN support for deconv2d/3d operations Link
    Fixed an issue with biasadd_bp operation and NHWC data format Link
    Fixed an issue with certain strided slice backprop configurations Link, Link
    Fixed an issue with LogSumExp reduction operation backprop for along dimension case Link, Link
    INDArray.toString() now has correct brackets for rank 1+ scalars to avoid ambiguity Link
    Fixed an issue where some ND4J methods could fail when the library is compiled on Java 9+ but run on Java 8 Link
    Fixed empty array input case for is_strictly_increasing, non_decreasing and non_max_suppression ops Link, Link
    Fixed empty input arrays for legacy ops (transform, scalar, pairwise, broadcast) Link
    CUDA compute capability 3.0 is supported again Link
    Improved performance for Scatter operations (1D case) + index validation Link
    Fixed an issue where SameDiff TrainingConfig serialization would fail if evaluation instances are set Link, Link
    SameDiff execution will now throw an exception when assertion operations in the graph fail Link
    PolyGamma function now returns NaNs when passed double values for arguments requiring integer values Link
    Fixed some issues for pad and mirror_pad ops to ensure they conform with TensorFlow for imported networks Link
    Updated and fixed some issues for TensorFlow graph runner Link
    Improved performance for Reverse operation Link
    Removed/cleaned up unused ND4J list functionality Link
    Fixed reduce bool operation results (such as any, all, IsInf, etc) for empty array inputs Link

ND4J: Transition Guide, 1.0.0-beta5 to 1.0.0-beta6

    SameDiff.outputs() now requires the user to call SameDiff.setOutputs(String...) first; the previous “best guess” output inference was unreliable Link
    SameDiff.zero and .one methods now create constants, not variables Link


DataVec: Bug Fixes and Optimizations

    NativeImageLoader now checks for empty input streams and throws an exception instead of crashing Link
    NDArrayScalarOpTransform now supports modulus operator Link


RL4J: Features and Enhancements

    Added AsyncTrainingListener Link
    Replaced multiple uses of java.util.Random with ND4J Random Link
    Added Observable and LegacyMDPWrapper Link

RL4J: Bug Fixes and Optimizations

    Refactored RL4J video recording to separate VideoRecorder class Link
    Fixed an issue with target for DQN Link, Link
    Refactoring for DQN and double DQN for improved maintainability Link
    Internal refactoring and various bug fixes Link


PyDataVec: Features and Enhancements

    PyDataVec TransformProcess now supports non-inplace operations Link

PyDataVec: Bug Fixes and Optimizations

    Fixed various issues with PyDataVec Link
    Fixed an issue with data locality that could cause incorrect results under some circumstances when running on CUDA Link