Release Notes
New changes in each release of Eclipse Deeplearning4j.
Read the announcement at https://blog.konduit.ai/2020/05/14/deeplearning4j-1-0-0-beta7-released/ for the highlights of this release.
- Full inference and training support is available for ops/layers in the tf.keras namespace; inference only for general Tensorflow operations outside of the tf.keras namespace
- Note also improvements to Keras import for reshape, permute, etc operations due to NHWC and NWC support in DL4J
- DL4J now supports NWC (channels last - [minibatch, sequence_length, size]) for all RNN and CNN 1D layers, in addition to NCW Link
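  A minimal sketch of configuring an RNN layer for NWC input (the `RNNFormat` enum and the `dataFormat` builder method are assumptions based on the beta7 API):

  ```java
  import org.deeplearning4j.nn.conf.RNNFormat;
  import org.deeplearning4j.nn.conf.layers.LSTM;

  // An LSTM layer configured for NWC ([minibatch, sequenceLength, size]) input;
  // the default remains NCW ([minibatch, size, sequenceLength]).
  LSTM lstm = new LSTM.Builder()
          .nIn(32)
          .nOut(64)
          .dataFormat(RNNFormat.NWC)
          .build();
  ```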
- Useful exceptions are now thrown when attempting to perform unsupported operations on FastText Link
- Deeplearning4j UI: Webjars versions locked down using dependency management to avoid check on each build Link
- Fixed an issue with GlobalPooling layer with masks of different datatype to the activations datatype Link
- Fixed an issue where SameDiff layers in DL4J could throw an exception when used with transfer learning Link
- Weight initialization for EmbeddingLayer and EmbeddingSequenceLayer now no longer depends on the vocabulary size (only the vector size) Link
- Fixed an issue where RecordReaderDataSetIterator builder collectMetaData configuration was not being applied Link
- Fixed an issue with Spark training SharedTrainingMaster when training with a ComputationGraph and MultiDataSets Link
- deeplearning4j-nlp-korean will no longer be released for Scala 2.12 due to the required dependency only having a Scala 2.11 version available Link
- Fixed an issue where dataset and model zoo downloads could get stuck if the server fails to send any data (now: timeout + retry) Link
- Fixed an issue where TfidfVectorizer.vectorize() could throw a NPE when fit from LabelAwareIterator Link
- Added new Random namespace operations:
- Added new NN namespace operations:
- Added new CNN namespace operations:
- Mapped operations for Tensorflow import:
  - HSVToRGB, RGBToHSV, Igamma, Igammac, RandomGamma, RandomPoisson, RandomPoissonV2, RandomShuffle Link
- libnd4j (c++ codebase underlying DL4J, ND4J and SameDiff) refactored to be more easily embeddable in other C++ projects Link
- SameDiff operation namespaces (sd.math, sd.image, etc) are now code generated to ensure SameDiff and ND4J namespaces are identical (all operations included, same API) Link
- Added ND4J `ArchiveUtils.unzipFileTo(String, String, boolean logFiles)` overload to enable/disable extracted file path logging Link (see the sketch below)
- Added weight format configuration for the following operations: conv1D, conv2D, conv3D, deconv2d, deconv3d, depthwiseConv2d, pointwiseConv2d, sconv2d Link
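A minimal sketch of the new `unzipFileTo` overload mentioned above (the package location of `ArchiveUtils` is an assumption and may differ by version):

```java
import java.io.IOException;
import org.nd4j.common.util.ArchiveUtils;  // package location assumed; may differ by version

// Extract an archive without logging each extracted file path (third argument: logFiles)
public static void extractQuietly(String archive, String destDir) throws IOException {
    ArchiveUtils.unzipFileTo(archive, destDir, false);
}
```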
- SameDiff: DifferentialFunctionFactory class removed in favor of namespace methods (sd.math, sd.linalg, etc) Link
- Upgraded assorted dependency versions: javax.activation:activation (1.1 -> 1.1.1), stream analytics (2.7.0->2.9.8), Apache Spark (2.4.3->2.4.5), Jackson databind (2.10.1 -> 2.10.3), Vertx (3.8.3 -> 3.9.0) Link
- Fixed an issue where ArchiveUtils could fail to create the top level destination directory when it does not exist Link
- Fixed an issue where hashcode operation shape function wasn't always returning int64/long dtype Link
- Improved performance on C++ SameDiff graph execution via reduced array zeroing where safe to do so Link
- Nd4j.gemm now uses Mmul operation internally to avoid potential threading issues with direct BLAS calls on CUDA Link
- Fixed some operation implementations when operating on views (Batch/Space to Space/Batch/Depth; batchnorm_bp) Link
- Fixed an issue where exponential distribution random number generation operation could produce infinities extremely rarely (~1 in 10^9 values) Link
- Memory for memory-mapped workspaces is now deallocated immediately when the workspace is destroyed, instead of waiting for GC to free memory Link
- Fixed an issue with LineRecordReader where initialization was performed unnecessarily (adding performance overhead) Link
- Added support for CUDA 10.2. 1.0.0-beta6 released with CUDA 9.2, 10.0, 10.1 and 10.2 support
- SameDiff optimizations - memory use for inference and training significantly reduced, with some performance improvements also
- Deeplearning4j UI - Play framework replaced with Vertx; deeplearning4j-ui dependency now no longer has Scala dependency or Scala version suffix Link
- Note: No API changes, only artifact ID change: replace `deeplearning4j-ui_2.1x` with `deeplearning4j-ui`
- ND4J namespace operation methods: operations are available through the Nd4j.math, Nd4j.random, Nd4j.bitwise and Nd4j.nn (neural network) namespaces, for example `Nd4j.math.abs(INDArray)`, `Nd4j.random.logNormal`, etc. Link (see the sketch below)
- Note that the ND4J namespace APIs will have additions (new namespaces and methods), and may have some API changes, in the next release
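A minimal sketch of calling operations through the new ND4J namespaces (field-style access is shown, following the note above; accessor-method style may apply in some versions, and values are illustrative):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray x = Nd4j.createFromArray(-1.0, 2.0, -3.0);

// Element-wise absolute value via the math namespace: [1.0, 2.0, 3.0]
INDArray abs = Nd4j.math.abs(x);
```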
- OpenMP replaced with a thread pool C++ parallelism framework; enables C++-level parallelism for operations on platforms without OpenMP support
- DNNL (MKL-DNN) upgraded to version 1.1
- Added causal convolution mode for Convolution1D layer (ConvolutionMode.Causal) and added causal conv1d support for Keras import Link
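  A minimal sketch of configuring a causal 1D convolution layer (builder method names follow the usual Convolution1DLayer API and should be treated as assumptions):

  ```java
  import org.deeplearning4j.nn.conf.ConvolutionMode;
  import org.deeplearning4j.nn.conf.layers.Convolution1DLayer;

  // Causal convolution: the output at time t depends only on inputs at times <= t
  Convolution1DLayer causalConv = new Convolution1DLayer.Builder()
          .nIn(16)
          .nOut(32)
          .kernelSize(3)
          .convolutionMode(ConvolutionMode.Causal)
          .build();
  ```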
- Fixed various integer casts to avoid overflows for very large arrays (with dimensions or length > Integer.MAX_VALUE) Link
- Deeplearning4j SameDiff layers now use DL4J workspaces for better performance and reduced memory consumption Link
- Fixed an issue where ComputationGraph instances with a single input going into multiple embedding layers could throw a NPE Link
- Fixed an issue where loss function weights were not automatically cast to network datatype, resulting in an exception if not already correct type Link
- Fixed an issue with KNN where getMostPopulatedClusters actually returned the least populated clusters Link
- Deeplearning4j UI artifact ID has changed: replace `deeplearning4j-ui_2.1x` (beta5 and earlier) with `deeplearning4j-ui`
- Added ND4j namespaces to match SameDiff: Nd4j.math, Nd4j.random, Nd4j.bitwise, Nd4j.nn (neural network) Link
- ND4J/SameDiff: BitCast, CompareAndBitpack, DivideNoNan, DrawBoundingBoxes, FakeQuantWithMinMaxVarsPerChannel ops added Link
- Uniform distribution op now supports random integer generation, not just random floating point generation Link
- Added the ability to disable platform helpers (DNNL/MKLDNN etc) via `Nd4jCPU.Environment.getInstance().allowHelpers(false);` and `Nd4jCuda.Environment.getInstance().allowHelpers(false);` Link
- OpenMP replaced with ThreadPool abstraction, enables parallelism for platforms without OpenMP support Link
- Removed `SameDiff.outputs()` "best guess" output inference due to being unreliable, in favor of an explicit `SameDiff.setOutputs(String...)` call Link (see the sketch below)
- Switched SameDiff variable initialization to non-lazy, to avoid unexpected behaviour when mixing execution and ND4J RNG seed setting Link
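A minimal sketch of declaring graph outputs explicitly instead of relying on the removed "best guess" inference (the graph structure and method names other than `setOutputs` are illustrative):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;

SameDiff sd = SameDiff.create();
SDVariable in = sd.placeHolder("in", DataType.FLOAT, -1, 10);
SDVariable w  = sd.var("w", DataType.FLOAT, 10, 3);
SDVariable out = in.mmul(w);

// Outputs are no longer inferred automatically; declare them explicitly
sd.setOutputs(out.name());
```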
- Moved CUDA build version and device logging to Java logging, from c++ stdout to enable disabling logging (via ND4J config or slf4j config) Link
- Fixed an issue where some ND4J methods could fail when the library is compiled on Java 9+ but run on Java 8 Link
- Fixed some issues for pad and mirror_pad ops to ensure they conform with Tensorflow for imported networks Link
- Added model server - remote inference of SameDiff and DL4J models using JSON or (optionally) binary serialization
- Added Scala 2.12 support, dropped Scala 2.10 support. Modules with Scala dependencies are now released with Scala 2.11 and 2.12 versions
- Apache Spark 1.x support dropped (now only Spark 2.x is supported). Note: Spark version suffix dropped: For upgrading:
1.0.0-beta4_spark2 -> 1.0.0-beta5
- Added FastText support to deeplearning4j-nlp
- CUDA support for all ND4J/SameDiff Operations
- In 1.0.0-beta4, some operations were CPU only. Now, all operations have full CUDA support
- Added support for new data types in ND4J (and DL4J/SameDiff): BFLOAT16, UINT16, UINT32, UINT64
- ND4J: Implicit broadcasting support added to INDArray (already present in SameDiff) - for example, shape `[3,1]+[3,2]=[3,2]` (see the sketch below)
- CUDA 9.2, 10.0 and 10.1-Update2 still supported
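A minimal sketch of the implicit INDArray broadcasting described above (values are illustrative):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray a = Nd4j.ones(3, 1);   // shape [3, 1]
INDArray b = Nd4j.ones(3, 2);   // shape [3, 2]

// The [3,1] array is broadcast along the second dimension; result shape is [3, 2]
INDArray sum = a.add(b);
System.out.println(java.util.Arrays.toString(sum.shape()));   // [3, 2]
```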
- NOTE: For CUDA 10.1, CUDA 10.1 update 2 is recommended. CUDA 10.1 and 10.1 Update 1 will still run, but rare internal cuBLAS issues may be encountered in heavily multi-threaded code on some systems
- Dependency upgrades: Jackson (2.5.1 to 2.9.9/2.9.9.3), Commons Compress (1.16.1 to 1.18), Play Framework (2.4.8 to 2.7.3), Guava: (20.0 to 28.0-jre, and shaded to avoid dependency clashes)
- CUDA: now host (RAM) buffers are only allocated when required (previously: host buffers were always allocated), in addition to device (GPU) buffer
- Added model server (DL4J and SameDiff models, JSON and binary communication) - JsonModelServer, JsonRemoteInference, Link, Link
- BertIterator: added option to prepend a token to the output (such as `[cls]`, expected by some models) (Link)
- Added trace level logging to MultiLayerNetwork and ComputationGraph to assist with debugging certain issues (Link)
- LSTM and Dropout will now fall back on built-in implementations if an exception is encountered from cuDNN (same as Subsampling/ConvolutionLayer) (Link)
- Improved validation when initializing networks with a non rank-2 (shape [1, numParams]) array (Link)
- Fixed an issue where some Keras import models could fail with `Could not read abnormally long HDF5 attribute` (Link)
- Fixed a Keras import issue where the updater learning rate schedule could be imported incorrectly (Link)
- MultiLayerNetwork/ComputationGraph will now log the original exception if a second exception occurs during workspace closing, instead of swallowing it (inference/fit operation try/finally blocks) (Link)
- Upgraded dependencies: Jackson (2.5.1 to 2.9.9/2.9.9.3), Commons Compress (1.16.1 to 1.18), Play Framework (2.4.8 to 2.7.3), Guava: (20.0 to 28.0-jre, shaded to avoid dependency clashes) (Link)
- Logging framework can now be configured for DL4J UI (due to Play framework dependency upgrade) (Link)
- Reduced amount of garbage produced by MnistDataFetcher (impacts MNIST and EMNIST DataSetIterators) (Link)
- DL4J AsyncDataSetIterator and AsyncMultiDataSetIterator moved to ND4J: use `org.nd4j.linalg.dataset.Async(Multi)DataSetIterator` instead (see the sketch below)
- Saved models with custom layers from 1.0.0-alpha and before can no longer be loaded. Workaround: load in 1.0.0-beta4, and re-save the model (Link). Models without custom layers can still be loaded back to 0.5.0
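A minimal sketch of using the relocated iterator (the two-argument constructor taking a prefetch queue size is an assumption):

```java
import org.nd4j.linalg.dataset.AsyncDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

// Wrap any DataSetIterator so batches are prefetched asynchronously on a background thread.
// The queue size (number of prefetched batches) shown here is illustrative.
public static DataSetIterator asyncWrap(DataSetIterator base) {
    return new AsyncDataSetIterator(base, 4);
}
```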
- Apache Spark 1.x support dropped (now only Spark 2.x is supported). Note: Spark version suffix dropped: For upgrading, change versions as follows:
1.0.0-beta4_spark2 -> 1.0.0-beta5
- Scala 2.10 dropped, Scala 2.12 added (for modules with Scala dependencies)
- Some layers (such as LSTM) may run slower on 1.0.0-beta5 than 1.0.0-beta4 on CUDA when not using cuDNN, due to added synchronization. This synchronization will be removed in the next release after 1.0.0-beta5
- CUDA 10.1: Rare internal cuBLAS issues may be encountered in heavily multi-threaded code on some systems, when running CUDA 10.1 Update 1 (and maybe 10.1). CUDA 10.1 update 2 is recommended.
- Added model server (DL4J and SameDiff models, JSON and binary communication) - JsonModelServer, JsonRemoteInference, Link, Link
- CUDA: now host (RAM) buffers are only allocated when required (previously: host buffers were always allocated), in addition to device (GPU) buffer
- Improved SameDiff training API - added "in line" test set evaluation, returning History object with loss curve, etc (Link)
- Added a C++ benchmark suite callable from Java - call using `Nd4j.getExecutioner().runLightBenchmarkSuit()` and `Nd4j.getExecutioner().runFullBenchmarkSuit()` (Link)
- Added axis configuration for evaluation instances (Evaluation, RegressionEvaluation, ROC, etc - getAxis and setAxis methods) to allow different data formats (NCHW vs. NHWC for CNNs, for example) (Link)
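A minimal sketch of configuring the evaluation axis for channels-last (NHWC) CNN output, as mentioned above (the integer argument to `setAxis` and the axis convention are assumptions):

```java
import org.nd4j.evaluation.classification.Evaluation;

// For NCHW data the class/channel axis is 1 (the default); for NHWC it is 3.
Evaluation eval = new Evaluation();
eval.setAxis(3);
```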
- SameDiff: Added support to convert variables to constants, via the SDVariable.convertToConstant() method (Link)
- SameDiff: Added GradCheckUtil.checkActivationGradients method to check activation gradients for SameDiff instance (not just parameter gradients as in existing gradient check methods) (Link)
- Added INDArray reduction methods with a "keep dimensions" option - for example, `INDArray.mean(boolean, int... dimension)` (Link)
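A minimal sketch of a "keep dimensions" reduction (shapes are shown in the comments):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray x = Nd4j.rand(3, 4);          // shape [3, 4]

INDArray meanDrop = x.mean(1);         // shape [3]    - reduced dimension removed
INDArray meanKeep = x.mean(true, 1);   // shape [3, 1] - reduced dimension kept with size 1
```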