Added support for CUDA 10.2; 1.0.0-beta6 is released with CUDA 9.2, 10.0, 10.1, and 10.2 support
SameDiff optimizations: memory use for inference and training significantly reduced, along with some performance improvements
Deeplearning4j UI - Play framework replaced with Vertx; the deeplearning4j-ui dependency no longer has a Scala dependency or Scala version suffix. Note: no API changes, only an artifact ID change: replace deeplearning4j-ui_2.1x with deeplearning4j-ui
ND4J namespace operation methods: operations are available through Nd4j.math, Nd4j.random, Nd4j.bitwise, and Nd4j.nn (neural network); for example, Nd4j.math.abs(INDArray), Nd4j.random.logNormal, etc.
Note that the ND4J namespace APIs will have additions (new namespaces and methods), and may have some API changes, in the next release
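A minimal sketch of the new namespace-style calls; Nd4j.math.abs(INDArray) is the call named above, and the surrounding setup is illustrative:

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class Nd4jNamespaceExample {
    public static void main(String[] args) {
        INDArray x = Nd4j.createFromArray(-1.0, 2.0, -3.0);

        // Elementwise absolute value via the new Nd4j.math namespace
        INDArray abs = Nd4j.math.abs(x);
        System.out.println(abs); // [1.0, 2.0, 3.0]
    }
}
```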
OpenMP replaced with a thread pool C++ parallelism framework; this enables C++-level parallelism for operations on platforms without OpenMP support
DNNL (MKL-DNN) upgraded to version 1.1
Added causal convolution mode for Convolution1D layer (ConvolutionMode.Causal) and added causal conv1d support for Keras import
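A configuration sketch of a causal Convolution1D layer, assuming a sequence-classification setup; the layer sizes and output layer here are illustrative:

```java
import org.deeplearning4j.nn.conf.ConvolutionMode;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.Convolution1DLayer;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class CausalConv1dExample {
    public static void main(String[] args) {
        // Illustrative sizes: 8 input channels, 16 filters, 4 output classes
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .list()
            .layer(new Convolution1DLayer.Builder()
                .kernelSize(3)
                .convolutionMode(ConvolutionMode.Causal) // output at time t sees inputs <= t only
                .nIn(8).nOut(16)
                .activation(Activation.RELU)
                .build())
            .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .activation(Activation.SOFTMAX)
                .nIn(16).nOut(4)
                .build())
            .build();
        System.out.println(conf.toJson());
    }
}
```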
Keras import now supports scaled identity weight initialization
Added Mish activation function
BertIterator now has a BertIterator.featurizeSentences(List<String>) method for inference
BertIterator now supports sentence pairs for supervised training
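A hedged inference sketch using featurizeSentences; it assumes an already-configured BertIterator and a trained ComputationGraph (setup omitted), and that the method returns a features/feature-masks pair:

```java
import java.util.Arrays;
import java.util.List;

import org.deeplearning4j.iterator.BertIterator;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.primitives.Pair; // Pair's package location may differ by version

public class BertInferenceExample {
    // iter: a configured BertIterator; net: a trained ComputationGraph
    public static INDArray[] predict(BertIterator iter, ComputationGraph net) {
        List<String> sentences = Arrays.asList("This movie was great", "Terrible film");

        // First element: feature arrays; second element: feature mask arrays
        Pair<INDArray[], INDArray[]> features = iter.featurizeSentences(sentences);
        return net.output(features.getFirst(), features.getSecond());
    }
}
```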
Added sparse multi-class cross entropy for both Deeplearning4j and Keras import
Deeplearning4j UI: migrated from Play to Vertx for the web serving backend, also removing the dependency on Scala libraries; no API changes, only an artifact ID change: replace deeplearning4j-ui_2.1x with deeplearning4j-ui
Added TimeDistributed wrapper layer
Deeplearning4j UI artifact ID has changed: replace deeplearning4j-ui_2.1x (beta5 and earlier) with deeplearning4j-ui
KDTree implementation optimized
Deeplearning4j zoo models and datasets hosting location updated
Fixed nIn validation for Deconv2D layer
Fixed an issue with incorrect Deconvolution2d results for Keras import models
Added DNNL/MKLDNN support for batch normalization layer
Fixed various integer casts to avoid overflows for very large arrays (with dimensions or length > Integer.MAX_VALUE)
Fixed an issue with UNet non-pretrained model architecture (last layer kernel size)
Deeplearning4j SameDiff layers now use DL4J workspaces for better performance and reduced memory consumption
Updated broken links in a few error messages
Cleaned up a few unused dependencies in various modules
Cleaned up duplicate SamplingDataSetIterator class
Fixed an issue where ComputationGraph instances with a single input going into multiple embedding layers could throw an NPE
Fixed an issue where loss function weights were not automatically cast to the network datatype, resulting in an exception if not already the correct type
Shaded Jackson version upgraded from 2.9.9/2.9.9.3 to 2.10.1
Fixed an issue with KNN where getMostPopulatedClusters actually returned the least populated clusters
Added support for CUDA 10.2
DNNL (MKL-DNN) upgraded to version 1.1
Added ND4j namespaces to match SameDiff: Nd4j.math, Nd4j.random, Nd4j.bitwise, Nd4j.nn (neural network)
Added SameDiff.calculateGradientsAndOutputs method
Added additional SameDiff single-batch .output method overloads for DataSet/MultiDataSet
TensorFlow import ops coverage enhanced (significant number of additional ops supported)
PRelu op added
adjust_contrast, igamma and igammac ops added
ND4J/SameDiff: BitCast, CompareAndBitpack, DivideNoNan, DrawBoundingBoxes, FakeQuantWithMinMaxVarsPerChannel ops added
non_max_suppression_overlaps op added
ImagePreProcessingScaler now supports segmentation use cases
concat operation now supports the concatenation axis being specified via the last input array
Added Gamma and Poisson RNG distributions
SameDiff’s use of DeviceLocal for variables/constants etc is now configurable
Uniform distribution op now supports random integer generation, not just random floating point generation
SameDiff: Added simple OpBenchmarkListener for benchmarking purposes
Added the ability to disable platform helpers (DNNL/MKLDNN etc.) via Nd4jCPU.Environment.getInstance().allowHelpers(false); and Nd4jCuda.Environment.getInstance().allowHelpers(false);
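A minimal sketch following the calls named above; the generated backend classes live in org.nd4j.nativeblas (exact class naming depends on the backend on your classpath):

```java
import org.nd4j.nativeblas.Nd4jCpu;

public class DisableHelpersExample {
    public static void main(String[] args) {
        // Disable DNNL/MKL-DNN platform helpers for the CPU backend
        // (call before executing any ops)
        Nd4jCpu.Environment.getInstance().allowHelpers(false);

        // CUDA backend equivalent (only one backend is on the classpath at a time):
        // org.nd4j.nativeblas.Nd4jCuda.Environment.getInstance().allowHelpers(false);
    }
}
```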
Added draw_bounding_boxes operation
Added resize_bicubic operation
Added causal padding mode to conv1d operation
DNNL (MKLDNN) is included and enabled by default for non-AVX builds
Added SameDiff ArraySavingListener for debugging purposes
OpenMP replaced with ThreadPool abstraction, enables parallelism for platforms without OpenMP support
SameDiff memory management overhauled for (in some cases significantly) reduced memory consumption and improved performance
Switched to Clang instead of gcc for OSX compilation to avoid compiler-related issues
Removed SameDiff.outputs() “best guess” output inference due to being unreliable, in favor of an explicit SameDiff.setOutputs(String...) call
Fixed an issue with Nd4j.hstack on 1D arrays
SameDiff no longer allows empty arrays for variables
Fixed an issue with Nadam updater LR schedules not being cloned
Cleaned up IActivation interface
Added new LSTM op implementation with DNNL/MKLDNN support (forward pass only so far)
SameDiff API cleaned up; deprecated methods removed
Switched SameDiff variable initialization to non-lazy, to avoid unexpected behaviour when mixing execution and ND4J RNG seed setting
SameDiff.zero and .one methods now create constants, not variables
Moved CUDA build version and device logging from C++ stdout to Java logging, to enable disabling logging (via ND4J config or SLF4J config)
Added DNNL/MKLDNN support for batch normalization
SameDiff: Fixed an issue where listeners weren’t being called for gradient calculation
Added DNNL/MKLDNN support for deconv2d/3d operations
Fixed an issue with biasadd_bp operation and NHWC data format
Fixed an issue with certain strided slice backprop configurations
Fixed an issue with LogSumExp reduction operation backprop for the along-dimension case
INDArray.toString() now has correct brackets for rank 1+ scalars to avoid ambiguity
Fixed an issue where some ND4J methods could fail when the library is compiled on Java 9+ but run on Java 8
Fixed empty array input case for is_strictly_increasing, non_decreasing and non_max_suppression ops
Fixed empty input arrays for legacy ops (transform, scalar, pairwise, broadcast)
CUDA compute capability 3.0 is supported again
Improved performance for Scatter operations (1D case) + index validation
Fixed an issue where SameDiff TrainingConfig serialization would fail if evaluation instances are set
SameDiff execution will now throw an exception when assertion operations in the graph fail
PolyGamma function now returns NaNs when passed a double for arguments requiring integer values
Fixed some issues with the pad and mirror_pad ops to ensure they conform with TensorFlow for imported networks
Updated and fixed some issues for the TensorFlow graph runner
Improved performance for Reverse operation
Removed/cleaned up unused ND4J list functionality
Fixed reduce bool operation results (such as any, all, IsInf, etc) for empty array inputs
SameDiff.outputs() now requires the user to call SameDiff.setOutputs(String...) first; the previous “best guess” output inference was unreliable
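A small sketch of the new explicit-outputs flow; the graph and variable names here are illustrative:

```java
import java.util.List;

import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.factory.Nd4j;

public class SetOutputsExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable in = sd.placeHolder("in", DataType.FLOAT, -1, 4);
        SDVariable w = sd.var("w", Nd4j.rand(DataType.FLOAT, 4, 3));
        SDVariable out = in.mmul("out", w);

        // As of beta6, declare graph outputs explicitly before calling outputs()
        sd.setOutputs("out");
        List<String> outputs = sd.outputs(); // ["out"]
        System.out.println(outputs);
    }
}
```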
NativeImageLoader now checks for empty input streams and throws an exception instead of crashing
NDArrayScalarOpTransform now supports the modulus operator
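A sketch of the modulus support in a DataVec transform pipeline; the column name and shape are illustrative:

```java
import org.datavec.api.transform.MathOp;
import org.datavec.api.transform.TransformProcess;
import org.datavec.api.transform.ndarray.NDArrayScalarOpTransform;
import org.datavec.api.transform.schema.Schema;

public class ModulusTransformExample {
    public static void main(String[] args) {
        // Schema with a single NDArray column of length 10 (illustrative)
        Schema schema = new Schema.Builder()
            .addColumnNDArray("features", new long[]{10})
            .build();

        // Apply elementwise x % 7 to the NDArray column
        TransformProcess tp = new TransformProcess.Builder(schema)
            .transform(new NDArrayScalarOpTransform("features", MathOp.Modulus, 7.0))
            .build();
        System.out.println(tp.toJson());
    }
}
```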
Added AsyncTrainingListener
Replaced multiple uses of java.util.Random with ND4J Random
Added Observable and LegacyMDPWrapper
Refactored RL4J video recording to separate VideoRecorder class
Fixed an issue with target for DQN
Refactoring for DQN and double DQN for improved maintainability
Internal refactoring and various bug fixes
PyDataVec TransformProcess now supports non-inplace operations
Fixed various issues with PyDataVec
Fixed an issue with data locality that could cause incorrect results under some circumstances when running on CUDA