1.0.0-beta3
- ND4J/Deeplearning4j: Added support for CUDA 10.0. Dropped support for CUDA 8.0. (1.0.0-beta3 release has CUDA 9.0, 9.2 and 10.0 support)
- SameDiff now supports training and evaluation from DataSetIterator and MultiDataSetIterator. Evaluation classes have been moved to ND4J.
- DL4J Spark training (gradient sharing) is now fully fault tolerant, and has improvements for threshold adaptation (potentially more robust convergence). Ports can now be easily configured independently on the master and workers.
- Added validation for incompatible loss/activation function combinations (such as softmax + nOut=1, or sigmoid + mcxent). The new validation can be disabled using outputValidation(false) Link
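
Where such a combination is intentional, the check can be switched off. A minimal Java sketch, assuming the outputValidation(false) toggle shown above lives on NeuralNetConfiguration.Builder (that placement, and the layer sizes, are illustrative assumptions, not confirmed API details):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction;

public class OutputValidationSketch {
    public static void main(String[] args) {
        // Sketch only: build a configuration the new check would normally flag
        // (softmax with nOut=1), with the validation disabled via the toggle above.
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .outputValidation(false)                 // assumption: disables the new loss/activation check
                .list()
                .layer(0, new OutputLayer.Builder(LossFunction.MCXENT)
                        .activation(Activation.SOFTMAX)  // softmax + nOut(1) is the kind of combination flagged
                        .nIn(10).nOut(1)
                        .build())
                .build();
    }
}
```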
- Spark training: overhauled the gradient sharing threshold adaptation algorithms; made it possible to customize threshold settings, and made the defaults more robust to the initial threshold configuration, improving convergence speed in some cases. Link
- Spark training: implemented chunked messaging to reduce memory requirements (and avoid insufficient buffer length issues) for large messages Link
- Spark network data pipelines: added FileBatch, FileBatchRecordReader etc for "small files" (images etc) distributed training use cases Link
- ParallelWrapper module no longer has a Scala version suffix for its artifact id; the new artifact id is `deeplearning4j-parallel-wrapper` Link
- Spark training: added `SharedTrainingMaster.Builder.workerTogglePeriodicGC` and `.workerPeriodicGCFrequency` to easily configure the ND4J garbage collection settings on workers. The default GC frequency on workers is now 5 seconds Link
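
A hedged sketch of how these builder options might be wired into an existing gradient-sharing setup; the VoidConfiguration, port, batch size, and the assumption that the frequency is given in milliseconds are illustrative, only the two option names and the 5-second default come from this release note:

```java
import org.deeplearning4j.spark.parameterserver.training.SharedTrainingMaster;
import org.nd4j.parameterserver.distributed.conf.VoidConfiguration;

public class WorkerGcConfigSketch {
    public static void main(String[] args) {
        // Placeholder parameter-server configuration for an existing Spark training setup.
        VoidConfiguration voidConfig = VoidConfiguration.builder()
                .unicastPort(40123)               // illustrative port
                .build();

        // Sketch only: enable and tune periodic GC on the Spark workers.
        SharedTrainingMaster tm = new SharedTrainingMaster.Builder(voidConfig, 32)
                .workerTogglePeriodicGC(true)     // turn periodic GC on for workers
                .workerPeriodicGCFrequency(5000)  // assumption: milliseconds, i.e. the 5-second default noted above
                .build();
    }
}
```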
- Spark training: added a threshold encoding debug mode (logs the current threshold and encoding statistics on each worker during training). Enable it using `SharedTrainingConfiguration.builder.encodingDebugMode(true)`; note that this operation has computational overhead. Link
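
A minimal sketch of enabling this, adapted directly from the call above; the import location and the builder() method form are assumptions and should be checked against your dl4j-spark-parameterserver dependency:

```java
// Assumption: package location of SharedTrainingConfiguration; adjust the import if needed.
import org.deeplearning4j.spark.parameterserver.training.SharedTrainingConfiguration;

public class EncodingDebugModeSketch {
    public static void main(String[] args) {
        // Sketch only: log the current threshold and encoding statistics on each worker.
        // This adds computational overhead, so enable it for debugging only.
        SharedTrainingConfiguration cfg = SharedTrainingConfiguration.builder()
                .encodingDebugMode(true)
                .build();
    }
}
```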
- Fixed an issue where L1/L2 and updaters (Adam, Nesterov, etc.) were applied before dividing gradients by the minibatch size to obtain the average gradient. To maintain the old behaviour, use `NeuralNetConfiguration.Builder.legacyBatchScaledL2(true)` (see the sketch after the notes below) Link
- Note that learning rates may need to be decreased for some updaters (such as Adam) to account for this change compared to earlier versions. Some other updaters (such as SGD, NoOp, etc.) should be unaffected.
- Note that configurations/networks saved in 1.0.0-beta2 or earlier will, when deserialized (loaded), default to the old behaviour for backward compatibility. All new networks (created in 1.0.0-beta3) will default to the new behaviour.
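
A minimal sketch of opting back into the pre-beta3 scaling, assuming legacyBatchScaledL2 is a NeuralNetConfiguration.Builder option as written above; the updater and learning rate are illustrative:

```java
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.nd4j.linalg.learning.config.Adam;

public class LegacyL2ScalingSketch {
    public static void main(String[] args) {
        // Sketch only: restore the old "apply L1/L2 and updater before minibatch averaging" behaviour.
        NeuralNetConfiguration.Builder builder = new NeuralNetConfiguration.Builder()
                .legacyBatchScaledL2(true)   // opt back into the pre-1.0.0-beta3 scaling order
                .updater(new Adam(1e-3));    // illustrative; rates tuned for beta2 may otherwise need lowering
        // ... continue building layers as usual
    }
}
```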
- Fixed an issue where EarlyStoppingScoreCalculator would not correctly handle "maximize score" cases instead of minimizing Link
- Optimization: A workspace is now used for ComputationGraph/MultiLayerNetwork evaluation methods (avoids allocating off-heap memory during evaluation that must be cleaned up by garbage collector) Link
- Fixed an issue where shuffling combined with a subset for MnistDataSetIterator would not maintain the same subset between resets Link
- Fixed issue with CuDNN workspaces where backpropagation is performed outside of a standard fit call Link
- Fixed an issue with dropout masks being cleared prematurely on output layers in ComputationGraph Link
- Fixed bug in multi input/output ComputationGraphs with TBPTT combined with both masking and different number of input/output arrays Link
- Fixed bug with TransferLearning GraphBuilder nOutReplace when combined with subsampling layers Link
- Fixed an issue where gradient normalization could result in NaNs if gradient is exactly 0.0 for all parameters in a layer Link
- Fixed an issue where MultiLayerNetwork/ComputationGraph.setLearningRate could throw an exception for SGD and NoOp updaters Link
- DL4J Spark training: Fixed issues with thread/device affinity for multi-GPU training + evaluation Link
- DL4J Spark training: Made all Aeron threads daemon threads to prevent Aeron from stopping JVM shutdown when all other threads have completed Link
- Added cudnnAllowFallback configuration for BatchNormalization layer (fallback to built-in implementation if CuDNN fails unexpectedly) Link
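
A minimal sketch, assuming the new option is exposed as a cudnnAllowFallback(boolean) method on the BatchNormalization layer builder (the placement mirrors the release note; the rest of the layer configuration is omitted):

```java
import org.deeplearning4j.nn.conf.layers.BatchNormalization;

public class BatchNormFallbackSketch {
    public static void main(String[] args) {
        // Sketch only: fall back to the built-in batch normalization implementation
        // if the CuDNN implementation fails unexpectedly, rather than failing the fit.
        BatchNormalization bn = new BatchNormalization.Builder()
                .cudnnAllowFallback(true)
                .build();
    }
}
```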
- Fixed an issue with BatchNormalization layers that prevented the mean/variance estimates from being synced properly on each worker for GradientSharing training, causing convergence issues Link
- DL4J Spark training and evaluation: methods now use Hadoop Configuration from Spark context to ensure runtime-set configuration is available in Spark functions reading directly from remote storage (HDFS etc) Link
- Added data validation for Nd4j.readTxt - now throws exception on invalid input instead of returning incorrect values Link
- Fixed an issue with KNN implementation where a deadlock could occur if an invalid distance function (one returning "distances" less than 0) was utilized Link
- Added synchronization to loading of Keras import models to avoid thread safety issues in the underlying HDFS library used for loading Link
- IEvaluation classes in DL4J have been deprecated and moved to ND4J so they are available for SameDiff training. Functionality and APIs are unchanged
- MultiLayerConfiguration/ComputationGraphConfiguration `pretrain(boolean)` and `backprop(boolean)` have been deprecated and are no longer used. Use the fit and pretrain/pretrainLayer methods instead. Link
- ParallelWrapper module no longer has a Scala version suffix for its artifact id; the new artifact id, `deeplearning4j-parallel-wrapper`, should be used instead Link
- deeplearning4j-nlp-korean module now has a Scala version suffix due to its Scala dependencies; the new artifact IDs are `deeplearning4j-nlp-korean_2.10` and `deeplearning4j-nlp-korean_2.11` Link
- Running multiple Spark training jobs simultaneously on one physical node (i.e., multiple JVMs from one or more Spark jobs) may cause problems with network communication. A workaround is to manually set a unique stream ID in the VoidConfiguration, using a unique (or random) integer value for each job Link
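
A sketch of the workaround, assuming VoidConfiguration exposes the stream ID as a builder option named streamId (that name is an assumption; only the need for a unique per-job integer comes from the note above):

```java
import org.nd4j.parameterserver.distributed.conf.VoidConfiguration;

public class UniqueStreamIdSketch {
    public static void main(String[] args) {
        // Sketch only: give each concurrent training job on the same physical node its own
        // stream ID so their network traffic does not collide. Any value works as long as
        // it differs between jobs; here a random positive int is used.
        int streamId = java.util.concurrent.ThreadLocalRandom.current().nextInt(1, Integer.MAX_VALUE);
        VoidConfiguration voidConfig = VoidConfiguration.builder()
                .streamId(streamId)   // assumption: streamId is a VoidConfiguration builder option
                .build();
    }
}
```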
- Added SameDiff training and evaluation: SameDiff instances can now be trained directly using DataSetIterator and MultiDataSetIterator, and evaluated using IEvaluation instances (which have been moved from DL4J to ND4J) Link
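
A hedged sketch of the general flow; it assumes a SameDiff graph that already has its training configuration set and an output variable named "softmax" (an assumed name), and the exact fit/evaluate signatures should be checked against the 1.0.0-beta3 javadoc:

```java
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class SameDiffTrainingSketch {
    public static void trainAndEvaluate(SameDiff sd, DataSetIterator train, DataSetIterator test) {
        sd.fit(train, 3);                      // train for 3 epochs directly from the iterator
        Evaluation eval = new Evaluation();    // evaluation classes now live in ND4J
        sd.evaluate(test, "softmax", eval);    // evaluate the named output variable
        System.out.println(eval.stats());
    }
}
```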
- Added GraphServer implementation: a C++ inference server for SameDiff (and TensorFlow, via TF import) with a Java API Link
- Libnd4j new ops:
- CudaGridExecutioner now warns that exception stack traces may be delayed, to avoid confusion when debugging exceptions occurring during asynchronous execution of ops Link
- Libnd4j native op fixes:
- Fixed issue where INDArray.permute would not correctly throw an exception for an invalid-length permutation Link
- INDArray.assign(INDArray) no longer allows assigning different shape arrays (other than scalar/vector cases) Link
- Transforms.softmax now throws an error on unsupported shapes instead of silently not applying the operation Link
- VersionCheck functionality now handles the case where SimpleFileVisitor is not available on earlier versions of Android Link
- SameDiff convolution layer configurations (Conv2dConfig/Conv3dConfig/Pooling3dConfig, etc.) have had their parameter names aligned Link
- CUDA 8.0 support has been removed. CUDA 9.0, 9.2 and 10.0 support is available in 1.0.0-beta3
- nd4j-base64 module contents have been deprecated; use the equivalent classes in nd4j-api from now on Link
- Some classes in the nd4j-jackson module have been deprecated; use the equivalent classes in nd4j-api from now on Link
- Android users may need to manually exclude the (now deprecated) nd4j-base64 module. This is due to the `org.nd4j.serde.base64.Nd4jBase64` class being present in both the nd4j-api and nd4j-base64 modules; both versions have identical content. Use `exclude group: 'org.nd4j', module: 'nd4j-base64'` to exclude it.