Highlights - 1.0.0-beta2 Release

    ND4J/Deeplearning4j: Added support for CUDA 9.2. Dropped support for CUDA 9.1. (1.0.0-beta2 release has CUDA 8.0, 9.0 and 9.2 support)
    Deeplearning4j: New SameDiff layers with training support - Link Link
    Deeplearning4j resource (datasets, pretrained models) storage directory can now be configured via DL4JResources.setBaseDirectory method or org.deeplearning4j.resources.directory system property
    ND4J: all indexing is now done with longs instead of ints to allow for arrays with dimensions and lengths greater than Integer.MAX_VALUE (approx. 2.1 billion)
    ND4J: nd4j-native-platform will now use Intel MKL-DNN as the default/bundled BLAS implementation (replacing OpenBLAS as the previous default)
    Deeplearning4j: Added Out-of-memory (OOM) crash dump reporting functionality. Provides a dump with memory use and configuration if training/inference OOMs (to assist with debugging and tuning memory configuration).
    Deeplearning4j - new layers: Locally connected 1d Link, Locally connected 2d Link


Deeplearning4J: New Features

    Added new SameDiff layers (automatic differentiation - only single class, forward pass definition required) to DL4J with full training support - SameDiffLayer, SameDiffVertex, SameDiffOutputLayer, SameDiffLambdaLayer, SameDiffLambdaVertex - note that these are CPU-only execution for now Link Link Link
    Resource (datasets, pretrained models) storage directory can now be configured via DL4JResources.setBaseDirectory method or org.deeplearning4j.resources.directory system property. Note that it is also possible to set a different base location for downloads (for local mirrors of DL4J resources) Link
    Added Out-of-memory (OOM) crash dump reporting functionality. Provides a dump with memory use and configuration if training/inference OOMs. Same information is available (without a crash) for MultiLayerNetwork/ComputationGraph.memoryInfo methods. Can be disabled (or output directory set) using system properties - Link
    Added Composite[Multi]DataSetPreProcessor to enable multiple [Multi]DataSetPreProcessors to be applied in a single iterator Link
    Added ComputationGraph evaluate methods for multi-output networks: evaluate(DataSetIterator, Map<Integer,IEvaluation[]>) and evaluate(MultiDataSetIterator, Map<Integer,IEvaluation[]>) Link
    Added JointMultiDataSetIterator - utility iterator used to create MultiDataSetIterator from multiple DataSetIterators Link
    GraphVertices may now have trainable parameters directly (not just enclose layers with trainable parameters) Link
    Added MultiLayerNetwork/ComputationGraph getLearningRate methods Link
    Added RandomDataSetIterator and RandomMultiDataSetIterator (mainly for testing/debugging) Link Link
    Added cyclical "1cycle" schedule for learning rate schedules etc - Link
    RDD repartitioning for Spark training is more configurable (adds Repartitioner interface) Link
    Added ComputationGraph.getIterationCount() and .getEpochCount() for consistency with MultiLayerNetwork Link
    Added locally connected 1d layer Link Link
    Spark "data loader" API (mainly for Spark) Link Link Link
    Spark evaluation: added evaluation method overloads that allow specifying the number of evaluation workers (less than number of Spark threads) Link
    CnnSentenceDataSetIterator now has a Format argument, and supports outputting data for RNNs and 1D CNNs Link
    Added ComputationGraph/MultiLayerNetwork.pretrain((Multi)DataSetIterator, int epochs) method overloads Link
    MultiLayerNetwork and ComputationGraph now have output method overloads where the network output can be placed in the user-specified workspace, instead of being detached Link Link. This can be used to avoid creating INDArrays that need to be garbage collected before native memory can be freed.
    EmbeddingSequenceLayer now supports [minibatch,1,seqLength] format sequence data in addition to [minibatch,seqLength] format data Link
    CuDNN batch norm implementation will now be used for rank 2 input, not just rank 4 input Link
    Environment variables and system properties for DL4J have been centralized into DL4JResources and DL4JEnvironmentVars classes, with proper descriptions Link Link
    MultiLayerNetwork and ComputationGraph output/feedForward/fit methods are now thread-safe via synchronization. Note that concurrent use is not recommended due to performance (instead: use ParallelInference); however the now-synchronized methods should avoid obscure errors due to concurrent modifications Link
    BarnesHutTSNE now throws a useful exception in the case where the distance metric is undefined (for example, all zeros plus cosine similarity) Link

Deeplearning4J: Bug Fixes and Optimizations

    ComputationGraph.addListeners was not working correctly if listeners were already present Link, Link
    TinyImageNetDataSetIterator did not validate/correctly use input shape configuration Link, Link
    BatchNormalization layer now correctly asserts that nOut is set if required (instead of unfriendly shape errors later) Link
    Fixed issue where OutputLayer may not initialize parameter constraints correctly Link
    Fixed performance issue with Nesterov updater using CPU-only op for CUDA execution Link
    Removed TerminationCondition for DL4J optimizers - was not used in practice, and had minor overhead Link
    Fixed issue where EvaluativeListener could hit a workspace validation exception when workspaces are enabled Link
    Fixed issue where TrainingListener.onEpochStart/onEpochEnd were not being called correctly for ComputationGraph Link
    Fixed workspace issue with TensorFlowCnnToFeedForwardPreProcessor Link
    Performance optimization for BatchNormalization when using CuDNN Link
    Performance optimization: Dropout will be applied in-place when safe to do so, avoiding a copy Link
    Added CuDNN implementation of Dropout Link
    Reduced memory use for CuDNN: CuDNN working memory is now shared and reused between layers within a network Link
    CuDNN batch normalization implementation would fail with FP16 datatype Link
    Fixed issue Bidirectional LSTM may incorrectly use workspaces causing an exception Link
    Fixed issue with early stopping where scores to be maximized (accuracy, f1, etc) were not properly triggering termination conditions Link
    Fixed issue where label mask counter could be incorrectly incremented in ComputationGraph.computeGradientAndScore() Link
    ComputationGraph was not setting lastEtlTime field during training Link
    Fixed issue with AutoEncoder layer when workspaces are enabled Link
    Fixed issue with EmbeddingSequenceLayer use of mask arrays Link
    Lombok is now provided scope everywhere, isn't on user classpath when using DL4J Link
    Fixed issue where WordVectorSerializer.readParagraphVectors(File) initialization of label source Link
    Spark training (gradient sharing) now properly handles empty partition edge case when encountered during training Link
    Errors are propagated better/more consistently for Spark gradient sharing training Link
    Fixed issue with 1D CNN layers with mask arrays and stride > 1 (masks not being correctly downsized) Link
    DL4J Batch norm implementation was not correctly adding epsilon value during inference, only during training (CuDNN unaffected) Link
    CuDNN subsampling layers with max pooling and ConvolutionMode.SAME may have taken padding value (0) as the maximum for border values when all non-padding values are less than 0 Link
    Spark training with gradient sharing now passes listeners to workers correctly Link
    Fixed rare (and non-terminal) concurrent modification issue with UI and FileStatsStorage Link
    CuDNN convolution layer now supports dilation > 2 (previously: used DL4J conv layer implementation as a fallback) Link
    Yolo2OutputLayer now implements computeScoreForExamples() Link
    SequenceRecordReeaderDataSetIterator now handles the "no labels" case correctly Link
    Fixed issue where BarnesHutTSNE could hit a workspace validation exception Link
    EMNIST iterator could produce incorrect data in some cases after a reset Link

Deeplearning4J: API Changes (Transition Guide): 1.0.0-beta to 1.0.0-beta2

    GravesLSTM has been deprecated in favor of LSTM due to lack of CuDNN support but otherwise similar accuracy to in practice. Use LSTM class instead.
    deeplearning4j-modelexport-solr: now uses Lucene/Solr version 7.4.0 (was 7.3.0) Link
    Mask arrays for CNN2d layers must be in broadcastable 4d format: [minibatch,depth or 1, height or 1, width or 1] - previously they were 2d with shape [minibatch,height] or [minibatch,width]. This provents ambiguity in later cases (pooling layers), and allows for more complex masking scenarios (such as masking for different image sizes in same minibatch). Link
    Some older/deprecated Model and Layer methods have been removed. (validateInput(), initParams()). Some custom layers may need to be updated as a result Link

Deelpearning4J: 1.0.0-beta2 Known Issues

    Windows users are unable to load the HDF5 files used in SvhnLabelProvider (used in HouseNumberDetection example). Linux/Mac users are unaffected. A workaround for windows users is to add the sonatype snapshot dependency org.bytedeco.javacpp-presets:hdf5-platform:jar:1.10.2-1.4.3-SNAPSHOT Link

Deeplearing4J: Keras Import

    Keras model import now imports every Keras application
    Supports GlobalPooling3D layer import
    Supports RepeatVector layer import
    Supports LocallyConnected1D and LocallyConnected2D layers
    Keras Lambda layers can now be imported by registering custom SameDiff layers
    All Keras optimizers are now supported
    All advanced activation functions can now be imported.
    Many minor bugs have been fixed, including proper weight setting for all configurations of BatchNormalization, improvements to Reshape SeparableConvolution2D, and full support of Bidirectional layers.


ND4J: New Features

    ND4J: all indexing is now done with longs instead of ints to allow for arrays with dimensions and lengths greater than Integer.MAX_VALUE (approx. 2.1 billion)
    Added the ability to write Numpy .npy format using Nd4j.writeAsNumpy(INDArray,File) and convert an INDArray to a numpy strict in-memory using Nd4j.convertToNumpy(INDArray) Link
    ND4j-common ClassPathResource: added ClassPathResource.copyDirectory(File) Link
    SameDiff: A significant number of new ops, and backprop implementations for existing ops
    Added Nd4j.randomBernoulli/Binomial/Exponential convenience methods Link
    Added way to disable/suppress ND4J initialization logging via org.nd4j.log.initialization system property Link
    SameDiff class - most op/constructor methods now have complete/useful javadoc Link
    Workspaces can now be disabled globally, ignoring workspace configuration. This is mainly used for debugging; use Nd4j.getWorkspaceManager().setDebugMode(DebugMode.DISABLED) or Nd4j.getWorkspaceManager().setDebugMode(DebugMode.SPILL_EVERYTHING); to enable this. Link [Link]
    Added EnvironmentalAction API for environment variable processing Link
    ND4J environment variables and system properties have been centralized in ND4jEnvironmentVars and ND4jSystemProperties classes Link and Link

ND4J: Bug Fixes and Optimizations

    SameDiff: a significant number of bug fixes for execution and individual ops
    Fixed issue where INDArray.toDoubleArray() with true scalars (rank 0 arrays) Link
    Fixed issue with DataSet.sample() not working for rank 3+ features Link
    IActivation implementations now validate/enforce same shape for activations and gradients Link
    Fixed issue with muliColumnVector where vector is 1d Link
    ImagePreProcessingScaler now supports serialization via NormalizerSerializerStrategy and ModelSerializer Link
    Performance optimization for threshold encoding used in DL4J's Spark gradient sharing distributed training implementation Link
    SameDiff: Fixed issue where memory wasn't always released after execution Link
    DataSet.save() and MultiDataSet.save() methods now save example metadata when present Link
    Fixed issue with KFoldIterator when dataset does not divide equally into folds with no remainder Link
    Fixed issue where version check functionality could fail to load resources if resources are on a path with spaces Link

ND4J: Known Issues

ND4J: API Changes (Transition Guide): 1.0.0-beta to 1.0.0-beta2

    CUDA 9.1 support has been removed. CUDA 8.0, 9.0 and 9.2 support is available
    Due to long indexing changes, long/long[] should be used in place of int/int[] in some places (such as INDArray.size(int), INDArray.shape())
    Simplified DataSetIterator API: totalExamples(), cursor() and numExamples() - these were unsupported on most DataSetIterator implementations, and not used in practice for training. Custom iterators should remove these methods also Link
    Long-deprecated DataSet.getFeatureMatrix() has been removed. Use DataSet.getFeatures() instead. Link
    Unused and not properly tested/maintained utility class BigDecimalMath has been removed. Users should find an aternative library for this functionality, if required.
    Not properly maintained complex number support classes (IComplexNumber, IComplexNDArray) have been removed entirely Link


DataVec: New Features

    Added AnalyzeLocal class to mirror functionality of AnalyzeSpark (but without Spark dependency) Link
    Added JacksonLineSequenceRecordReader: RecordReader used for multi-example JSON/XML where each line in a file is an independent example Link
    Added RecordConvert.toRecord(Schema, List<Object>) Link
    Added missing FloatColumnCondition Link
    Added CSVLineSequenceRecordReader for "each line in CSV is a sequence, and sequence is single-valued/univariate" Link
    Added CSVMultiSequenceRecordReader for "multiple multi-valued sequences in a single CSV" data Link

DataVec: Optimizations and Bug Fixes

    Fixed issue with NativeImageLoader on Android Link
    Fixed issue with ExcelRecordReader Link
    Fixed issue where bad args for CSVRecordReader.next(int) could cause an unnecessarily large list to be generated Link

DataVec: API Changes (Transition Guide): 1.0.0-beta to 1.0.0-beta2


Arbiter: New Features

    Added DataSource interface. Unlike old DataProvider, this does not require JSON serializability (only a no-arg constructor) Link
    Added numerous enhancements and missing configuration options (constraints, dilation, etc) Link Link

Arbiter: Fixes

    DataProvider has been deprecated. Use DataSource instead.


    stepCounter, epochCounter and historyProcessor can now be set Link
    Random seed is now loaded for ACPolicy is loaded Link