FP16 support for CUDA
Better performance for multi-GPU, including optional P2P memory access support
Normalization support for time series and images
Normalization support for labels
Removal of Canova and shift to DataVec: Javadoc, GitHub Repo
Numerous bug fixes
Spark improvements