Normalization


Why normalize?

Neural networks work best when the data they're fed is normalized, i.e. constrained to a range such as -1 to 1. There are several reasons for this. One is that networks are trained using gradient descent, and their activation functions usually have an active range somewhere between -1 and 1. Even when using an activation function that doesn't saturate quickly, it is still good practice to constrain your values to this range to improve performance.
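
As a minimal sketch of the typical workflow (the Iris iterator is used here only as a small stand-in for any DataSetIterator over your own data), a normalizer is first fit on the training data to collect statistics and is then attached to the iterator so every DataSet it produces is normalized on the fly:

import org.deeplearning4j.datasets.iterator.impl.IrisDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize;

public class NormalizationExample {
    public static void main(String[] args) {
        // Any DataSetIterator works here; Iris is just a small built-in example dataset
        DataSetIterator iterator = new IrisDataSetIterator(50, 150);

        DataNormalization normalizer = new NormalizerStandardize();
        normalizer.fit(iterator);             // one pass over the data to collect mean/stddev statistics
        iterator.setPreProcessor(normalizer); // every DataSet returned from now on is normalized
    }
}

If regression labels should be normalized as well, the normalizer can be told to include them via fitLabel(true) before fitting.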

Available preprocessors

NormalizerMinMaxScaler

A preprocessor for DataSets that normalizes feature values (and optionally label values) to lie between a minimum and maximum value (by default between 0 and 1).

NormalizerMinMaxScaler

public NormalizerMinMaxScaler(double minRange, double maxRange)

Preprocessor can take a range as minRange and maxRange

  • param minRange

  • param maxRange

load

public void load(File... statistics) throws IOException

Load the given min and max

  • param statistics the statistics to load

  • throws IOException

save

public void save(File... files) throws IOException

Save the current min and max

  • param files the statistics to save

  • throws IOException

  • deprecated: use NormalizerSerializer instead
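
For example (a sketch, with iterator standing in for any DataSetIterator), scaling features to a -1 to 1 range:

NormalizerMinMaxScaler scaler = new NormalizerMinMaxScaler(-1, 1); // target range: -1 to 1
scaler.fit(iterator);             // one pass over the data to find the per-column min and max
iterator.setPreProcessor(scaler); // applied to every DataSet the iterator returns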

Normalizer

Base interface for all normalizers

ImageFlatteningDataSetPreProcessor

A DataSetPreProcessor used to flatten a 4d CNN features array into a 2d format (for use in networks such as a DenseLayer / multi-layer perceptron).

MinMaxStrategy

A NormalizerStrategy implementation that normalizes and denormalizes data arrays to a given range, based on statistics of the upper and lower bounds of the population.

MinMaxStrategy

public MinMaxStrategy(double minRange, double maxRange)
  • param minRange the target range lower bound

  • param maxRange the target range upper bound

preProcess

public void preProcess(INDArray array, INDArray maskArray, MinMaxStats stats)

Normalize a data array

  • param array the data to normalize

  • param stats statistics of the data population

revert

public void revert(INDArray array, INDArray maskArray, MinMaxStats stats)

Denormalize a data array

  • param array the data to denormalize

  • param stats statistics of the data population

ImagePreProcessingScaler

A preprocessor specifically for images that applies min-max scaling. It can take a range, so pixel values can be scaled from 0-255 to minRange-maxRange (by default minRange = 0 and maxRange = 1). If pixel values are not 8 bits, you can specify the number of bits as the third argument in the constructor. For values that are already floating point, specify the number of bits as 1.

ImagePreProcessingScaler

public ImagePreProcessingScaler(double a, double b, int maxBits)

Preprocessor can take a range as minRange and maxRange

  • param a, default = 0

  • param b, default = 1

  • param maxBits in the image, default = 8

fit

public void fit(DataSet dataSet)

Fit a dataset (only compute statistics based on this dataset)

  • param dataSet the dataset to compute on

fit

public void fit(DataSetIterator iterator)

Iterates over a dataset accumulating statistics for normalization

  • param iterator the iterator to use for collecting statistics.

transform

public void transform(DataSet toPreProcess)

Transform the data

  • param toPreProcess the dataset to transform
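
For example (a sketch, with imageIterator standing in for a DataSetIterator built from an image record reader), scaling raw 8-bit pixel values into the default 0 to 1 range:

ImagePreProcessingScaler pixelScaler = new ImagePreProcessingScaler(0, 1); // 0-255 pixels -> 0-1
pixelScaler.fit(imageIterator);
imageIterator.setPreProcessor(pixelScaler);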

CompositeMultiDataSetPreProcessor

A simple Composite MultiDataSetPreProcessor - allows you to apply multiple MultiDataSetPreProcessors sequentially on the one MultiDataSet, in the order they are passed to the constructor

CompositeMultiDataSetPreProcessor

public CompositeMultiDataSetPreProcessor(MultiDataSetPreProcessor... preProcessors)
  • param preProcessors Preprocessors to apply. They will be applied in this order
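
A minimal sketch (first and second stand for any two MultiDataSetPreProcessor implementations, and multiIterator for any MultiDataSetIterator):

MultiDataSetPreProcessor combined = new CompositeMultiDataSetPreProcessor(first, second);
multiIterator.setPreProcessor(combined); // 'first' is applied before 'second' on every MultiDataSet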

MultiNormalizerMinMaxScaler

A preprocessor for MultiDataSets that normalizes feature values (and optionally label values) to lie between a minimum and maximum value (by default between 0 and 1).

MultiNormalizerMinMaxScaler

public MultiNormalizerMinMaxScaler(double minRange, double maxRange)

Preprocessor can take a range as minRange and maxRange

  • param minRange the target range lower bound

  • param maxRange the target range upper bound

MultiDataNormalization

An interface for multi dataset normalizers. Data normalizers compute some sort of statistics over a MultiDataSet and scale the data in some way.

ImageMultiPreProcessingScaler

A preprocessor specifically for images that applies min-max scaling to one or more of the feature arrays in a MultiDataSet. It can take a range, so pixel values can be scaled from 0-255 to minRange-maxRange (by default minRange = 0 and maxRange = 1). If pixel values are not 8 bits, you can specify the number of bits as the third argument in the constructor. For values that are already floating point, specify the number of bits as 1.

ImageMultiPreProcessingScaler

public ImageMultiPreProcessingScaler(double a, double b, int maxBits, int[] featureIndices)

Preprocessor can take a range as minRange and maxRange

  • param a, default = 0

  • param b, default = 1

  • param maxBits in the image, default = 8

  • param featureIndices Indices of feature arrays to process. If only one feature array is present, this should always be 0
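
For instance (a sketch for a MultiDataSet whose feature array 0 holds 8-bit image pixels and whose other feature arrays should be left alone):

ImageMultiPreProcessingScaler scaler = new ImageMultiPreProcessingScaler(0, 1, 8, new int[]{0});
multiIterator.setPreProcessor(scaler); // only feature array 0 is scaled to the 0-1 range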

NormalizerStandardize

A preprocessor for DataSets that normalizes feature values (and optionally label values) to have a mean of 0 and a standard deviation of 1.

load

public void load(File... files) throws IOException

Load the means and standard deviations from the file system

  • param files the files to load from. Needs 4 files if normalizing labels, otherwise 2.

save

public void save(File... files) throws IOException

Save the current means and standard deviations to the file system

  • param files the files to save to. Needs 4 files if normalizing labels, otherwise 2.

  • deprecated: use NormalizerSerializer instead
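
Since the File-based save/load methods are deprecated, persisting fitted statistics is typically done through NormalizerSerializer; a sketch (assuming the code runs in a method declared throws Exception, and with iterator standing in for any DataSetIterator):

NormalizerStandardize normalizer = new NormalizerStandardize();
normalizer.fit(iterator);

// Persist the fitted means and standard deviations
File normalizerFile = new File("normalizer.bin");
NormalizerSerializer.getDefault().write(normalizer, normalizerFile);

// Later, or in another process: restore the statistics and re-attach them to an iterator
NormalizerStandardize restored = NormalizerSerializer.getDefault().restore(normalizerFile);
iterator.setPreProcessor(restored);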

StandardizeStrategy

A NormalizerStrategy implementation that normalizes and denormalizes data arrays based on statistics of the means and standard deviations of the population.

preProcess

public void preProcess(INDArray array, INDArray maskArray, DistributionStats stats)

Normalize a data array

  • param array the data to normalize

  • param stats statistics of the data population

revert

public void revert(INDArray array, INDArray maskArray, DistributionStats stats)

Denormalize a data array

  • param array the data to denormalize

  • param stats statistics of the data population

NormalizerStrategy

Interface for strategies that can normalize and denormalize data arrays based on statistics of the population

MultiNormalizerHybrid

Pre processor for MultiDataSet that can be configured to use different normalization strategies for different inputs and outputs, or none at all. Can be used for example when one input should be normalized, but a different one should be untouched because it’s the input for an embedding layer. Alternatively, one might want to mix standardization and min-max scaling for different inputs and outputs.

By default, no normalization is applied. There are methods to configure the desired normalization strategy for inputs and outputs either globally or on an individual input/output level. Specific input/output strategies will override global ones.
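
A sketch of one possible configuration for a model with two inputs and one output, where input 0 feeds an embedding layer and should therefore stay untouched (multiIterator stands in for any MultiDataSetIterator):

MultiNormalizerHybrid normalizer = new MultiNormalizerHybrid()
        .minMaxScaleInput(1, 0, 1)    // only input 1 is min-max scaled; input 0 stays raw for the embedding layer
        .standardizeAllOutputs();     // all outputs get zero mean and unit variance

normalizer.fit(multiIterator);        // collect the statistics needed by the configured strategies
multiIterator.setPreProcessor(normalizer);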

MultiNormalizerHybrid

public MultiNormalizerHybrid standardizeAllInputs()

Apply standardization to all inputs, except the ones individually configured

  • return the normalizer

minMaxScaleAllInputs

public MultiNormalizerHybrid minMaxScaleAllInputs()

Apply min-max scaling to all inputs, except the ones individually configured

  • return the normalizer

minMaxScaleAllInputs

public MultiNormalizerHybrid minMaxScaleAllInputs(double rangeFrom, double rangeTo)

Apply min-max scaling to all inputs, except the ones individually configured

  • param rangeFrom lower bound of the target range

  • param rangeTo upper bound of the target range

  • return the normalizer

standardizeInput

public MultiNormalizerHybrid standardizeInput(int input)

Apply standardization to a specific input, overriding the global input strategy if any

  • param input the index of the input

  • return the normalizer

minMaxScaleInput

public MultiNormalizerHybrid minMaxScaleInput(int input)

Apply min-max scaling to a specific input, overriding the global input strategy if any

  • param input the index of the input

  • return the normalizer

minMaxScaleInput

public MultiNormalizerHybrid minMaxScaleInput(int input, double rangeFrom, double rangeTo)

Apply min-max scaling to a specific input, overriding the global input strategy if any

  • param input the index of the input

  • param rangeFrom lower bound of the target range

  • param rangeTo upper bound of the target range

  • return the normalizer

standardizeAllOutputs

public MultiNormalizerHybrid standardizeAllOutputs()

Apply standardization to all outputs, except the ones individually configured

  • return the normalizer

minMaxScaleAllOutputs

public MultiNormalizerHybrid minMaxScaleAllOutputs()

Apply min-max scaling to all outputs, except the ones individually configured

  • return the normalizer

minMaxScaleAllOutputs

public MultiNormalizerHybrid minMaxScaleAllOutputs(double rangeFrom, double rangeTo)

Apply min-max scaling to all outputs, except the ones individually configured

  • param rangeFrom lower bound of the target range

  • param rangeTo upper bound of the target range

  • return the normalizer

standardizeOutput

public MultiNormalizerHybrid standardizeOutput(int output)

Apply standardization to a specific output, overriding the global output strategy if any

  • param output the index of the output

  • return the normalizer

minMaxScaleOutput

public MultiNormalizerHybrid minMaxScaleOutput(int output)

Apply min-max scaling to a specific output, overriding the global output strategy if any

  • param output the index of the output

  • return the normalizer

minMaxScaleOutput

public MultiNormalizerHybrid minMaxScaleOutput(int output, double rangeFrom, double rangeTo)

Apply min-max scaling to a specific output, overriding the global output strategy if any

  • param output the index of the output

  • param rangeFrom lower bound of the target range

  • param rangeTo upper bound of the target range

  • return the normalizer

getInputStats

public NormalizerStats getInputStats(int input)

Get normalization statistics for a given input.

  • param input the index of the input

  • return implementation of NormalizerStats corresponding to the normalization strategy selected

getOutputStats

public NormalizerStats getOutputStats(int output)

Get normalization statistics for a given output.

  • param output the index of the output

  • return implementation of NormalizerStats corresponding to the normalization strategy selected

fit

public void fit(@NonNull MultiDataSet dataSet)

Fit a MultiDataSet (compute the normalization statistics based on this dataset only)

  • param dataSet the dataset to compute the statistics on

fit

public void fit(@NonNull MultiDataSetIterator iterator)

Iterates over a dataset accumulating statistics for normalization

  • param iterator the iterator to use for collecting statistics

transform

public void transform(@NonNull MultiDataSet data)

Transform the dataset

  • param data the dataset to pre process

revert

public void revert(@NonNull MultiDataSet data)

Undo (revert) the normalization applied by this DataNormalization instance (arrays are modified in-place)

  • param data MultiDataSet to revert the normalization on

revertFeatures

public void revertFeatures(@NonNull INDArray[] features)

Undo (revert) the normalization applied by this DataNormalization instance to the entire inputs array

  • param features The normalized array of inputs

revertFeatures

public void revertFeatures(@NonNull INDArray[] features, INDArray[] maskArrays)

Undo (revert) the normalization applied by this DataNormalization instance to the entire inputs array

  • param features The normalized array of inputs

  • param maskArrays Optional mask arrays belonging to the inputs

revertFeatures

public void revertFeatures(@NonNull INDArray[] features, INDArray[] maskArrays, int input)

Undo (revert) the normalization applied by this DataNormalization instance to the features of a particular input

  • param features The normalized array of inputs

  • param maskArrays Optional mask arrays belonging to the inputs

  • param input the index of the input to revert normalization on

revertLabels

public void revertLabels(@NonNull INDArray[] labels)

Undo (revert) the normalization applied by this DataNormalization instance to the entire outputs array

  • param labels The normalized array of outputs

revertLabels

public void revertLabels(@NonNull INDArray[] labels, INDArray[] maskArrays)

Undo (revert) the normalization applied by this DataNormalization instance to the entire outputs array

  • param labels The normalized array of outputs

  • param maskArrays Optional mask arrays belonging to the outputs

revertLabels

public void revertLabels(@NonNull INDArray[] labels, INDArray[] maskArrays, int output)

Undo (revert) the normalization applied by this DataNormalization instance to the labels of a particular output

  • param labels The normalized array of outputs

  • param maskArrays Optional mask arrays belonging to the outputs

  • param output the index of the output to revert normalization on

CompositeDataSetPreProcessor

A simple Composite DataSetPreProcessor - allows you to apply multiple DataSetPreProcessors sequentially on the one DataSet, in the order they are passed to the constructor

CompositeDataSetPreProcessor

public CompositeDataSetPreProcessor(DataSetPreProcessor... preProcessors)
  • param preProcessors Preprocessors to apply. They will be applied in this order
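
For example (a sketch that chains two of the preprocessors documented above), first scaling 8-bit pixels to 0-1 and then flattening the 4d CNN-style features for use in a plain multi-layer perceptron:

DataSetPreProcessor pipeline = new CompositeDataSetPreProcessor(
        new ImagePreProcessingScaler(0, 1),           // first: 0-255 pixels -> 0-1
        new ImageFlatteningDataSetPreProcessor());    // then: [batch, channels, height, width] -> [batch, channels*height*width]
imageIterator.setPreProcessor(pipeline);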

MultiNormalizerStandardize

A preprocessor for MultiDataSets that normalizes feature values (and optionally label values) to have a mean of 0 and a standard deviation of 1.

load

public void load(@NonNull List<File> featureFiles, @NonNull List<File> labelFiles) throws IOException

Load means and standard deviations from the file system

  • param featureFiles source files for features, requires 2 files per input, alternating mean and stddev files

  • param labelFiles source files for labels, requires 2 files per output, alternating mean and stddev files

save

public void save(@NonNull List<File> featureFiles, @NonNull List<File> labelFiles) throws IOException

Save the current means and standard deviations to the file system

  • param featureFiles target files for features, requires 2 files per input, alternating mean and stddev files

  • param labelFiles target files for labels, requires 2 files per output, alternating mean and stddev files

  • deprecated: use MultiStandardizeSerializerStrategy instead
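
Usage mirrors NormalizerStandardize, just with MultiDataSet types (a sketch, with multiIterator standing in for any MultiDataSetIterator):

MultiNormalizerStandardize normalizer = new MultiNormalizerStandardize();
normalizer.fit(multiIterator);             // one pass to collect per-array means and standard deviations
multiIterator.setPreProcessor(normalizer);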

VGG16ImagePreProcessor

This is a preprocessor specifically for VGG16. It subtracts the mean RGB value, computed on the training set, from each pixel, as reported in: https://arxiv.org/pdf/1409.1556.pdf

fit

public void fit(DataSet dataSet)

Fit a dataset (only compute statistics based on this dataset)

  • param dataSet the dataset to compute on

fit

public void fit(DataSetIterator iterator)

Iterates over a dataset accumulating statistics for normalization

  • param iterator the iterator to use for collecting statistics.

transform

public void transform(DataSet toPreProcess)

Transform the data

  • param toPreProcess the dataset to transform
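
A sketch of attaching it to the image iterator used for VGG16 inference or transfer learning (imageIterator stands in for any DataSetIterator over suitably sized images):

DataNormalization vggPreProcessor = new VGG16ImagePreProcessor(); // subtracts the mean RGB value from each pixel
imageIterator.setPreProcessor(vggPreProcessor);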

DataNormalization

An interface for data normalizers. Data normalizers compute some sort of statistics over a dataset and scale the data in some way.
