Deeplearning4j
EN 1.0.0-M1.1

Filters

Selection of data using conditions.

Using filters

Filters are part of transforms and give you a DSL for keeping only parts of your dataset. A filter can be a one-liner for a single condition or can combine several conditions with complex boolean logic.

// The condition describes what to REMOVE: rows whose MerchantCountryCode is not USA or CAN are dropped.
TransformProcess tp = new TransformProcess.Builder(inputDataSchema)
    .filter(new ConditionFilter(new CategoricalColumnCondition(
        "MerchantCountryCode", ConditionOp.NotInSet, new HashSet<>(Arrays.asList("USA", "CAN")))))
    .build();

You can also write your own filters by implementing the Filter interface, though more often you will want to create a custom condition instead.
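
To make the idea of a custom filter concrete, here is a minimal plain-Java sketch. It deliberately does not use the DataVec Filter interface: `RowFilter`, `BlankFieldFilter`, and `apply` are hypothetical stand-ins that mirror only the `removeExample` contract described on this page (return true to drop a row, false to keep it), and rows are modelled as `List<String>` rather than lists of Writable objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CustomFilterSketch {

    /** Hypothetical stand-in for a filter: true means "remove this row". */
    interface RowFilter {
        boolean removeExample(List<String> row);
    }

    /** Example custom filter: removes any row that contains a blank field. */
    static class BlankFieldFilter implements RowFilter {
        @Override
        public boolean removeExample(List<String> row) {
            for (String value : row) {
                if (value == null || value.trim().isEmpty()) {
                    return true; // drop the row
                }
            }
            return false; // keep the row
        }
    }

    /** Returns a new list holding only the rows the filter does not remove. */
    static List<List<String>> apply(List<List<String>> rows, RowFilter filter) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> row : rows) {
            if (!filter.removeExample(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("tx1", "USA", "10.50"),
                Arrays.asList("tx2", "", "3.20"),
                Arrays.asList("tx3", "CAN", "7.00"));
        System.out.println(apply(rows, new BlankFieldFilter()));
        // prints [[tx1, USA, 10.50], [tx3, CAN, 7.00]]
    }
}
```

A real DataVec filter would also have to handle sequences and schemas, which this sketch leaves out.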

Available filters

ConditionFilter

  • If the condition is satisfied (returns true): remove the example or sequence.
  • If the condition is not satisfied (returns false): keep the example or sequence.
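
The inverted sense of the condition is easy to get wrong, because the condition describes what to remove rather than what to keep. The plain-Java sketch below models this with `java.util.function.Predicate` standing in for a DataVec condition. `applyConditionFilter` is a hypothetical helper, not part of DataVec, and the two-column row layout is an assumption made for illustration. It also shows how two conditions might be combined with boolean logic.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class ConditionFilterSketch {

    /** Hypothetical helper: drops every row for which the condition is true. */
    static List<List<String>> applyConditionFilter(List<List<String>> rows,
                                                   Predicate<List<String>> condition) {
        return rows.stream()
                .filter(row -> !condition.test(row)) // condition true => remove
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> allowed = new HashSet<>(Arrays.asList("USA", "CAN"));

        // Assumed layout: column 0 is the country code, column 1 is the amount.
        Predicate<List<String>> notInSet = row -> !allowed.contains(row.get(0));
        Predicate<List<String>> negativeAmount = row -> Double.parseDouble(row.get(1)) < 0;

        List<List<String>> rows = Arrays.asList(
                Arrays.asList("USA", "12.0"),
                Arrays.asList("FRA", "5.0"),
                Arrays.asList("CAN", "-1.0"));

        // Remove rows that are outside the allowed set OR have a negative amount.
        System.out.println(applyConditionFilter(rows, notInSet.or(negativeAmount)));
        // prints [[USA, 12.0]]
    }
}
```

To keep only the rows that match some rule, pass the negation of that rule as the condition.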

removeExample

public boolean removeExample(Object writables)
  • param writables Example

  • return true if example should be removed, false to keep

removeSequence

public boolean removeSequence(Object sequence)
  • param sequence sequence example

  • return true if sequence should be removed, false to keep

transform

public Schema transform(Schema inputSchema)

Get the output schema for this transformation, given an input schema

  • param inputSchema

outputColumnName

public String outputColumnName()

The output column name after the operation has been applied

  • return the output column name

columnName

public String columnName()

The output column names. This will often be the same as the input.

  • return the output column names

Filter

Filter: the interface for removing examples (or sequences) according to some condition.

FilterInvalidValues

FilterInvalidValues: a filter operation that removes any example (or sequence) that contains invalid values in any of a specified set of columns. Invalid values are determined with respect to the schema.
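
As a rough illustration of checking validity against a schema for selected columns only, the following plain-Java sketch uses a toy schema that maps each column name to a validity check. Everything here (`SCHEMA`, `isDouble`, this `removeExample` signature, and the two example columns) is hypothetical and exists only to show the idea; it is not how DataVec represents schemas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class InvalidValuesSketch {

    /** Toy schema: column name -> validity check. Insertion order is column order. */
    static final Map<String, Predicate<String>> SCHEMA = new LinkedHashMap<>();
    static {
        SCHEMA.put("country", v -> v.length() == 3);
        SCHEMA.put("amount", InvalidValuesSketch::isDouble);
    }

    static boolean isDouble(String v) {
        try {
            Double.parseDouble(v);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** True if any of the listed columns holds a value its schema check rejects. */
    static boolean removeExample(List<String> row, String... columnsToCheck) {
        List<String> names = new ArrayList<>(SCHEMA.keySet());
        for (String column : columnsToCheck) {
            String value = row.get(names.indexOf(column));
            if (!SCHEMA.get(column).test(value)) {
                return true; // invalid value in a checked column => remove
            }
        }
        return false; // all checked columns are valid => keep
    }

    public static void main(String[] args) {
        System.out.println(removeExample(Arrays.asList("USA", "12.5"), "country", "amount")); // prints false
        System.out.println(removeExample(Arrays.asList("USA", "n/a"), "amount"));             // prints true
        System.out.println(removeExample(Arrays.asList("USA", "n/a"), "country"));            // prints false
    }
}
```

The last call keeps the row even though its amount is invalid, because only the listed columns are checked.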

transform

public Schema transform(Schema inputSchema)
  • param columnsToFilterIfInvalid Columns to check for invalid values (a constructor parameter; transform itself takes only inputSchema)

removeExample

public boolean removeExample(Object writables)
  • param writables Example

  • return true if example should be removed, false to keep

removeSequence

public boolean removeSequence(Object sequence)
  • param sequence sequence example

  • return true if sequence should be removed, false to keep

outputColumnName

public String outputColumnName()

The output column name after the operation has been applied

  • return the output column name

columnName

public String columnName()

The output column names. This will often be the same as the input.

  • return the output column names

InvalidNumColumns

Remove invalid records of a certain size.
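
The description above is terse, so the sketch below rests on an assumption: that a record is "invalid" when its number of values differs from the number of columns the schema expects. It is plain Java with hypothetical names (`InvalidNumColumnsSketch`, and an `expectedColumns` argument in place of a schema), not the DataVec implementation. Treating a sequence as invalid when any one of its steps has the wrong size is also an assumption.

```java
import java.util.Arrays;
import java.util.List;

public class InvalidNumColumnsSketch {

    /** True if the record does not have exactly the expected number of columns. */
    static boolean removeExample(List<String> record, int expectedColumns) {
        return record.size() != expectedColumns;
    }

    /** True if any step in the sequence has the wrong number of columns (assumed rule). */
    static boolean removeSequence(List<List<String>> sequence, int expectedColumns) {
        for (List<String> step : sequence) {
            if (removeExample(step, expectedColumns)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(removeExample(Arrays.asList("tx1", "USA", "10.50"), 3)); // prints false
        System.out.println(removeExample(Arrays.asList("tx2", "USA"), 3));          // prints true

        List<List<String>> sequence = Arrays.asList(
                Arrays.asList("t0", "1.0"),
                Arrays.asList("t1"));
        System.out.println(removeSequence(sequence, 2));                            // prints true
    }
}
```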

removeExample

public boolean removeExample(Object writables)
  • param writables Example

  • return true if example should be removed, false to keep

removeSequence

public boolean removeSequence(Object sequence)
  • param sequence sequence example

  • return true if sequence should be removed, false to keep

removeExample

public boolean removeExample(List<Writable> writables)
  • param writables Example

  • return true if example should be removed, false to keep

removeSequence

public boolean removeSequence(List<List<Writable>> sequence)
  • param sequence sequence example

  • return true if sequence should be removed, false to keep

transform

public Schema transform(Schema inputSchema)

Get the output schema for this transformation, given an input schema

  • param inputSchema

outputColumnName

public String outputColumnName()

The output column name after the operation has been applied

  • return the output column name

columnName

public String columnName()

The output column names. This will often be the same as the input.

  • return the output column names
