Filters
Selection of data using conditions.
Using filters
Filters are part of transforms and give you a DSL for keeping parts of your dataset. A filter can be a one-liner for a single condition or can combine several conditions with boolean logic.
You can also write your own filter by implementing the Filter interface, though more often you will want to create a custom condition instead.
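To make the difference between a one-line filter and one built from boolean logic concrete, the sketch below models a condition as a plain `java.util.function.Predicate` over a row of doubles. It does not use the DataVec classes: `FilterLogicSketch`, `applyFilter`, and the two-column row layout are hypothetical names chosen for illustration. It assumes the convention described on this page, where a row is removed when the condition returns true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in, not the DataVec API: a row is a List<Double> and a
// condition is a Predicate that returns true when the row should be removed.
final class FilterLogicSketch {

    // One-liner: remove rows whose first column is negative.
    static final Predicate<List<Double>> NEGATIVE_FIRST = row -> row.get(0) < 0.0;

    // Boolean logic: remove rows where the first column is negative OR
    // (the second column exceeds 100 AND the first column is not zero).
    static final Predicate<List<Double>> COMBINED = NEGATIVE_FIRST.or(
            ((Predicate<List<Double>>) row -> row.get(1) > 100.0)
                    .and(row -> row.get(0) != 0.0));

    // Returns a new list holding only the rows the condition does not match.
    static List<List<Double>> applyFilter(List<List<Double>> rows,
                                          Predicate<List<Double>> removeIf) {
        List<List<Double>> kept = new ArrayList<>();
        for (List<Double> row : rows) {
            if (!removeIf.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<Double>> rows = Arrays.asList(
                Arrays.asList(1.0, 50.0),
                Arrays.asList(-2.0, 10.0),
                Arrays.asList(3.0, 250.0),
                Arrays.asList(0.0, 500.0));

        // Prints [[1.0, 50.0], [3.0, 250.0], [0.0, 500.0]]
        System.out.println(applyFilter(rows, NEGATIVE_FIRST));
        // Prints [[1.0, 50.0], [0.0, 500.0]]
        System.out.println(applyFilter(rows, COMBINED));
    }
}
```

Keeping the condition separate from the code that applies it is what makes a custom condition the lighter option in most cases: only the predicate changes, while the removal logic stays the same.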
Available filters
ConditionFilter
If the condition is satisfied (returns true): remove the example or sequence.
If the condition is not satisfied (returns false): keep the example or sequence.
removeExample
param writables Example
return true if example should be removed, false to keep
removeSequence
param sequence sequence example
return true if the sequence should be removed, false to keep
transform
Get the output schema for this transformation, given an input schema
param inputSchema
outputColumnName
The output column name after the operation has been applied
return the output column name
columnName
The output column names. This will often be the same as the input.
return the output column names
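Because the return value of a condition-backed filter is easy to read backwards, here is a minimal sketch of the semantics: true means remove, false means keep. `ConditionFilterSketch` and its `Object`-based rows are hypothetical and are not the DataVec classes. The rule used for sequences (remove when any time step matches) is an assumption made only for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified model of a condition-backed filter. The names and
// the Object-based row type are illustrative and are not the DataVec classes.
final class ConditionFilterSketch {

    private final Predicate<List<Object>> condition;

    ConditionFilterSketch(Predicate<List<Object>> condition) {
        this.condition = condition;
    }

    // true means "remove this example", false means "keep it".
    boolean removeExample(List<Object> example) {
        return condition.test(example);
    }

    // Assumption for this sketch only: a sequence is removed when any one of
    // its time steps satisfies the condition.
    boolean removeSequence(List<List<Object>> sequence) {
        for (List<Object> step : sequence) {
            if (condition.test(step)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Remove rows whose second column is the string "spam".
        ConditionFilterSketch filter =
                new ConditionFilterSketch(row -> "spam".equals(row.get(1)));

        List<Object> ham = Arrays.<Object>asList(1, "ham");
        List<Object> spam = Arrays.<Object>asList(2, "spam");

        System.out.println(filter.removeExample(ham));   // false: kept
        System.out.println(filter.removeExample(spam));  // true: removed
        System.out.println(filter.removeSequence(Arrays.asList(ham, spam))); // true
    }
}
```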
Filter
Filter: a method of removing examples (or sequences) according to some condition
FilterInvalidValues
FilterInvalidValues: a filter operation that removes any examples (or sequences) that contain invalid values in any of a specified set of columns. Invalid values are determined with respect to the schema.
transform
param columnsToFilterIfInvalid Columns to check for invalid values
removeExample
param writables Example
return true if example should be removed, false to keep
removeSequence
param sequence sequence example
return true if the sequence should be removed, false to keep
outputColumnName
The output column name after the operation has been applied
return the output column name
columnName
The output column names. This will often be the same as the input.
return the output column names
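The idea of validity "with respect to the schema" can be sketched without the library. In the example below the schema is reduced to a list of column types and each row is a list of raw strings. `InvalidValueFilterSketch`, `ColumnType`, and the validity rules (integers must parse, doubles must parse to a finite number, any non-null string is valid) are all assumptions made for illustration, not the rules DataVec applies.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: the schema is a list of column types, a row is a list of
// raw strings, and the validity rules are assumptions made for this sketch.
final class InvalidValueFilterSketch {

    enum ColumnType { INTEGER, DOUBLE, STRING }

    private final List<ColumnType> schema;
    private final int[] columnsToCheck;

    InvalidValueFilterSketch(List<ColumnType> schema, int... columnsToCheck) {
        this.schema = schema;
        this.columnsToCheck = columnsToCheck;
    }

    // Assumed rules: integers must parse, doubles must parse to a finite
    // number, and any non-null string is valid.
    static boolean isValid(ColumnType type, String raw) {
        if (raw == null) {
            return false;
        }
        try {
            switch (type) {
                case INTEGER:
                    Integer.parseInt(raw.trim());
                    return true;
                case DOUBLE:
                    return Double.isFinite(Double.parseDouble(raw.trim()));
                default:
                    return true;
            }
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // true means "remove": at least one checked column holds an invalid value.
    boolean removeExample(List<String> row) {
        for (int col : columnsToCheck) {
            if (!isValid(schema.get(col), row.get(col))) {
                return true;
            }
        }
        return false;
    }

    // A sequence is removed if any of its time steps would be removed.
    boolean removeSequence(List<List<String>> sequence) {
        for (List<String> step : sequence) {
            if (removeExample(step)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<ColumnType> schema = Arrays.asList(
                ColumnType.INTEGER, ColumnType.DOUBLE, ColumnType.STRING);
        // Only columns 0 and 1 are checked; column 2 is ignored.
        InvalidValueFilterSketch filter = new InvalidValueFilterSketch(schema, 0, 1);

        System.out.println(filter.removeExample(Arrays.asList("1", "2.5", "a")));  // false
        System.out.println(filter.removeExample(Arrays.asList("x", "2.5", "a")));  // true
        System.out.println(filter.removeExample(Arrays.asList("3", "NaN", "a")));  // true
    }
}
```

Restricting the check to named columns lets a dataset keep rows that are malformed only in columns that later steps never read.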
InvalidNumColumns
Removes records that are invalid because they do not have the expected number of columns.
removeExample
param writables Example
return true if example should be removed, false to keep
removeSequence
param sequence sequence example
return true if the sequence should be removed, false to keep
transform
Get the output schema for this transformation, given an input schema
param inputSchema
outputColumnName
The output column name after the operation has been applied
return the output column name
columnName
The output column names. This will often be the same as the input.
return the output column names
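A size-based filter of this kind comes down to comparing each record's width with an expected column count. The sketch below shows that check in isolation. `ColumnCountFilterSketch` and `keepValid` are hypothetical names, the schema is reduced to a single integer, and the sequence rule (remove when any time step has the wrong width) is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the schema is reduced to an expected column count.
final class ColumnCountFilterSketch {

    private final int expectedColumns;

    ColumnCountFilterSketch(int expectedColumns) {
        this.expectedColumns = expectedColumns;
    }

    // true means "remove": the record does not have the expected width.
    boolean removeExample(List<String> record) {
        return record.size() != expectedColumns;
    }

    // Assumption: a sequence is removed if any time step has the wrong width.
    boolean removeSequence(List<List<String>> sequence) {
        for (List<String> step : sequence) {
            if (removeExample(step)) {
                return true;
            }
        }
        return false;
    }

    // Convenience helper: returns only the records that pass the filter.
    List<List<String>> keepValid(List<List<String>> records) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> record : records) {
            if (!removeExample(record)) {
                kept.add(record);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        ColumnCountFilterSketch filter = new ColumnCountFilterSketch(3);
        List<List<String>> records = Arrays.asList(
                Arrays.asList("1", "a", "x"),
                Arrays.asList("2", "b"),                 // too short
                Arrays.asList("3", "c", "y", "extra"));  // too long

        // Prints [[1, a, x]]
        System.out.println(filter.keepValid(records));
    }
}
```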