# DataVec

- [Overview](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/overview.md): DataVec ETL framework — purpose, architecture, and the data pipeline from raw data to DataSet
- [Schema](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/schema.md): Defining data schemas — Schema, ColumnType, SequenceSchema, and schema inference
- [Record Readers](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/readers.md): RecordReader implementations — CSV, JSON, image, regex, line, and custom readers
- [Transforms](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/transforms.md): TransformProcess — data transformations, column operations, type conversions, and sequences
- [Conditions](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/conditions.md): Conditional operations in DataVec — filtering and transforming data based on conditions
- [Filters](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/filters.md): Data filtering in DataVec — removing records based on conditions
- [Normalization](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/normalization.md): Data normalization in DataVec — normalizer implementations and serialization
- [Operations](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/operations.md): DataVec operations — calculators, reducers, and data analysis tools
- [Reductions](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/reductions.md): DataVec reduction operations — aggregating, grouping, and summarizing records and sequences
- [Analysis](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/analysis.md): DataVec data analysis tools — profiling datasets, detecting quality issues, and computing statistics locally and on Spark
- [Serialization](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/serialization.md): Data serialization in DataVec — saving and loading schemas, transform processes, and normalized data
- [Executors](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/executors.md): Running DataVec transform processes — LocalTransformExecutor and SparkTransformExecutor
- [Records and Writables](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/records.md): DataVec record format — Writable types, Record, and the data representation layer
- [Image Data](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/image.md): Loading and preprocessing image data — ImageRecordReader, NativeImageLoader, and image transforms
- [Apache Arrow](https://deeplearning4j.konduit.ai/en-1.0.0-beta7/datavec/arrow.md): Apache Arrow integration in DataVec — ArrowRecordReader and zero-copy data exchange

