# Building from Source

> **Most users should use the releases on Maven Central** and do not need to build from source. Building from source is intended for contributors, those testing unreleased features, or teams maintaining a custom fork.

### Overview

The DL4J stack is a monorepo at [github.com/deeplearning4j/deeplearning4j](https://github.com/deeplearning4j/deeplearning4j). A full source build produces:

* **libnd4j** — native C++ compute engine (compiled with CMake, invoked through Maven via JavaCPP)
* **nd4j** — Java ND4J array library, SameDiff autodiff, backend bindings (CPU, CUDA)
* **deeplearning4j** — neural network layers, LLM/VLM pipelines, PEFT, training infrastructure
* **datavec** — data pipeline and ETL
* **omnihub** — model hub and `AutoModel` loader
* **python4j** — embedded CPython from the JVM

### Prerequisites

#### Required Tools

| Tool      | Minimum Version | Notes                                                                           |
| --------- | --------------- | ------------------------------------------------------------------------------- |
| JDK       | 11              | JDK 17 recommended                                                              |
| Maven     | 3.6.3           | Earlier 3.x releases have known issues with the monorepo                        |
| CMake     | 3.19            | Required to compile libnd4j                                                     |
| gcc / g++ | 9.x             | Or Clang 12+. Must support C++17                                                |
| ccache    | any             | Essential. First build: 30–45 min; rebuilds of small changes with ccache: \~30 seconds |
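
The minimums in the table can be checked before starting a long build. The following is a minimal sketch, not part of the DL4J repository: the function names (`first_version`, `version_ge`, `check_tool`, `check_prerequisites`) are hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS versions do).

```bash
# Pull the first dotted version number out of arbitrary "--version" output.
first_version() {
    grep -Eo '[0-9]+(\.[0-9]+)*' | head -n 1
}

# Return 0 when version $1 is greater than or equal to version $2.
version_ge() {
    # If the minimum sorts first (or both are equal), the requirement is met.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# $1 = label, $2 = minimum version, remaining args = command that prints the version.
check_tool() {
    local label=$1 min=$2 found
    shift 2
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "MISSING  $label (need >= $min)"
        return 1
    fi
    found=$("$@" 2>&1 | first_version)   # java -version writes to stderr
    if version_ge "$found" "$min"; then
        echo "OK       $label $found"
    else
        echo "TOO OLD  $label $found (need >= $min)"
        return 1
    fi
}

# Check every tool from the table; returns non-zero if any check fails.
check_prerequisites() {
    local rc=0
    check_tool JDK    11    java -version   || rc=1
    check_tool Maven  3.6.3 mvn -v          || rc=1
    check_tool CMake  3.19  cmake --version || rc=1
    check_tool g++    9     g++ -dumpversion || rc=1
    check_tool ccache 0     ccache --version || rc=1
    return $rc
}
```

Calling `check_prerequisites` prints one line per tool. The check covers g++ only; a Clang toolchain would need its own `check_tool` line.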

#### Install ccache

ccache caches compiled object files. Without it, every native build recompiles everything from scratch.

```bash
# Ubuntu/Debian
sudo apt-get install ccache

# macOS
brew install ccache

# Fedora/RHEL
sudo dnf install ccache
```

Verify it's working: `ccache --show-stats`

**Never run `ccache -C` or `ccache --clear`** — this forces a full 30–45 minute rebuild.
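
To see how well the cache served a single build without clearing it, the counters can be reset instead. A minimal sketch, assuming a hypothetical wrapper function named `with_ccache_stats`; `--zero-stats` resets only the statistics and leaves cached objects in place.

```bash
# Run a build command and print ccache statistics for just that run.
# Falls back to running the command unchanged when ccache is not installed.
with_ccache_stats() {
    local rc
    if ! command -v ccache >/dev/null 2>&1; then
        echo "ccache not found on PATH; running without statistics" >&2
        "$@"
        return $?
    fi
    ccache --zero-stats >/dev/null   # reset counters only; cached objects are kept
    "$@"
    rc=$?
    ccache --show-stats              # hits and misses for this build
    return $rc
}
```

Prefix any build command with it, for example `with_ccache_stats mvn -Pcpu clean install -DskipTests`. The wrapper returns the exit status of the wrapped command.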

#### Platform-Specific Setup

**Linux (Ubuntu / Debian)**

```bash
sudo apt-get update
sudo apt-get install -y \
    build-essential cmake ccache git maven \
    libgomp1 openjdk-17-jdk
```

**Linux (Fedora / RHEL)**

```bash
sudo dnf groupinstall 'Development Tools'
sudo dnf install cmake ccache maven java-17-openjdk-devel
```

**macOS**

```bash
xcode-select --install
brew update
brew install cmake ccache maven openjdk@17
```

**Windows**

libnd4j requires a Unix-compatible toolchain. Install [MSYS2](https://www.msys2.org/) and run inside the MSYS2 shell:

```bash
pacman -S \
    mingw-w64-x86_64-gcc \
    mingw-w64-x86_64-cmake \
    mingw-w64-x86_64-ccache \
    make pkg-config grep sed gzip tar \
    mingw64/mingw-w64-x86_64-openblas
```

Add `C:\msys64\mingw64\bin` to your system `PATH`.

### Cloning the Repository

```bash
git clone https://github.com/deeplearning4j/deeplearning4j.git
cd deeplearning4j
```

### Building: CPU

The Maven build compiles both the C++ native backend (libnd4j) and the Java modules in a single invocation. JavaCPP invokes CMake internally — you do not need to run CMake or make directly.

```bash
mvn -Pcpu \
  -Dlibnd4j.buildthreads=$(nproc) \
  -pl libnd4j,:nd4j-cpu-backend-common,:nd4j-native \
  clean install -DskipTests
```

This builds libnd4j for CPU and the Java CPU backend bindings. First build takes 30–45 minutes; subsequent builds with ccache take \~30 seconds for small changes.
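
The commands in this guide use `$(nproc)`, which comes from GNU coreutils and is not available on a stock macOS install. A minimal sketch of a portable alternative follows; the function name `native_build_threads` is hypothetical.

```bash
# Print the number of online CPU cores, trying the common tools in turn.
native_build_threads() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                                   # GNU coreutils
    elif command -v sysctl >/dev/null 2>&1 && sysctl -n hw.ncpu >/dev/null 2>&1; then
        sysctl -n hw.ncpu                       # macOS / BSD
    else
        getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
    fi
}
```

With the function defined, `-Dlibnd4j.buildthreads=$(native_build_threads)` can stand in for `-Dlibnd4j.buildthreads=$(nproc)` in any of the build commands.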

After the native backend is built, build the remaining Java modules:

```bash
mvn install -DskipTests -pl nd4j,deeplearning4j,datavec,omnihub,python4j
```

Or build everything at once (slower, compiles all modules):

```bash
mvn -Pcpu \
  -Dlibnd4j.buildthreads=$(nproc) \
  clean install -DskipTests
```

### Building: CUDA

#### CUDA prerequisites

* NVIDIA CUDA Toolkit (default: 12.9, configurable via `-Dcuda.version`)
* Compatible NVIDIA driver (525.60+ for CUDA 12.9)
* On Windows: Visual Studio 2019 or 2022

Install CUDA from [developer.nvidia.com/cuda-downloads](https://developer.nvidia.com/cuda-downloads).

#### Build command

```bash
mvn -Pcuda \
  -Dlibnd4j.chip=cuda \
  -Dlibnd4j.buildthreads=$(nproc) \
  -pl libnd4j,:nd4j-cuda-12.9 \
  clean install -DskipTests
```

To enable Triton JIT compilation (produces the `-compile` classifier variant):

```bash
mvn -Pcuda \
  -Dlibnd4j.chip=cuda \
  -Dlibnd4j.triton=ON \
  -Dlibnd4j.buildthreads=$(nproc) \
  -pl libnd4j,:nd4j-cuda-12.9 \
  clean install -DskipTests
```

#### Building for a different CUDA version

`cuda.version` is a Maven property. To build against CUDA 12.6 instead of 12.9:

```bash
mvn -Pcuda \
  -Dcuda.version=12.6 \
  -Dlibnd4j.chip=cuda \
  -Dlibnd4j.buildthreads=$(nproc) \
  -pl libnd4j,:nd4j-cuda-12.6 \
  clean install -DskipTests
```

Note: the pre-built artifacts on Maven Central target CUDA 12.9. Building against a different version requires a source build.

#### Key build flags

| Flag                       | Description                                                                                                                     |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `-Dlibnd4j.chip=cuda`      | Build the CUDA backend (default is CPU)                                                                                         |
| `-Dlibnd4j.buildthreads=N` | Parallel compiler jobs for native build                                                                                         |
| `-Dlibnd4j.triton=ON`      | Enable Triton MLIR JIT compilation stack                                                                                        |
| `-Dlibnd4j.compute=XX`     | Target specific CUDA compute capability (e.g., `86` for Ampere). **Changing this invalidates the ccache for all CUDA objects.** |
| `-Dcuda.version=X.Y`       | Target a specific CUDA toolkit version (default: 12.9)                                                                          |
| `-Pcpu` / `-Pcuda`         | Maven profile for CPU or CUDA build                                                                                             |

### Building Specific Modules

Maven's `-pl` flag restricts the build to specific submodules. Add `-am` (`--also-make`) to also build the modules in this repository that the selected ones depend on:

```bash
# Build only deeplearning4j-core (with its dependencies)
mvn install -DskipTests -pl deeplearning4j/deeplearning4j-core -am

# Build only the GGML import module
mvn install -DskipTests -pl nd4j/nd4j-ggml -am

# Build only the LLM module
mvn install -DskipTests -pl deeplearning4j/samediff-llm -am
```

### Running Tests

**All tests live in `platform-tests/`.** Tests are never placed in the modules being tested. The individual library modules (`nd4j/`, `deeplearning4j/`, `datavec/`) don't include a concrete backend — `platform-tests` is the only module that declares one, making it the single place where tests can actually execute against real native code.

Always run from the `platform-tests` directory:

```bash
cd platform-tests

# Run a single test class
mvn test -Dtest=MyTestClass

# Run a single test method
mvn test -Dtest=MyTestClass#myTestMethod

# Run a parameterized test method (requires trailing wildcard; quoted so the shell does not expand it)
mvn test -Dtest='MyTestClass#parameterizedMethod*'
```

**Never run `mvn test` from the project root** — it triggers full native rebuilds and runs every test suite.
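
One way to make the directory rule hard to break is a small shell guard. This is a sketch under assumptions: the function name `platform_test` is hypothetical, and it only checks the name of the current directory.

```bash
# Refuse to start Maven tests unless the current directory is platform-tests.
platform_test() {
    if [ "$(basename "$PWD")" != "platform-tests" ]; then
        echo "error: run tests from platform-tests/, not $PWD" >&2
        return 1
    fi
    # The selector is passed as one quoted argument, so a trailing '*'
    # reaches Maven unchanged instead of being expanded by the shell.
    mvn test "-Dtest=$1"
}
```

Usage: `platform_test 'MyTestClass#myTestMethod'`. From the repository root the function exits with an error before Maven starts.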

#### Selecting a backend

The `backend.artifactId` Maven property controls which backend tests run against:

```bash
# CPU (default)
mvn test -Dtest=MyTestClass

# CUDA
mvn test -Dtest=MyTestClass -Dbackend.artifactId=nd4j-cuda-12.9

# CPU with specific native classifier
mvn test -Dtest=MyTestClass -Dplatform.classifier=linux-x86_64-onednn-avx2
```

#### Memory configuration

The default heap is `32g` — override for machines with less RAM:

```bash
mvn test -Dtest=MyTestClass -Dtest.heap.size=6g -Dtest.offheap.size=6g
```
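
If you are unsure what to pass, the values can be derived from installed memory. The sketch below uses a quarter of physical RAM for each property, clamped between 2 GB and the 32 GB default. That split is an assumption of this example, not a project recommendation, and the function name `suggest_test_heap_gb` is hypothetical.

```bash
# $1 = total physical memory in kB; prints a size in whole GB.
suggest_test_heap_gb() {
    local gb=$(( $1 / 1024 / 1024 / 4 ))
    if [ "$gb" -lt 2 ]; then gb=2; fi
    if [ "$gb" -gt 32 ]; then gb=32; fi
    echo "$gb"
}

# Linux only: MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    heap=$(suggest_test_heap_gb "$total_kb")
    echo "mvn test -Dtest=MyTestClass -Dtest.heap.size=${heap}g -Dtest.offheap.size=${heap}g"
fi
```

The script prints the command instead of running it, so the sizes can be adjusted by hand before use.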

#### Selecting tests by tag

Tests are tagged with JUnit 5 tags (`samediff`, `onnx`, `training`, `long-running-test`, etc.):

```bash
# Run only SameDiff tests
mvn test -Dtests=samediff

# Exclude long-running tests
mvn test -DexcludedTests="long-running-test,large-resources"
```

#### Domain test scripts

`platform-tests/` includes convenience scripts:

| Script                    | What it runs                                                      |
| ------------------------- | ----------------------------------------------------------------- |
| `run-onnx-tests.sh`       | ONNX SameDiff import tests                                        |
| `run-tensorflow-tests.sh` | TensorFlow SameDiff import tests                                  |
| `run-keras-tests.sh`      | Keras model import tests                                          |
| `run-benchmarks.sh`       | Standalone JUnit launcher with valgrind/compute-sanitizer support |
| `bootstrap-onnx.sh`       | Downloads \~65 ONNX Zoo models (data setup, not a test runner)    |

For a comprehensive guide to the platform-tests architecture — backend selection internals, JUnit 5 extensions, base test classes, test tags, memory profiling, and the shaded-JAR benchmark workflow — see [Contributing — Platform Tests](https://github.com/KonduitAI/deeplearning4j-docs/blob/en-1.0.0-rewrite/docs/m2.1/deeplearning4j/contributing/README.md#platform-tests).

### IDE Setup: IntelliJ IDEA

IntelliJ IDEA is the recommended IDE.

1. **Open** the repo as a Maven project (File > Open, select `pom.xml` at the repo root).
2. **Install the Lombok plugin**: Settings > Plugins > search "Lombok" > Install. Without it, the IDE shows false errors throughout the codebase.
3. **Enable annotation processing**: Settings > Build, Execution, Deployment > Compiler > Annotation Processors > check "Enable annotation processing".
4. **Enable Maven profiles** as needed (`cpu` or `cuda`) under Profiles in the Maven tool window.
5. **Increase IntelliJ's JVM heap**: Help > Change Memory Settings > 4096 MB or higher.

### Using Local Build Artifacts

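After a successful build, `mvn install` has placed the artifacts in your local Maven repository (`~/.m2/repository` by default), so a downstream project on the same machine can depend on the local snapshot: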

```xml
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>1.0.0-SNAPSHOT</version>
</dependency>
```

Check the current snapshot version in the [root POM](https://github.com/deeplearning4j/deeplearning4j/blob/master/pom.xml).
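
To confirm that a build actually installed an artifact, look for it in the local repository. A minimal sketch, assuming the default repository location; `local_artifact_dir` and the `M2_REPO` override variable are hypothetical names used only here, and a custom `<localRepository>` in `settings.xml` would change the prefix.

```bash
# Map Maven coordinates (groupId, artifactId, version) to their directory
# in the local repository: dots in the groupId become path separators.
local_artifact_dir() {
    local group_path
    group_path=$(printf '%s' "$1" | tr '.' '/')
    echo "${M2_REPO:-$HOME/.m2/repository}/$group_path/$2/$3"
}

dir=$(local_artifact_dir org.deeplearning4j deeplearning4j-core 1.0.0-SNAPSHOT)
if [ -d "$dir" ]; then
    ls "$dir"                       # JARs and POMs installed by the build
else
    echo "not installed yet: $dir"
fi
```

Replace `1.0.0-SNAPSHOT` with the version from the root POM if it differs.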

### CI Build Artifacts

The project CI builds native artifacts for every platform and configuration. Understanding the classifier system helps when selecting the right JAR for your deployment target.

#### Artifact classifier format

Pre-built artifacts use classifiers to encode platform, helper library, and SIMD extension:

```
<artifactId>-<version>-<os>-<arch>[-<helper>][-<extension>].jar
```

For example:

* `nd4j-native-1.0.0-SNAPSHOT-linux-x86_64.jar` — base CPU
* `nd4j-native-1.0.0-SNAPSHOT-linux-x86_64-onednn-avx2.jar` — oneDNN + AVX2
* `nd4j-native-1.0.0-SNAPSHOT-linux-x86_64-compile.jar` — MLIR/Triton compile stack
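
The naming scheme can be expressed as a small helper that assembles a classifier and JAR name from its parts. This is an illustrative sketch; `nd4j_classifier` and `nd4j_jar_name` are hypothetical function names, not tools shipped with the project.

```bash
# $1 = os, $2 = arch, $3 = optional helper, $4 = optional extension
nd4j_classifier() {
    local classifier="$1-$2"
    if [ -n "${3:-}" ]; then classifier="$classifier-$3"; fi
    if [ -n "${4:-}" ]; then classifier="$classifier-$4"; fi
    echo "$classifier"
}

# $1 = artifactId, $2 = version, remaining args as for nd4j_classifier
nd4j_jar_name() {
    local artifact=$1 version=$2
    shift 2
    echo "$artifact-$version-$(nd4j_classifier "$@").jar"
}

nd4j_jar_name nd4j-native 1.0.0-SNAPSHOT linux x86_64 onednn avx2
# prints: nd4j-native-1.0.0-SNAPSHOT-linux-x86_64-onednn-avx2.jar
```

The same classifier string is what `-Dplatform.classifier` expects when selecting a native variant for tests.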

#### Supported build matrix

| Platform                | Helpers available | Extensions   |
| ----------------------- | ----------------- | ------------ |
| Linux x86\_64           | onednn, compile   | avx2, avx512 |
| Linux ARM64             | —                 | —            |
| macOS ARM64             | —                 | —            |
| Windows x86\_64         | onednn            | avx2, avx512 |
| Android ARM64           | —                 | —            |
| Android x86\_64         | —                 | —            |
| Linux x86\_64 CUDA 12.6 | cudnn, compile    | —            |
| Linux x86\_64 CUDA 12.9 | cudnn, compile    | —            |

#### Building a specific classifier locally

To build with a helper or extension, pass the corresponding Maven properties:

```bash
# CPU with oneDNN helper and AVX2
mvn -Pcpu \
  -Dlibnd4j.helper=onednn \
  -Dlibnd4j.extension=avx2 \
  -Dlibnd4j.buildthreads=$(nproc) \
  -pl libnd4j,:nd4j-cpu-backend-common,:nd4j-native \
  clean install -DskipTests

# CUDA with cuDNN helper
mvn -Pcuda \
  -Dlibnd4j.chip=cuda \
  -Dlibnd4j.helper=cudnn \
  -Dlibnd4j.buildthreads=$(nproc) \
  -pl libnd4j,:nd4j-cuda-12.9 \
  clean install -DskipTests
```

For full CI/CD infrastructure details (build matrix, test suites, snapshot deployment), see [Contributing — CI/CD Build Environment](https://github.com/KonduitAI/deeplearning4j-docs/blob/en-1.0.0-rewrite/docs/m2.1/deeplearning4j/contributing/README.md#cicd-build-environment).

### Common Build Issues

#### Native build takes forever (no ccache)

**Symptom:** Every build takes 30–45 minutes even for small changes.

**Fix:** Install ccache and verify it's on your `PATH`. Check `ccache --show-stats` — the hit rate should be high after the first build.

#### CMake version too old

**Symptom:** CMake policy errors or missing CMake features.

**Fix:** Install CMake 3.19+. On older systems: `pip install cmake` or download from [cmake.org](https://cmake.org/download/).

#### Out of memory during Maven build

**Symptom:** `java.lang.OutOfMemoryError` or `GC overhead limit exceeded`.

**Fix:** Raise the heap and metaspace limits for the Maven JVM before building:

```bash
export MAVEN_OPTS="-Xmx4g -XX:MaxMetaspaceSize=1g"
```

#### CUDA compute capability mismatch

**Symptom:** `no kernel image is available for execution on the device`

**Fix:** Check your GPU's compute capability and pass it explicitly:

```bash
nvidia-smi --query-gpu=compute_cap --format=csv,noheader
# Then build with: -Dlibnd4j.compute=XX
```
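
`nvidia-smi` reports the capability as `major.minor` (for example `8.6`), while the build flag takes the digits without the dot (`86`). A minimal sketch of the conversion, with a hypothetical function name `compute_flag`:

```bash
# $1 = output of the nvidia-smi query above; prints the matching Maven flag.
compute_flag() {
    local cap
    # Only the first GPU is considered; mixed-GPU machines need a manual choice.
    cap=$(printf '%s\n' "$1" | head -n 1 | tr -d '. ')
    echo "-Dlibnd4j.compute=$cap"
}

if command -v nvidia-smi >/dev/null 2>&1; then
    compute_flag "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader)"
fi
```

Remember that changing the compute capability invalidates the ccache entries for all CUDA objects, so expect a full native rebuild afterwards.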

#### Windows: DLL not found at runtime

**Symptom:** `Can't find dependent libraries` when running from an IDE.

**Fix:** Add `C:\msys64\mingw64\bin` to your system `PATH` and restart the IDE.

### Related Pages

* [Contributing](https://github.com/KonduitAI/deeplearning4j-docs/blob/en-1.0.0-rewrite/docs/m2.1/deeplearning4j/contributing/README.md) — pull request workflow, coding conventions, Eclipse CLA
* [Maven Setup](/en-1.0.0-rewrite/configuration/maven.md) — using pre-built Maven Central artifacts
* [GPU and CPU Setup](/en-1.0.0-rewrite/configuration/gpu-cpu.md) — configuring backends for end users