Instructions to build all DL4J libraries from source.
A reference for building DL4J from source can be found for every platform in our workflows. For maintenance reasons, we prefer to keep one canonical, up-to-date source of build information for users rather than out-of-date install instructions in this guide. This guide contains specific, long-lived tips on how to interpret the workflows and what to consider when building.
For an overview of the GitHub Actions workflows, see the overview doc.
This document covers the specific components of the build by platform rather than stepping through what is already in the workflows. If you have suggestions for improving this document, please comment over at the community forums.
Core steps:
Building libnd4j for your specific platform
Linking the nd4j backend you want to compile for against libnd4j via JavaCPP
Compiling the rest of the code into JAR files
Libnd4j is a CMake-based C++ project that supports running optimized math code on different architectures. Its sole focus is being a tiny, self-contained library for running math kernels. It can link against optimized BLAS routines and platform-specific CNN libraries such as OneDNN and CuDNN, and it contains hundreds of math kernels for implementing neural networks and other math routines.
Maven: Maven is the core build tool for deeplearning4j. Understanding Maven is key to building deeplearning4j from source.
Maven and CMake: To compile libnd4j, we invoke the buildnativeoperations.sh wrapper script via Maven. buildnativeoperations.sh in turn automatically sets up CMake to build the C++ project.
pi_build.sh: This is our build script for embedded and ARM-based platforms. It focuses on cross-compilation from a Linux x86-based host.
buildnativeoperations.sh: The main build script for libnd4j. It initializes CMake and invokes CMake compilation for whatever platform the user is currently on, unless the user specifies an alternative platform. For example, it is possible to specify Android as the target platform.
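As a rough illustration of the wrapper script described above, the following dry-run sketch only prints the command line it would run rather than executing anything. The `--chip`, `--build-type` and `--platform` options, and the `android-arm64` value, are assumptions based on how the script is commonly invoked; check the option parsing in buildnativeoperations.sh in your checkout before relying on them. The helper name `libnd4j_build_cmd` is hypothetical.

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints a buildnativeoperations.sh command line, does not run it.
# Option names (--chip, --build-type, --platform) are assumptions; verify them
# against the script in your checkout.
libnd4j_build_cmd() {
  local chip="${1:-cpu}"     # cpu or cuda
  local platform="${2:-}"    # empty means "build for the current host"
  local cmd="./buildnativeoperations.sh --chip ${chip} --build-type release"
  if [ -n "${platform}" ]; then
    cmd="${cmd} --platform ${platform}"
  fi
  echo "${cmd}"
}

# Host build:
libnd4j_build_cmd cpu
# Build for an alternative platform (assumed value for an Android target):
libnd4j_build_cmd cpu android-arm64
```

Printing the command first makes it easy to compare against the workflow for your platform before running anything.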
The main considerations for building on x86_64 are:
Whether to compile for avx2 or avx512
Whether to use OpenBLAS or MKL
Whether to link against OneDNN
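To make these choices concrete, here is a dry-run sketch that assembles a Maven command line for an x86_64 CPU build and prints it without running it. The property names (`libnd4j.extension`, `libnd4j.helper`, `javacpp.platform`, `javacpp.platform.extension`) are assumptions modeled on the CI workflows, and the helper name `x86_mvn_cmd` is hypothetical; the workflows remain the authoritative reference. The OpenBLAS versus MKL choice is not covered here.

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints a Maven command for an x86_64 CPU build.
# All -D property names below are assumptions modeled on the CI workflows.
x86_mvn_cmd() {
  local extension="${1:-}"   # "", avx2 or avx512
  local helper="${2:-}"      # "" or onednn
  local suffix=""
  local cmd="mvn clean install -DskipTests -Djavacpp.platform=linux-x86_64"
  if [ -n "${helper}" ]; then
    cmd="${cmd} -Dlibnd4j.helper=${helper}"
    suffix="${suffix}-${helper}"
  fi
  if [ -n "${extension}" ]; then
    cmd="${cmd} -Dlibnd4j.extension=${extension}"
    suffix="${suffix}-${extension}"
  fi
  # The platform extension is only added when a helper or extension was chosen.
  if [ -n "${suffix}" ]; then
    cmd="${cmd} -Djavacpp.platform.extension=${suffix}"
  fi
  echo "${cmd}"
}

# AVX2 build linked against OneDNN:
x86_mvn_cmd avx2 onednn
```

Calling `x86_mvn_cmd` with no arguments prints the plain command without any extension or helper properties.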
From there, the usual platform-specific libraries should be installed beforehand. Up-to-date install instructions can be found in our CPU builds for Windows, Mac and Linux.
ARM-based builds all link against the armcompute library by default and, as mentioned above, use the pi_build.sh script for building libnd4j on specific platforms. Note that pi_build.sh can also be used to compile all of DL4J for a specific project.
pi_build.sh mainly focuses on cross-compilation.
To use the pi_build.sh script properly, a number of environment variables should be set. For each platform, you can find these environment variables under the environment section of the final build step in the corresponding workflow.
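As a sketch of what such an environment might look like, the helper below prints export lines that could be evaluated before running pi_build.sh. The variable names (`TARGET_OS`, `CURRENT_TARGET`, `BUILD_USING_MAVEN`) and their values are assumptions for illustration, and `pi_build_env` is a hypothetical helper; copy the real names and values from the environment section of the workflow for your platform.

```bash
#!/usr/bin/env bash
# Sketch: prints export statements for a cross-compilation run of pi_build.sh.
# Variable names and values are assumptions; take the real ones from the
# environment section of the workflow for your target platform.
pi_build_env() {
  local target_os="${1:-linux}"     # e.g. linux or android
  local target_arch="${2:-arm64}"   # e.g. arm32 or arm64
  echo "export TARGET_OS=${target_os}"
  echo "export CURRENT_TARGET=${target_arch}"
  echo "export BUILD_USING_MAVEN=1"
}

# Print the settings for review:
pi_build_env linux arm64
# To apply them in the current shell before invoking the script:
#   eval "$(pi_build_env linux arm64)"
```

Keeping the settings in one function makes it easier to diff them against the workflow when the workflow changes.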
If you would like to compile deeplearning4j on an actual ARM device, please use the normal buildnativeoperations.sh workflow.
To compile deeplearning4j for a particular CUDA version, you must first invoke change-cuda-versions.sh in the root directory.
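A minimal sketch of that invocation follows. It assumes the script takes the target CUDA version as its single argument, and uses 11.2 purely as an example value; `cuda_version_cmd` is a hypothetical helper that prints the command rather than running it.

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints the change-cuda-versions.sh invocation for a version.
# Assumes the script takes the CUDA version (for example 11.2) as its only argument.
cuda_version_cmd() {
  local version="${1:-}"
  case "${version}" in
    # Loose sanity check only: a digit, then a dot followed by a digit somewhere after it.
    [0-9]*.[0-9]*)
      echo "./change-cuda-versions.sh ${version}"
      ;;
    *)
      echo "expected a CUDA version such as 11.2, got: '${version}'" >&2
      return 1
      ;;
  esac
}

# Run the printed command from the repository root:
cuda_version_cmd 11.2
```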
This will ensure that all library versions are set to the appropriate version. Ensure that the CUDA toolkit you need is installed. If you intend to use CuDNN, ensure that it is also installed correctly. If you intend to automate CUDA installation, consider using our install scripts as a reference.
Jetson Nano users: please see this thread for successfully compiling deeplearning4j on Jetson Nano.
In short: it relies on CUDA 10.0. The JavaCPP presets for CUDA are also only compiled for arm64 for CUDA 10.0. You can find the supported CUDA versions for CUDA 10.0 here. If you would like something more up to date, please feel free to contact us over at our forums. As of 1.0.0-M1.1 you can also use updated dependencies:
We use MSYS2 for compiling libnd4j. CUDA requires MSVC to be installed in order to compile CUDA kernels properly. If you want to compile libnd4j for CUDA from source, please ensure you first invoke the vcvars.bat script in a cmd terminal, then launch MSYS2 manually. For more specifics, please see our Windows CUDA 11 and 11.2 build files.