ND4J is a scientific computing library for the JVM. It is meant to be used in production environments rather than as a research tool, which means routines are designed to run fast with minimum RAM requirements. The main features are:
A versatile n-dimensional array object.
Linear algebra and signal processing functions.
Multiplatform functionality including GPUs.
All major operating systems: Windows, Linux, macOS, Android.
Architectures: x86, ARM, PPC.
This quickstart follows the same layout and approach as the NumPy quickstart. This should help people familiar with Python and NumPy get started quickly with ND4J.
Prerequisites
You can use ND4J from any JVM language (for example, Scala or Kotlin) and with any build tool. The sample code in this quickstart uses the following:
To improve readability we show the output of System.out.println(...) but omit the print statement itself from the sample code. If you are confident you know how to use Maven and Git, feel free to skip to the Basics. In the remainder of this section we build a small 'hello ND4J' application to verify that the prerequisites are set up correctly.
Execute the following commands to get the project from GitHub:
git clone https://github.com/RobAltena/HelloNd4J.git
cd HelloNd4J
mvn install
mvn exec:java -Dexec.mainClass="HelloNd4j"
When everything is set up correctly you should see the following output:
[ 0, 0]
Basics
The main feature of ND4J is the versatile n-dimensional array interface called INDArray. To improve performance, ND4J stores its data in off-heap memory. An INDArray is therefore different from a standard Java array.
Some of the key properties and methods for an INDArray x are as follows:
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray x = Nd4j.zeros(3, 4);

// The number of axes (dimensions) of the array.
int dimensions = x.rank();

// The dimensions of the array. The size in each dimension.
long[] shape = x.shape();

// The total number of elements.
long length = x.length();

// The type of the array elements.
DataType dt = x.dataType();
Array Creation
To create INDArrays you use the static factory methods of the Nd4j class.
The Nd4j.createFromArray function is overloaded to make it easy to create INDArrays from regular Java arrays. The example below uses Java double arrays. Similar create methods are overloaded for float, int and long. The Nd4j.createFromArray function has overloads up to 4d for all types.
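A minimal sketch of creating INDArrays from Java double arrays, assuming ND4J is on the classpath. The class name CreateFromArraySketch, its helper methods, and the sample values are illustrative and not part of the original quickstart.

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// Hypothetical example class; only Nd4j.createFromArray comes from ND4J.
class CreateFromArraySketch {

    // 1D: a Java double[] becomes a rank-1 INDArray.
    static INDArray vector() {
        double[] values = {1.0, 2.0, 3.0};
        return Nd4j.createFromArray(values);
    }

    // 2D: a Java double[][] becomes a rank-2 INDArray (2 rows, 3 columns).
    static INDArray matrix() {
        double[][] values = {
            {1.0, 2.0, 3.0},
            {4.0, 5.0, 6.0}
        };
        return Nd4j.createFromArray(values);
    }

    public static void main(String[] args) {
        System.out.println(vector());
        System.out.println(matrix());
        // Element at row 1, column 2 of the matrix.
        System.out.println(matrix().getDouble(1, 2));
    }
}
```

The same pattern should apply to float, int and long arrays through the corresponding overloads mentioned above.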
Nd4j can create arrays initialized with zeros and ones using the functions zeros and ones. The rand function allows you to create an array initialized with random values. The default datatype of the INDArray created is float. Some overloads allow you to set the datatype.
INDArray x = Nd4j.zeros(5);
// [ 0, 0, 0, 0, 0], FLOAT

x = Nd4j.zeros(DataType.DOUBLE, 5);
// [ 0, 0, 0, 0, 0], DOUBLE

// For higher dimensions you can provide a shape array. 2D random matrix example:
int rows = 4;
int cols = 5;
int[] shape = {rows, cols};
INDArray y = Nd4j.rand(shape);
Use the arange functions to create an array of evenly spaced values:
INDArray x = Nd4j.arange(5);
// [ 0, 1.0000, 2.0000, 3.0000, 4.0000]

x = Nd4j.arange(2, 7);
// [ 2.0000, 3.0000, 4.0000, 5.0000, 6.0000]
The linspace function allows you to specify the number of points generated:
INDArray x = Nd4j.linspace(1, 10, 5); // start, stop, count.
// [ 1.0000, 3.2500, 5.5000, 7.7500, 10.0000]

// Evaluate a function over many points.
import static org.nd4j.linalg.ops.transforms.Transforms.sin;
x = Nd4j.linspace(0.0, Math.PI, 100, DataType.DOUBLE);
INDArray y = sin(x);
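To make the spacing concrete, the sketch below reproduces the arithmetic behind the linspace(1, 10, 5) output in plain Java. It is an illustration only and not ND4J's implementation; LinspaceSketch and its linspace method are hypothetical names.

```java
import java.util.Arrays;

// Plain-Java illustration of evenly spaced points; not ND4J code.
class LinspaceSketch {

    // Returns `count` evenly spaced values from start to stop, both inclusive.
    static double[] linspace(double start, double stop, int count) {
        if (count < 2) {
            throw new IllegalArgumentException("count must be at least 2");
        }
        // count points leave (count - 1) equal gaps between start and stop.
        double step = (stop - start) / (count - 1);
        double[] points = new double[count];
        for (int i = 0; i < count; i++) {
            points[i] = start + i * step;
        }
        return points;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(linspace(1, 10, 5)));
        // prints [1.0, 3.25, 5.5, 7.75, 10.0]
    }
}
```

With five points between 1 and 10 there are four gaps, so the step is 9 / 4 = 2.25, which matches the values shown above.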
Printing Arrays
The INDArray supports Java's toString() method. The current implementation has limited precision and a limited number of elements. The output is similar to printing NumPy arrays:
You will have to use INDArray methods to perform operations on your arrays. Each operation comes in a copy variant and an in-place variant, with overloads for scalar and element-wise arguments. The in-place operators return a reference to the array so you can conveniently chain operations together. Use in-place operators where possible to improve performance, because copy operators carry the overhead of creating a new array.
// Copy
arr_new = arr.add(scalar);     // return a new array with scalar added to each element of arr.
arr_new = arr.add(other_arr);  // return a new array with element-wise addition of arr and other_arr.

// In place
arr_new = arr.addi(scalar);    // Heads up: arr_new points to the same array as arr.
arr_new = arr.addi(other_arr);
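The difference between the copy and in-place variants can be sketched as a small runnable program, assuming ND4J is on the classpath. The class name InPlaceVersusCopySketch, the sample() helper, and the values are illustrative.

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// Hypothetical example class contrasting add (copy) with addi (in place).
class InPlaceVersusCopySketch {

    static INDArray sample() {
        return Nd4j.createFromArray(new double[] {1.0, 2.0, 3.0});
    }

    public static void main(String[] args) {
        INDArray arr = sample();

        // Copy: a new array is returned and arr keeps its original values.
        INDArray copy = arr.add(10.0);
        System.out.println(copy != arr);       // the result is a different object
        System.out.println(arr.getDouble(0));  // first element of arr is still 1.0

        // In place: arr itself is modified and the same reference is returned.
        INDArray same = arr.addi(10.0);
        System.out.println(same == arr);       // the result is arr itself
        System.out.println(arr.getDouble(0));  // first element of arr is now 11.0

        // Because in-place operators return the array, calls can be chained.
        arr.addi(1.0).muli(2.0);
        System.out.println(arr);
    }
}
```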
When you perform the basic operations you must make sure the underlying data types are the same.
int[] shape = {5};
INDArray x = Nd4j.zeros(shape, DataType.DOUBLE);
INDArray x2 = Nd4j.zeros(shape, DataType.INT);
INDArray x3 = x.add(x2);
// java.lang.IllegalArgumentException: Op.X and Op.Y must have the same data type, but got INT vs DOUBLE

// Casting x2 to DOUBLE solves the problem:
x3 = x.add(x2.castTo(DataType.DOUBLE));
The INDArray has methods implementing reduction/accumulation operations such as sum, min, max.
int[] shape = {2, 3};
INDArray x = Nd4j.rand(shape);
x;
x.sum();
x.min();
x.max();
/*
[[ 0.8621, 0.9224, 0.8407],
 [ 0.1504, 0.5489, 0.9584]]
4.2830
0.1504
0.9584
*/
Provide a dimension argument to apply the operation across the specified dimension:
INDArray x = Nd4j.arange(12).reshape(3, 4);
/*
[[ 0, 1.0000, 2.0000, 3.0000],
 [ 4.0000, 5.0000, 6.0000, 7.0000],
 [ 8.0000, 9.0000, 10.0000, 11.0000]]
*/

x.sum(0); // Sum of each column.
// [ 12.0000, 15.0000, 18.0000, 21.0000]

x.min(1); // Min of each row.
// [ 0, 4.0000, 8.0000]

x.cumsum(1); // Cumulative sum across each row.
/*
[[ 0, 1.0000, 3.0000, 6.0000],
 [ 4.0000, 9.0000, 15.0000, 22.0000],
 [ 8.0000, 17.0000, 27.0000, 38.0000]]
*/
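To show what the dimension argument means, the sketch below spells out the same three reductions as explicit loops over a plain Java double[][]. It is a conceptual illustration, not how ND4J computes them; AxisReductionSketch and its method names are hypothetical.

```java
import java.util.Arrays;

// Plain-Java illustration of reducing along a dimension; not ND4J code.
class AxisReductionSketch {

    // Sum along dimension 0: collapse the rows, leaving one value per column.
    static double[] sumAlongDim0(double[][] m) {
        double[] out = new double[m[0].length];
        for (double[] row : m) {
            for (int j = 0; j < row.length; j++) {
                out[j] += row[j];
            }
        }
        return out;
    }

    // Min along dimension 1: collapse the columns, leaving one value per row.
    static double[] minAlongDim1(double[][] m) {
        double[] out = new double[m.length];
        for (int i = 0; i < m.length; i++) {
            out[i] = Arrays.stream(m[i]).min().getAsDouble();
        }
        return out;
    }

    // Cumulative sum along dimension 1: a running total within each row.
    static double[][] cumsumAlongDim1(double[][] m) {
        double[][] out = new double[m.length][];
        for (int i = 0; i < m.length; i++) {
            out[i] = new double[m[i].length];
            double running = 0.0;
            for (int j = 0; j < m[i].length; j++) {
                running += m[i][j];
                out[i][j] = running;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] x = {{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}};
        System.out.println(Arrays.toString(sumAlongDim0(x)));
        // prints [12.0, 15.0, 18.0, 21.0]
        System.out.println(Arrays.toString(minAlongDim1(x)));
        // prints [0.0, 4.0, 8.0]
        System.out.println(Arrays.deepToString(cumsumAlongDim1(x)));
        // prints [[0.0, 1.0, 3.0, 6.0], [4.0, 9.0, 15.0, 22.0], [8.0, 17.0, 27.0, 38.0]]
    }
}
```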
Transform operation
Nd4j provides familiar mathematical functions such as sin, cos, and exp. These are called transform operations. The result is returned as an INDArray.
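A minimal sketch of applying transform operations, assuming ND4J is on the classpath and using the static methods of the Transforms class that the linspace example imports from. The class name TransformSketch and the sample values are illustrative.

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.ops.transforms.Transforms;

// Hypothetical example class applying element-wise transforms.
class TransformSketch {

    static INDArray sample() {
        return Nd4j.createFromArray(new double[] {0.0, 1.0, 2.0});
    }

    public static void main(String[] args) {
        INDArray x = sample();

        // Each call applies the function to every element and returns an INDArray.
        INDArray expX = Transforms.exp(x);
        INDArray sinX = Transforms.sin(x);
        INDArray cosX = Transforms.cos(x);

        System.out.println(expX);
        System.out.println(sinX);
        System.out.println(cosX);

        // With these single-argument overloads, x should keep its original values.
        System.out.println(x);
    }
}
```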
Indexing, slicing and iterating are harder in Java than in Python. To retrieve individual values from an INDArray you can use the getDouble, getFloat or getInt methods. INDArrays cannot be indexed like Java arrays. You can get a Java array from an INDArray using toDoubleVector(), toDoubleMatrix(), toFloatVector() and toFloatMatrix().
INDArray x = Nd4j.arange(12);
// [ 0, 1.0000, 2.0000, 3.0000, 4.0000, 5.0000, 6.0000, 7.0000, 8.0000, 9.0000, 10.0000, 11.0000]

float f = x.getFloat(3); // Single element access. Other methods: getDouble, getInt, ...
// 3.0

float[] fArr = x.toFloatVector(); // Convert to Java array.
// [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

INDArray x2 = x.get(NDArrayIndex.interval(2, 6));
// [ 2.0000, 3.0000, 4.0000, 5.0000]

// On a copy of x: From start to position 6, exclusive, set every 2nd element to -1.0
INDArray y = x.dup();
y.get(NDArrayIndex.interval(0, 2, 6)).assign(-1.0);
// [ -1.0000, 1.0000, -1.0000, 3.0000, -1.0000, 5.0000, 6.0000, 7.0000, 8.0000, 9.0000, 10.0000, 11.0000]

// Reversed copy of y.
INDArray y2 = Nd4j.reverse(y.dup());
// [ 11.0000, 10.0000, 9.0000, 8.0000, 7.0000, 6.0000, 5.0000, -1.0000, 3.0000, -1.0000, 1.0000, -1.0000]
For multidimensional arrays you should use INDArray.get(NDArrayIndex...). The example below shows how to iterate over the rows and columns of a 2D array. Note that for 2D arrays we could have used the getColumn and getRow convenience methods.
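One way to iterate over the rows and columns of a 2D array with INDArray.get(NDArrayIndex...) is sketched here, assuming ND4J is on the classpath. The class name IterationSketch and its helper methods are illustrative and not part of the original quickstart.

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.indexing.NDArrayIndex;

// Hypothetical example class iterating over a 3x4 array.
class IterationSketch {

    static INDArray sample() {
        return Nd4j.arange(12).reshape(3, 4);
    }

    // Row i: fix the first index, take everything along the second.
    static INDArray row(INDArray x, int i) {
        return x.get(NDArrayIndex.point(i), NDArrayIndex.all());
    }

    // Column j: take everything along the first index, fix the second.
    static INDArray column(INDArray x, int j) {
        return x.get(NDArrayIndex.all(), NDArrayIndex.point(j));
    }

    public static void main(String[] args) {
        INDArray x = sample();

        for (int i = 0; i < x.rows(); i++) {
            System.out.println(row(x, i));
        }
        for (int j = 0; j < x.columns(); j++) {
            System.out.println(column(x, j));
        }
    }
}
```

For a 2D array, x.getRow(i) and x.getColumn(j) are the shorter convenience forms of the two helpers.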
When working with INDArrays the data is not always copied. Here are three cases you should be aware of.
No Copy at All
Simple assignments make no copy of the data. Java variables hold references to objects, and a method call passes that reference by value, so no copy of the array is made on a method call either.
INDArray x = Nd4j.rand(2, 2);
INDArray y = x; // y and x point to the same INDArray object.

public static void f(INDArray x) {
    // No copy is made. Any changes to x are visible after the function call.
}
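The same reference behaviour can be observed with plain Java arrays, which makes it easy to try without ND4J. The sketch below is an analogy only: ReferenceSketch and its methods are hypothetical names, and a double[] stands in for an INDArray.

```java
import java.util.Arrays;

// Plain-Java analogy for "no copy at all"; a double[] stands in for an INDArray.
class ReferenceSketch {

    // Receives a reference to the caller's array, so the write is visible to the caller.
    static void setFirstToNinetyNine(double[] data) {
        data[0] = 99.0;
    }

    static double[] demo() {
        double[] x = {1.0, 2.0};
        double[] y = x;            // No copy: y and x refer to the same array.
        y[1] = -2.0;               // Changing y also changes x.
        setFirstToNinetyNine(x);   // No copy on the method call either.
        return x;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
        // prints [99.0, -2.0]
    }
}
```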