
# Operations

Operations in SameDiff consume one or more `SDVariable` inputs and produce a new `SDVariable` output of type `ARRAY`. They are the edges of the computation graph that connect variable nodes together. The total number of available operations, including overloads, runs into the hundreds — from simple elementwise addition to full LSTM layers.

This page gives an overview of where to find operations, how to use them, and what rules to keep in mind.

## Common Properties

Before looking at individual namespaces, there are a few properties that apply to **all** SameDiff operations:

* **Any variable type is valid as input**, as long as the data types match what the operation requires. Most numeric operations require floating-point inputs.
* **All operation outputs are `ARRAY`-type variables**. Operations never return `VARIABLE`, `CONSTANT`, or `PLACEHOLDER`.
* **Variables used in a single operation must all belong to the same `SameDiff` instance.** Mixing variables from different `SameDiff` objects in one operation is an error.
* **You may optionally name the output variable.** Pass the desired name as the first `String` argument:

```java
SDVariable linear = weights.mmul("matrix_product", input).add(bias);
SDVariable output = sd.nn.sigmoid("output", linear);
```

Named outputs can be retrieved from the graph later with `sd.getVariable(String name)`. If no name is supplied, a unique one is auto-generated from the operation name (e.g. `"mmul:0"`).

## Two Families of Operations

Operations live in two places:

1. **`SDVariable` instance methods** — called directly on a variable, e.g. `x.add(y)`.
2. **`SameDiff` namespace methods** — called via one of six namespace objects on the `SameDiff` instance.

### SDVariable Instance Methods

`SDVariable` exposes a rich set of methods for common operations. These are the most ergonomic to use because they can be chained:

```java
SDVariable result = weights.mmul(input).add(bias);
```

#### Arithmetic and linear algebra

| Method              | Description                                            |
| ------------------- | ------------------------------------------------------ |
| `add(y)`            | Elementwise addition `x + y` (broadcasting supported)  |
| `sub(y)`            | Elementwise subtraction `x - y`                        |
| `mul(y)`            | Elementwise multiplication `x * y` (or scalar scaling) |
| `div(y)`            | Elementwise division `x / y`                           |
| `neg()`             | Negate all elements                                    |
| `mmul(y)`           | Matrix multiplication                                  |
| `dot(y, dimension)` | Dot product along a dimension                          |
| `rdiv(y)`           | Reverse division `y / x`                               |
| `rsub(y)`           | Reverse subtraction `y - x`                            |

#### Comparison

| Method   | Description                                            |
| -------- | ------------------------------------------------------ |
| `gt(y)`  | Greater than (element vs scalar or element vs element) |
| `gte(y)` | Greater than or equal                                  |
| `lt(y)`  | Less than                                              |
| `lte(y)` | Less than or equal                                     |
| `eq(y)`  | Equal                                                  |
| `neq(y)` | Not equal                                              |

#### Reductions

Reductions take an optional `int... dimensions` argument. If omitted, the reduction is over all elements.

| Method                 | Description              |
| ---------------------- | ------------------------ |
| `sum(dimensions)`      | Sum of elements          |
| `mean(dimensions)`     | Mean of elements         |
| `min(dimensions)`      | Minimum value            |
| `max(dimensions)`      | Maximum value            |
| `norm1(dimensions)`    | L1 norm                  |
| `norm2(dimensions)`    | L2 norm                  |
| `prod(dimensions)`     | Product of elements      |
| `argmax(dimensions)`   | Index of maximum element |
| `argmin(dimensions)`   | Index of minimum element |
| `squaredDifference(y)` | Elementwise `(x - y)^2`; not a reduction, so it takes no `dimensions` argument |
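
The effect of the `dimensions` argument can be illustrated without SameDiff. The plain-Java sketch below sums a 2x3 matrix over all elements, along dimension 0, and along dimension 1. The class `ReductionDemo` and its methods are hypothetical helpers written for this page, not part of the library.

```java
import java.util.Arrays;

// Plain-Java illustration of reduction dimensions; not SameDiff API.
final class ReductionDemo {
    // No dimensions given: reduce over all elements, result is a scalar.
    static double sumAll(double[][] m) {
        double total = 0.0;
        for (double[] row : m) {
            for (double v : row) total += v;
        }
        return total;
    }

    // Dimension 0: rows collapse, leaving one value per column.
    static double[] sumDim0(double[][] m) {
        double[] out = new double[m[0].length];
        for (double[] row : m) {
            for (int c = 0; c < row.length; c++) out[c] += row[c];
        }
        return out;
    }

    // Dimension 1: columns collapse, leaving one value per row.
    static double[] sumDim1(double[][] m) {
        double[] out = new double[m.length];
        for (int r = 0; r < m.length; r++) {
            for (double v : m[r]) out[r] += v;
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2, 3}, {4, 5, 6}};
        System.out.println(sumAll(m));                    // 21.0
        System.out.println(Arrays.toString(sumDim0(m)));  // [5.0, 7.0, 9.0]
        System.out.println(Arrays.toString(sumDim1(m)));  // [6.0, 15.0]
    }
}
```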

#### Shape manipulation

| Method                   | Description                                   |
| ------------------------ | --------------------------------------------- |
| `reshape(long... shape)` | Reshape to specified shape                    |
| `permute(int... dims)`   | Permute dimensions (transpose generalisation) |
| `shape()`                | Returns the shape as an integer `SDVariable`  |

#### Chaining example

```java
// Compute mean squared error in one chained expression,
// using only methods listed in the tables above
SDVariable mse = predictions.squaredDifference(labels).mean();
```

## SameDiff Namespace Operations

The `SameDiff` class provides six namespace objects. Access them as fields or method calls (both styles work):

```java
sd.math.sin(x);    // field-style
sd.math().sin(x);  // method-style
```

The six namespaces are: `math`, `random`, `nn`, `cnn`, `rnn`, and `loss`.

### `sd.math` — General Mathematical Operations

The `math` namespace provides a broad collection of mathematical functions, statistics, and linear-algebra primitives.

#### Power and exponential functions

```java
// Distinct names so the snippet compiles as a single block
SDVariable squared = sd.math.square(x);        // x^2
SDVariable cubed   = sd.math.cube(x);          // x^3
SDVariable root    = sd.math.sqrt(x);          // √x
SDVariable powered = sd.math.pow(x, 3.0);      // x^3.0
SDVariable recip   = sd.math.reciprocal(x);    // 1/x
SDVariable expX    = sd.math.exp(x);           // e^x
SDVariable logX    = sd.math.log(x);           // natural log
SDVariable log1pX  = sd.math.log1p(x);         // log(1 + x)
```

#### Trigonometric and hyperbolic functions

```java
SDVariable s = sd.math.sin(x);
SDVariable c = sd.math.cos(x);
SDVariable t = sd.math.tan(x);
SDVariable a = sd.math.atan(x);
SDVariable h = sd.math.sinh(x);
SDVariable th = sd.math.tanh(x);
SDVariable ah = sd.math.atanh(x);
```

#### Elementwise miscellaneous

```java
SDVariable a = sd.math.abs(x);
SDVariable s = sd.math.sign(x);
SDVariable r = sd.math.round(x);
SDVariable c = sd.math.ceil(x);
SDVariable f = sd.math.floor(x);
SDVariable cl = sd.math.clipByValue(x, -1.0, 1.0);
SDVariable cn = sd.math.clipByNorm(x, 1.0, new int[]{1}); // clip L2 norm per row
```

#### Reductions

```java
SDVariable m  = sd.math.mean(x, 0);          // mean along dim 0
SDVariable mn = sd.math.min(x, 1);           // min along dim 1
SDVariable am = sd.math.amax(x);             // absolute maximum over all elements
SDVariable le = sd.math.logEntropy(x, 0);    // log-entropy along dim 0
```

#### Distance operations (between two identically-shaped variables)

```java
SDVariable ed = sd.math.euclideanDistance(x, y, 1);
SDVariable md = sd.math.manhattanDistance(x, y, 1);
SDVariable cd = sd.math.cosineDistance(x, y, 1);
SDVariable cs = sd.math.cosineSimilarity(x, y, 1);
```
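
As a reference for what these four measures compute on a single pair of vectors, here is a plain-Java sketch using the usual textbook definitions, with cosine distance taken as `1 - cosineSimilarity`. `DistanceDemo` is a hypothetical helper, not SameDiff API, and the library's numerical details may differ.

```java
// Plain-Java reference for four distance measures on two vectors; not SameDiff API.
final class DistanceDemo {
    // Square root of the summed squared differences.
    static double euclidean(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    // Sum of absolute differences.
    static double manhattan(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += Math.abs(a[i] - b[i]);
        return s;
    }

    // Dot product divided by the product of the two L2 norms.
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0.0, na = 0.0, nb = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - cosineSimilarity(a, b);
    }

    public static void main(String[] args) {
        double[] a = {3, 4};
        double[] b = {6, 8};   // same direction as a, twice the length
        System.out.println(euclidean(a, b));         // 5.0
        System.out.println(manhattan(a, b));         // 7.0
        System.out.println(cosineSimilarity(a, b));  // 1.0
        System.out.println(cosineDistance(a, b));    // 0.0
    }
}
```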

#### Matrix operations

```java
SDVariable inv  = sd.math.matrixInverse(m);
SDVariable det  = sd.math.matrixDeterminant(m);
SDVariable diag = sd.math.diag(v);           // diagonal matrix from vector
SDVariable tr   = sd.math.trace(m);
SDVariable eye  = sd.math.eye(5);            // 5x5 identity matrix
```

#### Logical operations

```java
SDVariable andResult = sd.math.and(a, b);
SDVariable orResult  = sd.math.or(a, b);
SDVariable xorResult = sd.math.xor(a, b);
SDVariable notResult = sd.math.not(a);
```

#### Chaining in `math`

Namespace ops nest rather than chain, so composing them is slightly more verbose than chaining `SDVariable` methods:

```java
// Matrix 1-norm: max column absolute sum
SDVariable norm1 = sd.math.max(sd.math.sum(sd.math.abs(matrix), 0));
```
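
To make the nested expression concrete, this plain-Java sketch computes the same quantity by hand: take absolute values, sum each column, then take the maximum. `MatrixNorm1Demo` is a hypothetical helper for illustration only.

```java
// Plain-Java computation of the matrix 1-norm (maximum absolute column sum).
final class MatrixNorm1Demo {
    static double norm1(double[][] m) {
        double best = 0.0;
        for (int c = 0; c < m[0].length; c++) {
            double colSum = 0.0;
            for (double[] row : m) colSum += Math.abs(row[c]);  // abs, then sum along dim 0
            best = Math.max(best, colSum);                      // max over the column sums
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] m = {{1, -2}, {-3, 4}};
        // Column sums of absolute values are 4 and 6.
        System.out.println(norm1(m));  // 6.0
    }
}
```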

### `sd.random` — Random Number Generators

The `random` namespace creates variables whose underlying arrays are filled with random values on each forward pass. These are useful for noise injection, dropout masks, or random initialisation inside the graph.

#### Fixed-shape random variables

```java
double mean = 0.0, stddev = 0.05;
SDVariable noise = sd.random.normal("noise", mean, stddev, new long[]{28, 28});

SDVariable uniform = sd.random.uniform("uniform", 0.0, 1.0, new long[]{64, 512});

SDVariable bernoulli = sd.random.bernoulli("mask", 0.5, new long[]{32, 256});

SDVariable binomial = sd.random.binomial("bin", 10, 0.3, new long[]{100});
```

#### Dynamic-shape random variables

When the shape depends on another variable in the graph (e.g. because the batch size is variable), pass an integer `SDVariable` as the shape:

```java
SDVariable windowShape = sd.placeHolder("window_shape", DataType.INT, 2);
SDVariable noise = sd.random.normal("audio_noise", 0.0, 0.02, windowShape);
```

The shape variable must have an integer data type.

### `sd.nn` — Neural Network Layers and Activations

The `nn` namespace covers operations commonly used in general neural networks that are not specific to convolutional or recurrent architectures.

#### Dense layers

```java
// Linear layer: output = input @ weights + bias
SDVariable linear = sd.nn.linear(input, weights, bias);

// ReLU layer in one call
SDVariable reluOut = sd.nn.reluLayer(input, weights, bias);

// Add bias separately
SDVariable withBias = sd.nn.biasAdd(features, bias);
```

#### Activation functions

```java
SDVariable r  = sd.nn.relu("relu", x, 0);         // last arg: cutoff threshold (0 = standard ReLU)
SDVariable lr = sd.nn.leakyRelu(x, 0.01);
SDVariable e  = sd.nn.elu(x);
SDVariable s  = sd.nn.sigmoid(x);
SDVariable t  = sd.nn.tanh(x);
SDVariable ht = sd.nn.hardTanh(x);
SDVariable sm = sd.nn.softmax("sm", x);
SDVariable sp = sd.nn.softplus(x);
SDVariable st = sd.nn.softsign(x);
SDVariable ge = sd.nn.gelu(x);                    // Gaussian Error Linear Unit
SDVariable sw = sd.nn.swish(x);
SDVariable mi = sd.nn.mish(x);
```
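
For readers who want the formulas behind some of these names, the sketch below gives standard textbook definitions in plain Java. `ActivationDemo` is a hypothetical helper, not SameDiff API, and the library's implementations may differ in numerical details. For example, this naive `softplus` overflows for large inputs.

```java
import java.util.Arrays;

// Plain-Java reference formulas for a few activations; not SameDiff API.
final class ActivationDemo {
    static double sigmoid(double x)  { return 1.0 / (1.0 + Math.exp(-x)); }
    static double leakyRelu(double x, double alpha) { return x >= 0 ? x : alpha * x; }
    static double softplus(double x) { return Math.log1p(Math.exp(x)); }
    static double softsign(double x) { return x / (1.0 + Math.abs(x)); }
    static double swish(double x)    { return x * sigmoid(x); }

    // Softmax over one vector; subtracting the max keeps exp() from overflowing.
    static double[] softmax(double[] x) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : x) max = Math.max(max, v);
        double sum = 0.0;
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = Math.exp(x[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < x.length; i++) out[i] /= sum;
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sigmoid(0.0));   // 0.5
        System.out.println(softsign(1.0));  // 0.5
        // Three probabilities that sum to 1, increasing with the input.
        System.out.println(Arrays.toString(softmax(new double[]{1.0, 2.0, 3.0})));
    }
}
```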

#### Regularisation

```java
// Dropout: keep probability 0.8
SDVariable dropped = sd.nn.dropout(x, 0.8);

// Layer normalisation
SDVariable normed = sd.nn.layerNorm("ln", x, gain, bias, false, 1);  // false = not channel-first; norm over dim 1

// Batch normalisation (inference mode)
SDVariable bnOut = sd.nn.batchNorm(x, mean, variance, gamma, beta, 1e-5, 1);
```
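
The arguments of the batch normalisation call map onto the standard inference-mode formula `y = gamma * (x - mean) / sqrt(variance + epsilon) + beta`. The plain-Java sketch below applies that formula to the values of a single feature. `BatchNormDemo` is a hypothetical helper, and whether `sd.nn.batchNorm` matches it element for element should be checked against the javadoc.

```java
import java.util.Arrays;

// Plain-Java sketch of inference-mode batch normalisation for one feature.
final class BatchNormDemo {
    // y = gamma * (x - mean) / sqrt(variance + epsilon) + beta
    static double[] normalize(double[] x, double mean, double variance,
                              double gamma, double beta, double epsilon) {
        double scale = gamma / Math.sqrt(variance + epsilon);
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = (x[i] - mean) * scale + beta;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        // Stored statistics: mean 2, variance 1; learned gamma 2, beta 0.5.
        double[] y = normalize(x, 2.0, 1.0, 2.0, 0.5, 1e-5);
        System.out.println(Arrays.toString(y));  // approximately [-1.5, 0.5, 2.5]
    }
}
```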

#### Padding

```java
// Pad a 2D array with zeros: 2 elements before and after each dimension
SDVariable padded = sd.nn.pad(x, new int[][]{{2,2},{2,2}}, PadMode.CONSTANT, 0.0);
```

#### Full example: two-layer feedforward network

```java
SDVariable h1 = sd.nn.reluLayer("h1", input, w1, b1);
SDVariable h2 = sd.nn.reluLayer("h2", h1, w2, b2);
SDVariable out = sd.nn.softmax("output", h2.mmul(w3).add(b3));
```

### `sd.cnn` — Convolutional Neural Network Operations

The `cnn` namespace provides convolution, pooling, and related operations.

#### Convolution operations

Convolution layers are specified via **configuration objects** that bundle the many static hyperparameters (kernel size, stride, padding, dilation, data format, bias flag, etc.).

**1D convolution:**

```java
Conv1DConfig cfg1d = Conv1DConfig.builder()
    .k(3)     // kernel width
    .s(1)     // stride
    .p(1)     // padding
    .build();

SDVariable conv1d = sd.cnn.conv1d(input, weights, cfg1d);
// or with bias (distinct name so both lines compile together):
SDVariable conv1dWithBias = sd.cnn.conv1d(input, weights, bias, cfg1d);
```

**2D convolution:**

```java
Conv2DConfig cfg2d = Conv2DConfig.builder()
    .kH(3).kW(3)    // kernel height, width
    .pH(1).pW(1)    // padding
    .sH(1).sW(1)    // stride
    .hasBias(false) // no bias (add separately if desired)
    .dataFormat("NCHW")
    .build();

SDVariable conv2d = sd.cnn.conv2d(input, weights, cfg2d);
```

Input shape for `NCHW` format: `[batch, channels_in, height, width]`. Weight shape in the default weight layout: `[kH, kW, channels_in, channels_out]`.
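
When choosing kernel size, stride, and padding, it helps to know the resulting spatial size in advance. The sketch below uses the standard formula for a non-dilated convolution with explicit padding, `out = (in + 2 * padding - kernel) / stride + 1` with integer division. `ConvShapeDemo` is a hypothetical helper, and it assumes the padded input is at least as large as the kernel and that "same" mode is not in use.

```java
// Plain-Java sketch of the output-size formula for one spatial dimension.
final class ConvShapeDemo {
    // Assumes no dilation, explicit padding, and in + 2 * padding >= kernel.
    static int outputSize(int in, int kernel, int stride, int padding) {
        return (in + 2 * padding - kernel) / stride + 1;
    }

    public static void main(String[] args) {
        // 3x3 kernel, stride 1, padding 1 preserves the spatial size.
        System.out.println(outputSize(28, 3, 1, 1));  // 28
        // 2x2 window, stride 2, no padding halves it.
        System.out.println(outputSize(28, 2, 2, 0));  // 14
    }
}
```

The same formula applies per dimension to pooling windows, which is why a 2x2 pool with stride 2 halves height and width.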

**3D convolution:**

```java
Conv3DConfig cfg3d = Conv3DConfig.builder()
    .kD(3).kH(3).kW(3)
    .build();

SDVariable conv3d = sd.cnn.conv3d(input, weights, cfg3d);
```

**Depthwise and separable convolutions:**

```java
SDVariable dwConv = sd.cnn.depthWiseConv2d(input, depthwiseWeights, cfg2d);
SDVariable sepConv = sd.cnn.separableConv2d(input, depthWeights, pointWeights, cfg2d);
```

#### Deconvolution (transposed convolution)

```java
DeConv2DConfig dcfg = DeConv2DConfig.builder().kH(3).kW(3).build();
SDVariable deconv2d = sd.cnn.deconv2d(input, weights, dcfg);
```

#### Pooling

```java
Pooling2DConfig pool2d = Pooling2DConfig.builder()
    .kH(2).kW(2)
    .sH(2).sW(2)
    .build();

SDVariable maxPool   = sd.cnn.maxPooling2d(x, pool2d);
SDVariable avgPool   = sd.cnn.avgPooling2d(x, pool2d);
SDVariable maxPool1d = sd.cnn.maxPooling1d(x, Pooling1DConfig.builder().k(2).s(2).build());
```
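
To show what the 2x2, stride-2 configuration above does to a single channel, here is a plain-Java sketch of max pooling over non-overlapping 2x2 windows. `MaxPoolDemo` is a hypothetical helper, not SameDiff API, and it assumes even height and width.

```java
import java.util.Arrays;

// Plain-Java sketch of 2x2 max pooling with stride 2 on one channel.
final class MaxPoolDemo {
    static double[][] maxPool2x2(double[][] in) {
        int outH = in.length / 2;
        int outW = in[0].length / 2;
        double[][] out = new double[outH][outW];
        for (int r = 0; r < outH; r++) {
            for (int c = 0; c < outW; c++) {
                // Maximum of the four values in the window starting at (2r, 2c).
                double top    = Math.max(in[2 * r][2 * c],     in[2 * r][2 * c + 1]);
                double bottom = Math.max(in[2 * r + 1][2 * c], in[2 * r + 1][2 * c + 1]);
                out[r][c] = Math.max(top, bottom);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] x = {
            {1, 2, 3, 4},
            {5, 6, 7, 8},
            {9, 10, 11, 12},
            {13, 14, 15, 16}
        };
        System.out.println(Arrays.deepToString(maxPool2x2(x)));  // [[6.0, 8.0], [14.0, 16.0]]
    }
}
```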

#### Upsampling

```java
SDVariable upsampled = sd.cnn.upsampling2d(x, 2);   // upsample by factor 2
```

#### Local response normalisation

```java
LocalResponseNormalizationConfig lrnCfg = LocalResponseNormalizationConfig.builder()
    .alpha(1e-4).beta(0.75).bias(1.0).depth(5).build();
SDVariable lrn = sd.cnn.localResponseNormalization(x, lrnCfg);
```

#### Full example: simple ConvNet block

```java
// Define config
Conv2DConfig cfg = Conv2DConfig.builder().kH(3).kW(3).pH(1).pW(1).hasBias(true).build();
Pooling2DConfig pool = Pooling2DConfig.builder().kH(2).kW(2).sH(2).sW(2).build();

// Build graph
SDVariable conv   = sd.cnn.conv2d("conv1", input, weights, bias, cfg);
SDVariable act    = sd.nn.relu("act1", conv, 0);
SDVariable pooled = sd.cnn.maxPooling2d("pool1", act, pool);
SDVariable flat   = pooled.reshape(-1, flatSize);
SDVariable logits = sd.nn.linear("logits", flat, fcWeights, fcBias);
SDVariable out    = sd.nn.softmax("output", logits);
```

### `sd.rnn` — Recurrent Neural Network Operations

The `rnn` namespace provides modules for sequence modelling.

#### Simple Recurrent Units (SRU)

```java
SRUConfiguration sruConfig = new SRUConfiguration(input, weights, bias, initialState);
SDVariable sruOutput = sd.rnn.sru(sruConfig);

// Cell-level SRU (single time step)
SRUCellConfiguration sruCellCfg = new SRUCellConfiguration(input, weights, bias, state);
SDVariable[] sruCellOut = sd.rnn.sruCell(sruCellCfg);
```

#### LSTM

```java
LSTMConfiguration lstmCfg = LSTMConfiguration.builder()
    .forgetBias(1.0)
    .clippingCellValue(3.0)
    .build();

// Full LSTM layer (processes all time steps)
SDVariable[] lstmOut = sd.rnn.lstmLayer(input, cLast, yLast, weights, lstmCfg);

// LSTM cell (single time step)
LSTMCellConfiguration cellCfg = LSTMCellConfiguration.builder()
    .x(input)
    .cx(cellState)
    .cs(cLast)
    .h(hLast)
    .wci(wci).wcf(wcf).wco(wco)
    .b(bias)
    .w(weights)
    .build();
SDVariable[] lstmCellOut = sd.rnn.lstmCell(cellCfg);
```

#### GRU

```java
GRUConfiguration gruCfg = GRUConfiguration.builder()
    .x(input)
    .hLast(hLast)
    .wRU(wRU).wC(wC)
    .bRU(bRU).bC(bC)
    .build();

SDVariable gruOutput = sd.rnn.gru(gruCfg);
```

All recurrent outputs are `ARRAY`-type and can be fed into subsequent operations.

### `sd.loss` — Loss Functions

The `loss` namespace provides standard loss functions for training. Most loss functions share a common signature:

```java
SDVariable loss = sd.loss.functionName("loss_name", labels, predictions [, weights, LossReduce]);
```

The first argument is the output name; pass `null` to have a name generated automatically. `weights` and `LossReduce` are optional.

#### Common loss functions

```java
// Binary cross-entropy
SDVariable bce = sd.loss.binaryCrossEntropy("bce", labels, predictions, null, LossReduce.MEAN_BY_WEIGHT);

// Softmax cross-entropy (logits, not probabilities)
SDVariable sce = sd.loss.softmaxCrossEntropy("sce", labels, logits, null);

// Log loss
SDVariable ll = sd.loss.logLoss("logLoss", labels, predictions);

// Mean squared error
SDVariable mse = sd.loss.meanSquaredError("mse", labels, predictions, null, LossReduce.MEAN_BY_WEIGHT);

// Mean absolute error
SDVariable mae = sd.loss.absoluteDifference("mae", labels, predictions, null, LossReduce.MEAN_BY_WEIGHT);

// Hinge loss
SDVariable hinge = sd.loss.hingeLoss("hinge", labels, predictions, null, LossReduce.SUM);

// Huber loss (smooth L1)
SDVariable huber = sd.loss.huberLoss("huber", labels, predictions, null, LossReduce.MEAN_BY_WEIGHT, 1.0);

// Cosine distance loss
SDVariable cos = sd.loss.cosineDistance("cosine", labels, predictions, 1, null, LossReduce.MEAN_BY_WEIGHT);
```

#### Reduction methods

The `LossReduce` enum controls how per-sample losses are aggregated over the minibatch:

| `LossReduce` value             | Formula                                       | Result shape  |
| ------------------------------ | --------------------------------------------- | ------------- |
| `NONE`                         | Leave per-sample values as-is                 | `[batchSize]` |
| `SUM`                          | `sum(weights * loss_i)`                       | scalar        |
| `MEAN_BY_WEIGHT`               | `sum(weights * loss_i) / sum(weights)`        | scalar        |
| `MEAN_BY_NONZERO_WEIGHT_COUNT` | `sum(weights * loss_i) / count(weights != 0)` | scalar        |

When no weights are specified, `MEAN_BY_WEIGHT` and `MEAN_BY_NONZERO_WEIGHT_COUNT` are equivalent to plain mean.

Use `MEAN_BY_NONZERO_WEIGHT_COUNT` when you want to average only over "valid" samples marked with `weight=1`, ignoring padding positions marked with `weight=0`.
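
The three scalar reductions in the table can be worked through by hand. The plain-Java sketch below applies the table's formulas to four per-sample losses, one of which has weight 0. `LossReduceDemo` is a hypothetical helper for illustration, not the library's implementation.

```java
// Plain-Java arithmetic for the LossReduce formulas in the table above.
final class LossReduceDemo {
    // SUM: sum(weights * loss_i)
    static double sum(double[] loss, double[] w) {
        double s = 0.0;
        for (int i = 0; i < loss.length; i++) s += w[i] * loss[i];
        return s;
    }

    // MEAN_BY_WEIGHT: sum(weights * loss_i) / sum(weights)
    static double meanByWeight(double[] loss, double[] w) {
        double wSum = 0.0;
        for (double v : w) wSum += v;
        return sum(loss, w) / wSum;
    }

    // MEAN_BY_NONZERO_WEIGHT_COUNT: sum(weights * loss_i) / count(weights != 0)
    static double meanByNonZeroWeightCount(double[] loss, double[] w) {
        int count = 0;
        for (double v : w) if (v != 0.0) count++;
        return sum(loss, w) / count;
    }

    public static void main(String[] args) {
        double[] loss    = {1.0, 2.0, 3.0, 4.0};
        double[] weights = {2.0, 1.0, 0.0, 1.0};  // third sample is masked out
        System.out.println(sum(loss, weights));                       // 8.0
        System.out.println(meanByWeight(loss, weights));              // 8 / 4 = 2.0
        System.out.println(meanByNonZeroWeightCount(loss, weights));  // 8 / 3, about 2.667
    }
}
```

With these weights the two means differ because the first sample counts double in `sum(weights)` but only once in the nonzero count.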

#### Weighted loss example

```java
// Per-sample weights: 2x for positive class, 1x for negative
SDVariable classWeights = sd.placeHolder("class_weights", DataType.FLOAT, -1);

SDVariable weightedBce = sd.loss.binaryCrossEntropy(
    "weighted_bce",
    labels, predictions,
    classWeights,
    LossReduce.MEAN_BY_WEIGHT
);
```

## The Don'ts of Operations

A few patterns cause subtle bugs in SameDiff graphs. Avoid them:

### Don't mix variables from different `SameDiff` instances

```java
SameDiff sd0 = SameDiff.create();
SameDiff sd1 = SameDiff.create();

SDVariable x = sd0.var(DataType.FLOAT, 4);
SDVariable y = sd1.placeHolder(DataType.FLOAT, 4);

// BAD: x and y belong to different SameDiff instances
SDVariable z = x.add(y);   // will throw an exception or produce wrong results
```

All variables used in a single op must belong to the same `SameDiff`.

### Don't discard operation results

Every op call creates a new node in the graph. If you call an op without assigning the result to a variable, the node is created but nothing can reference it downstream. This is almost always a bug:

```java
SDVariable z = x.add(y);
z.mul(2);           // BAD: result discarded — this node goes nowhere
x = z.mul(y);      // BAD: reassigning x does not modify the graph node x was pointing to
```

The correct pattern is always to assign the result to a new variable:

```java
SDVariable z   = x.add(y);
SDVariable z2  = z.mul(2);
SDVariable out = z2.mul(y);
```

### Don't redefine existing named variables

If you call `sd.var("weights", ...)` twice with the same name, you will either get an exception or silently reference the same underlying variable. Always use unique names:

```java
// BAD if "w1" already exists in the graph
SDVariable w1 = sd.var("w1", DataType.FLOAT, 128, 64);

// GOOD: distinct names
SDVariable w_encoder = sd.var("encoder_w1", DataType.FLOAT, 128, 64);
SDVariable w_decoder = sd.var("decoder_w1", DataType.FLOAT, 64, 128);
```

## Finding Operations in the Javadoc

The full operation reference is in the [SameDiff javadoc](https://deeplearning4j.org/api/latest/). Navigate to:

* `org.nd4j.autodiff.samediff.SDVariable` for instance methods.
* `org.nd4j.autodiff.samediff.ops.SDMath`, `SDRandom`, `SDNN`, `SDCNN`, `SDRNN`, `SDLoss` for namespace ops.

IDE autocompletion is also effective: type `sd.nn.` and browse the suggestions.

