# Execution and Inference

Defining a SameDiff graph is separate from running it. Once the graph is built, you execute it by supplying values for all `PLACEHOLDER` variables and specifying which output variables you want computed. SameDiff then evaluates just the subgraph necessary to produce those outputs.

## How Execution Works

SameDiff uses an `InferenceSession` internally to execute the graph. The session:

1. Accepts a map of placeholder name → `INDArray` value.
2. Determines which nodes are needed to compute the requested outputs and orders them topologically, so that each node runs after its inputs.
3. Evaluates each node in order, caching intermediate results.
4. Returns the requested output arrays.
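
The four steps can be illustrated with a toy executor written in plain Java. This is only a sketch of the idea, not SameDiff's `InferenceSession`: the class `ToyGraphSession`, its `op()` method, and the scalar `double` values are all hypothetical stand-ins for real ops and `INDArray`s. A depth-first walk from the requested outputs visits only the nodes that are needed, evaluates each one after its inputs, and caches every intermediate value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy graph executor (hypothetical; illustrates the steps above, not SameDiff internals). */
final class ToyGraphSession {
    /** One op: a function of the values of its named inputs. */
    private static final class Node {
        final List<String> inputs;
        final Function<double[], Double> fn;

        Node(List<String> inputs, Function<double[], Double> fn) {
            this.inputs = inputs;
            this.fn = fn;
        }
    }

    private final Map<String, Node> nodes = new HashMap<>();
    /** Names of the ops evaluated by the most recent output() call, in order. */
    final List<String> executed = new ArrayList<>();

    void op(String name, Function<double[], Double> fn, String... inputs) {
        nodes.put(name, new Node(List.of(inputs), fn));
    }

    Map<String, Double> output(Map<String, Double> placeholders, String... outputs) {
        executed.clear();
        // Step 1: placeholder values seed the cache of known values.
        Map<String, Double> cache = new HashMap<>(placeholders);
        Map<String, Double> result = new LinkedHashMap<>();
        for (String name : outputs) {
            result.put(name, eval(name, cache)); // Step 4: collect requested outputs.
        }
        return result;
    }

    /** Steps 2 and 3: visit only required nodes, inputs first, caching each value. */
    private double eval(String name, Map<String, Double> cache) {
        Double cached = cache.get(name);
        if (cached != null) {
            return cached;
        }
        Node node = nodes.get(name);
        if (node == null) {
            throw new IllegalArgumentException("No placeholder value or op named " + name);
        }
        double[] args = new double[node.inputs.size()];
        for (int i = 0; i < args.length; i++) {
            args[i] = eval(node.inputs.get(i), cache);
        }
        double value = node.fn.apply(args);
        cache.put(name, value);
        executed.add(name);
        return value;
    }

    public static void main(String[] argv) {
        ToyGraphSession session = new ToyGraphSession();
        session.op("hidden", a -> a[0] * 2, "input");
        session.op("out", a -> a[0] + 1, "hidden");
        session.op("loss", a -> a[0] - a[1], "out", "labels");

        System.out.println(session.output(Map.of("input", 3.0), "out")); // {out=7.0}
        System.out.println(session.executed);                            // [hidden, out]
    }
}
```

Because only `out` is requested, the `loss` op is never visited, which is also why the call succeeds without a value for `labels`.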

You rarely interact with `InferenceSession` directly; instead you use the higher-level methods on `SameDiff` described below.

## Setting Placeholder Values

Before execution, every `PLACEHOLDER` variable in the graph must have a value. Values are supplied as a `Map<String, INDArray>`:

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import java.util.HashMap;
import java.util.Map;

INDArray inputBatch = Nd4j.create(/* your data */);

Map<String, INDArray> placeholders = new HashMap<>();
placeholders.put("input", inputBatch);
```

If your graph also has label placeholders (e.g. for computing a validation loss), include those too:

```java
placeholders.put("labels", labelBatch);
```

You do not need to supply values for `VARIABLE` or `CONSTANT` nodes — they are stored inside the `SameDiff` instance and used automatically.

## sd.output() — Standard Inference

`sd.output()` is the primary method for running the forward pass and retrieving results. It returns a `Map<String, INDArray>` whose keys are the names of the requested output variables.

```java
// Request a single output
Map<String, INDArray> results = sd.output(placeholders, "softmax");
INDArray predictions = results.get("softmax");
```

Request multiple outputs in one call so that the parts of the graph they share are computed only once:

```java
Map<String, INDArray> results = sd.output(placeholders, "hidden", "softmax", "loss");

INDArray hiddenActivations = results.get("hidden");
INDArray outputProbabilities = results.get("softmax");
INDArray lossValue = results.get("loss");
```

Only the nodes required to compute the listed outputs are executed. If you do not request a particular output, its subgraph may be skipped entirely.

## outputSingle() — Convenience for One Output

When you only need a single output array and do not want to unwrap a map, use `outputSingle()`:

```java
INDArray predictions = sd.outputSingle(placeholders, "softmax");
```

This is equivalent to `sd.output(placeholders, "softmax").get("softmax")`, without the need to unwrap the map yourself.

## Evaluating Persistent Variables

For `VARIABLE` and `CONSTANT` nodes, you can retrieve their stored values directly without running the graph:

```java
INDArray weights = sd.getVariable("w1").eval();
INDArray bias    = sd.getVariable("b1").getArr();
```

`eval()` is equivalent to `getArr()` for persistent variables. For `ARRAY` or `PLACEHOLDER` nodes, you need to have executed the graph first.

## exec() — Low-Level Graph Execution

`exec()` runs the full forward (and optionally backward) pass and returns a `Map<String, INDArray>` of all computed values. It is lower-level than `output()` and is mainly used when you need access to every intermediate result or when you are driving the training loop manually.

```java
// Forward pass only: request every variable that sd.outputs() reports
Map<String, INDArray> allOutputs = sd.exec(placeholders, sd.outputs());

// Look up a value by name; get() returns null if "hidden" was not computed
INDArray hiddenOut = allOutputs.get("hidden");
```

For most inference use cases, prefer `output()` over `exec()` because it executes only the necessary subgraph.

## Batch Inference

For large datasets, iterate and accumulate predictions batch by batch:

```java
DataSetIterator testIter = /* your iterator */;
List<INDArray> allPredictions = new ArrayList<>();

while (testIter.hasNext()) {
    DataSet batch = testIter.next();

    Map<String, INDArray> pv = Map.of("input", batch.getFeatures());
    INDArray batchPred = sd.outputSingle(pv, "softmax");
    allPredictions.add(batchPred);
}

// Concatenate all predictions
INDArray predictions = Nd4j.vstack(allPredictions);
```

If you are computing a metric over the whole test set, use the built-in evaluation API instead — it is more efficient and avoids materialising all predictions in memory at once:

```java
import org.nd4j.evaluation.classification.Evaluation;

Evaluation eval = new Evaluation();
// "softmax" is the output variable whose predictions are compared with the labels
sd.evaluate(testIter, "softmax", eval);
System.out.println(eval.stats());
```

The `evaluate()` method feeds batches through the graph and accumulates metric statistics incrementally.
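
To see why incremental accumulation avoids holding every prediction in memory, consider the following plain-Java sketch. `RunningAccuracy` is a hypothetical class written for this illustration, not part of ND4J, and it assumes predictions have already been reduced to class indices. Each batch updates two counters and can then be discarded.

```java
/** Hypothetical accumulator: keeps two counters instead of all predictions. */
final class RunningAccuracy {
    private long correct = 0;
    private long total = 0;

    /** Folds one batch of (label, predicted class) pairs into the running counts. */
    void eval(int[] labels, int[] predictions) {
        if (labels.length != predictions.length) {
            throw new IllegalArgumentException("labels and predictions differ in length");
        }
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] == predictions[i]) {
                correct++;
            }
        }
        total += labels.length;
    }

    /** Accuracy over everything seen so far; 0.0 before the first batch. */
    double accuracy() {
        return total == 0 ? 0.0 : (double) correct / total;
    }

    public static void main(String[] args) {
        RunningAccuracy acc = new RunningAccuracy();
        acc.eval(new int[] {0, 1, 1}, new int[] {0, 1, 0}); // batch 1: 2 of 3 correct
        acc.eval(new int[] {2, 2}, new int[] {2, 2});       // batch 2: 2 of 2 correct
        System.out.println(acc.accuracy());                 // 4 of 5 correct overall
    }
}
```

Memory use is constant in the number of batches, whereas collecting predictions in a list grows with the size of the test set.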

## Querying Output Variable Names

To see which variables are marked as "outputs" of the graph (i.e. the terminal nodes that produce final results):

```java
List<String> outputNames = sd.outputs();
```

You can also list all variable names:

```java
List<String> allVarNames = sd.variableNames();
```

## Placeholder Shape Inference

SameDiff propagates shape information through the graph at graph-definition time where possible. Use `-1` for dimensions that are only known at runtime (typically the batch dimension):

```java
SDVariable input = sd.placeHolder("input", DataType.FLOAT, -1, 784);
```

After execution, the actual shape of any `ARRAY` variable can be retrieved:

```java
// Before execution: may contain -1 for unknown dims
long[] inferredShape = sd.getShapeForVarName("hidden");

// After execution: concrete shape from the actual data
INDArray result = sd.outputSingle(placeholders, "hidden");
long[] actualShape = result.shape();
```
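
As a rough illustration of how an unknown dimension can be carried through shape inference, the sketch below computes the output shape of a rank-2 matrix multiply in plain Java. `ShapeSketch` and `matmulShape` are hypothetical names for this example only and do not reflect how SameDiff implements shape functions. The `-1` in the batch position simply propagates to the result, while a mismatch between known inner dimensions is rejected before any data exists.

```java
import java.util.Arrays;

/** Hypothetical shape function for [m, k] x [k, n] -> [m, n], with -1 meaning "unknown". */
final class ShapeSketch {
    static final long UNKNOWN = -1;

    static long[] matmulShape(long[] a, long[] b) {
        if (a.length != 2 || b.length != 2) {
            throw new IllegalArgumentException("sketch handles rank-2 shapes only");
        }
        // Inner dimensions can only be checked when both are known at definition time.
        boolean innerKnown = a[1] != UNKNOWN && b[0] != UNKNOWN;
        if (innerKnown && a[1] != b[0]) {
            throw new IllegalArgumentException("inner dimensions differ: " + a[1] + " vs " + b[0]);
        }
        // An unknown outer dimension is passed through unchanged.
        return new long[] {a[0], b[1]};
    }

    public static void main(String[] args) {
        long[] w1 = {784, 128};
        // Definition time: batch size unknown.
        System.out.println(Arrays.toString(matmulShape(new long[] {UNKNOWN, 784}, w1))); // [-1, 128]
        // Run time: a batch of 64 makes the shape concrete.
        System.out.println(Arrays.toString(matmulShape(new long[] {64, 784}, w1)));      // [64, 128]
    }
}
```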

## Performance Considerations

### Avoid recreating the SameDiff graph per request

Building a `SameDiff` graph (calling `sd.var()`, `sd.placeHolder()`, etc.) is expensive. Build the graph once — at application startup or model load time — and reuse the same `SameDiff` instance for all inference requests.

```java
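// At startup. The second argument controls whether updater (optimizer) state
// is loaded; inference does not need it.
SameDiff model = SameDiff.load(new File("model.fb"), false);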

// Per request:
INDArray pred = model.outputSingle(Map.of("input", request), "softmax");
```

### Thread safety

A single `SameDiff` instance is not safe to call from multiple threads concurrently during inference, because execution caches results in the instance's internal state. Options:

* **Lock per call**: `synchronized(model) { model.outputSingle(...); }`
* **Pool of instances**: pre-load N copies of the model from the same file and distribute requests round-robin.
* **Use separate instances per thread** via `ThreadLocal<SameDiff>`.
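
The pool option can be sketched with a blocking queue. The class below is a hypothetical helper written in plain Java, with a type parameter `T` standing in for a loaded `SameDiff` instance. Note that it uses a borrow-and-return scheme rather than strict round-robin: a thread takes whichever instance is idle, so no two threads ever share one, and callers block when all instances are busy.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

/** Hypothetical borrow-and-return pool; T stands in for a loaded model instance. */
final class InstancePool<T> {
    private final BlockingQueue<T> idle;

    InstancePool(List<T> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("pool needs at least one instance");
        }
        this.idle = new ArrayBlockingQueue<>(instances.size(), false, instances);
    }

    /** Runs work with exclusive access to one instance, then returns it to the pool. */
    <R> R with(Function<T, R> work) {
        T instance;
        try {
            instance = idle.take(); // blocks while every instance is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for an instance", e);
        }
        try {
            return work.apply(instance);
        } finally {
            idle.add(instance); // always hand the instance back, even on failure
        }
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        // StringBuilder is only a placeholder for a non-thread-safe object.
        InstancePool<StringBuilder> pool =
            new InstancePool<>(List.of(new StringBuilder("a"), new StringBuilder("b")));
        System.out.println(pool.with(sb -> sb.append("!").toString()));
        System.out.println(pool.idleCount());
    }
}
```

With real models, the pool would be filled at startup with N separately loaded instances, and each request would run something like `pool.with(m -> m.outputSingle(placeholders, "softmax"))`.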

### Minimise requested outputs

Only request the output variables you actually need. Requesting fewer outputs means fewer graph nodes are evaluated:

```java
// Good: only compute the softmax output
INDArray pred = sd.outputSingle(placeholders, "softmax");

// Less efficient if you only need predictions: also computes hidden + loss
Map<String, INDArray> all = sd.output(placeholders, "hidden", "softmax", "loss");
```

### Reuse INDArray input buffers

Where possible, reuse the same `INDArray` object across calls (refilling its contents) rather than allocating a new one per batch. This reduces garbage-collection pressure:

```java
INDArray inputBuffer = Nd4j.create(DataType.FLOAT, 64, 784);

while (source.hasNext()) {
    source.fillBatch(inputBuffer);  // write new data in-place
    INDArray pred = sd.outputSingle(Map.of("input", inputBuffer), "softmax");
    // process pred ...
}
```

## Working with InferenceSession Directly

Advanced users can interact with `InferenceSession` directly for fine-grained control:

```java
import org.nd4j.autodiff.samediff.internal.InferenceSession;

InferenceSession session = new InferenceSession(sd);

Map<String, INDArray> placeholderValues = Map.of("input", myInput);
List<String> requiredOutputs = List.of("softmax");
Set<String> requiredActivations = Collections.emptySet();

Map<String, INDArray> result = session.output(
    requiredOutputs,
    requiredActivations,
    placeholderValues,
    Collections.emptyList(),   // listeners
    At.defaultAt(),
    MultiDataSets.singleton(null, null)
);

INDArray softmax = result.get("softmax");
```

This level of control is rarely needed. Use `sd.output()` or `sd.outputSingle()` in all normal circumstances.

## Checking Graph Validity Before Execution

SameDiff can validate the graph structure before you run it. This is useful during development to catch wiring errors early:

```java
// Validate that all required inputs are connected and shapes are consistent
sd.validate();
```

If the graph has any disconnected nodes, missing inputs, or shape mismatches that can be detected statically, `validate()` will throw a descriptive exception.

***

## Improved SameDiff Execution Framework (ADR-0048)

SameDiff 2.x ships a redesigned execution engine that addresses long-standing reliability problems with control-flow graphs (while loops, conditionals). This section describes the user-facing debugging APIs that come with the new framework.

### Background: What Changed

The original `initSubgraph` implementation re-analyzed the graph on every execution and had fundamental convergence issues that made graphs with loops unreliable or incorrect. The new engine replaces that approach with a DAG-based execution plan that is built once and cached:

```
First call  → ForwardExecutionDAGBuilder constructs the plan  (one-time cost)
Later calls → DAGCache returns the pre-built plan              (near-zero overhead)
```

Existing models benefit automatically — no code changes are required. Complex graphs that previously failed to initialize or produced wrong results should now work correctly.

### Enabling Full Execution Analysis

All analysis features are **opt-in** to avoid memory overhead in production. Enable them once, before any calls to `output()` or `exec()`:

```java
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.autodiff.samediff.analysis.AnalysisLevel;

SameDiff sd = SameDiff.create();
// ... build or load your graph ...

// Enable all analysis and debugging features
sd.enableExecutionAnalysis(AnalysisLevel.FULL);
```

Available levels:

| Level      | What it enables                                                        |
| ---------- | ---------------------------------------------------------------------- |
| `NONE`     | Default — no analysis overhead                                         |
| `BASIC`    | Loop termination detection only                                        |
| `STANDARD` | Loop termination + variable evolution tracking                         |
| `FULL`     | Everything: evolution, termination, cross-frame tracing, visualization |

Use `FULL` during development and debugging, then switch back to `NONE` (or omit the call entirely) before deploying to production.

***

### Variable Evolution Analysis

`VariableEvolutionAnalysis` tracks how each variable's value changes across loop iterations and classifies it into one of five patterns.

#### Detected patterns

| Pattern       | Meaning                                                               |
| ------------- | --------------------------------------------------------------------- |
| `CONVERGING`  | Values are approaching a fixed limit — the loop will likely terminate |
| `DIVERGING`   | Values are growing without bound — the loop will likely not terminate |
| `OSCILLATING` | Values alternate between states — the loop may cycle forever          |
| `STABLE`      | Values are not changing — a fixed point has been reached              |
| `CHAOTIC`     | No discernible pattern — further analysis is needed                   |
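
To make the five categories concrete, here is a deliberately simple heuristic in plain Java that classifies a scalar value history by looking at the differences between consecutive values. It is an illustrative assumption, not the algorithm used by `VariableEvolutionAnalysis`; the class `EvolutionHeuristic`, its `Pattern` enum, and the `eps` tolerance are all hypothetical.

```java
/** Hypothetical step-based classifier; not the library's implementation. */
final class EvolutionHeuristic {
    enum Pattern { CONVERGING, DIVERGING, OSCILLATING, STABLE, CHAOTIC }

    static Pattern classify(double[] history, double eps) {
        if (history.length < 3) {
            throw new IllegalArgumentException("need at least 3 values");
        }
        int n = history.length - 1;
        double[] step = new double[n];
        for (int i = 0; i < n; i++) {
            step[i] = history[i + 1] - history[i];
        }

        boolean stable = true;        // every step is within eps of zero
        boolean shrinking = true;     // every step is strictly smaller than the previous one
        boolean nonShrinking = true;  // no step is smaller than the previous one
        boolean alternating = true;   // consecutive steps always change sign
        boolean sameSign = true;      // consecutive steps never change sign
        for (int i = 0; i < n; i++) {
            if (Math.abs(step[i]) > eps) {
                stable = false;
            }
            if (i == 0) {
                continue;
            }
            double prev = Math.abs(step[i - 1]);
            double cur = Math.abs(step[i]);
            if (cur >= prev) shrinking = false;
            if (cur < prev) nonShrinking = false;
            if (step[i] * step[i - 1] >= 0) alternating = false;
            if (step[i] * step[i - 1] <= 0) sameSign = false;
        }

        if (stable) return Pattern.STABLE;
        if (shrinking) return Pattern.CONVERGING;   // steps keep getting smaller
        if (alternating) return Pattern.OSCILLATING; // back and forth without settling
        if (sameSign && nonShrinking) return Pattern.DIVERGING;
        return Pattern.CHAOTIC;
    }

    public static void main(String[] args) {
        System.out.println(classify(new double[] {1.0, 0.5, 0.25, 0.125}, 1e-9));
        System.out.println(classify(new double[] {1.0, 1.8, 3.2, 5.8, 10.4}, 1e-9));
        System.out.println(classify(new double[] {1, -1, 1, -1}, 1e-9));
    }
}
```

Shrinking steps do not strictly guarantee convergence, so a production analyzer would need more careful criteria; the sketch only shows how the categories differ.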

#### Reading evolution results

After execution completes (or after the loop is interrupted), retrieve the analysis:

```java
import org.nd4j.autodiff.samediff.analysis.VariableEvolutionAnalysis;
import org.nd4j.autodiff.samediff.analysis.VariableEvolutionAnalysis.VariablePattern;

// Run your graph
sd.output(placeholders, "result");

// Retrieve the evolution analysis produced during execution
VariableEvolutionAnalysis evolution = sd.getExecutionAnalysis().getVariableEvolution();

// Check the pattern for a specific variable
VariablePattern pattern = evolution.getPattern("counter");
System.out.println("counter pattern: " + pattern);   // e.g. CONVERGING

// Retrieve the full value history for a variable
List<VariableStateInfo> history = evolution.getHistory("loss");
for (VariableStateInfo state : history) {
    System.out.printf("iteration %d: value=%.6f%n", state.getIteration(), state.getScalarValue());
}

// Ask for an estimated number of remaining iterations before convergence
OptionalInt remaining = evolution.estimateRemainingIterations("loss");
remaining.ifPresent(n -> System.out.println("Estimated iterations to convergence: " + n));
```

#### Typical debugging workflow

If a loop runs longer than expected:

1. Enable `AnalysisLevel.FULL` and re-run the graph.
2. Call `evolution.getPattern(varName)` for your loop-condition variables.
3. A `DIVERGING` or `OSCILLATING` pattern suggests the condition may never be satisfied. Inspect the logic that updates that variable.
4. A `CONVERGING` pattern combined with unexpectedly many iterations usually means convergence is too slow. Consider adjusting your learning rate or convergence threshold.

***

### Loop Termination Analysis

`LoopTerminationAnalyzer` answers two questions: *will* this loop terminate, and *why* (or why not)?

#### Checking termination predictions

```java
import org.nd4j.autodiff.samediff.analysis.LoopTerminationAnalyzer;
import org.nd4j.autodiff.samediff.analysis.TerminationPrediction;

LoopTerminationAnalyzer analyzer = sd.getExecutionAnalysis().getLoopAnalyzer();

// Predict termination for the loop named "whileLoop"
TerminationPrediction prediction = analyzer.analyzeLoop("whileLoop");

System.out.println("Will terminate:       " + prediction.willTerminate());
System.out.println("Estimated iterations: " + prediction.getEstimatedIterations());
System.out.println("Confidence:           " + prediction.getConfidence());
```

#### Diagnosing an infinite loop

When a loop is suspected to be infinite, call `diagnoseInfiniteLoop()` to get a structured root-cause analysis:

```java
import org.nd4j.autodiff.samediff.analysis.RootCauseAnalysis;

RootCauseAnalysis diagnosis = analyzer.diagnoseInfiniteLoop("whileLoop");

System.out.println("Primary cause: " + diagnosis.getPrimaryCause());
for (String variable : diagnosis.getProblematicVariables()) {
    System.out.println("  Problematic variable: " + variable
        + " — " + diagnosis.getExplanation(variable));
}
```

#### Human-readable reports

For quick inspection during development, generate a plain-text report:

```java
import org.nd4j.autodiff.samediff.analysis.LoopTerminationErrorReport;

LoopTerminationErrorReport report = analyzer.generateReport("whileLoop");
System.out.println(report.toFormattedString());
```

Example output:

```
=== Loop Termination Report: whileLoop ===
Status          : NOT TERMINATING (high confidence)
Root cause      : Variable 'delta' is DIVERGING — its absolute value is increasing
                  each iteration and the loop condition requires delta < 1e-4.
Problematic vars:
  delta         : DIVERGING  (last 5 values: 1.0, 1.8, 3.2, 5.8, 10.4)
  counter       : CONVERGING (last 5 values: 0, 1, 2, 3, 4)
Recommendation  : Check the operation that updates 'delta'. The update may be
                  missing a normalization step, or the learning rate is too large.
```

#### Nested loops

The analyzer handles nested loops. When the inner loop of a nested construct is misbehaving, the report identifies it by its fully-qualified name:

```java
// Nested loop names follow the pattern outerLoop/innerLoop
TerminationPrediction inner = analyzer.analyzeLoop("outerLoop/innerLoop");
```

***

### Cross-Frame Variable References

Control flow in SameDiff creates *execution frames* — scoped contexts within which variables live. A loop body, for example, runs in its own frame that is distinct from the surrounding graph. When a loop body reads a variable defined outside the loop, that is a cross-frame reference.

#### Why cross-frame references matter

Prior to ADR-0048, cross-frame references were handled ad-hoc and were a common source of subtle bugs — variables would silently resolve to the wrong value or to an uninitialized state. The new engine tracks them explicitly through three control-flow operations:

| Operation | Role                                                                                                                                                                                                              |
| --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Enter`   | Copies a value from an outer frame into an inner frame (loop entry). Creates a tracked alias so the inner frame can always find the original value.                                                               |
| `Switch`  | Routes a value along the true or false branch of a conditional. Records which branch was taken for debugging.                                                                                                     |
| `Merge`   | Combines values from multiple incoming frames (e.g. the two branches of an `if`/`else`, or successive loop iterations). Resolves inputs using a priority strategy: current frame first, then cross-frame aliases. |
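
The priority strategy described for `Merge` (current frame first, then cross-frame aliases) can be modelled in a few lines of plain Java. The `FrameTable` class below is a hypothetical toy, not the engine's data structure: frames are maps of scalar values, `enter()` records an alias the way the table describes `Enter` doing, and `resolve()` applies the two-step lookup.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical frame table illustrating alias-based cross-frame lookup. */
final class FrameTable {
    // frame name -> (variable name -> value)
    private final Map<String, Map<String, Double>> values = new HashMap<>();
    // "innerFrame/var" -> name of the outer frame that owns the original value
    private final Map<String, String> aliases = new HashMap<>();

    void set(String frame, String var, double value) {
        values.computeIfAbsent(frame, f -> new HashMap<>()).put(var, value);
    }

    /** Enter-like step: makes var from outerFrame reachable inside innerFrame. */
    void enter(String outerFrame, String innerFrame, String var) {
        aliases.put(innerFrame + "/" + var, outerFrame);
    }

    /** Merge-like lookup: the current frame wins; otherwise follow the alias. */
    Optional<Double> resolve(String frame, String var) {
        Double local = lookup(frame, var);
        if (local != null) {
            return Optional.of(local);
        }
        String owner = aliases.get(frame + "/" + var);
        if (owner == null) {
            return Optional.empty(); // no Enter was recorded for this variable
        }
        return Optional.ofNullable(lookup(owner, var));
    }

    private Double lookup(String frame, String var) {
        return values.getOrDefault(frame, Collections.<String, Double>emptyMap()).get(var);
    }

    public static void main(String[] args) {
        FrameTable table = new FrameTable();
        table.set("main", "x", 1.0);
        System.out.println(table.resolve("loop", "x")); // Optional.empty
        table.enter("main", "loop", "x");
        System.out.println(table.resolve("loop", "x")); // Optional[1.0]
        table.set("loop", "x", 2.0);
        System.out.println(table.resolve("loop", "x")); // Optional[2.0]
    }
}
```

In this model, a variable that was never passed through `enter()` resolves to nothing inside the loop frame, which mirrors the kind of wiring error that explicit alias tracking is meant to expose.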

#### Observing cross-frame resolution

When `AnalysisLevel.FULL` is active, each control-flow decision is recorded in the execution trace:

```java
import org.nd4j.autodiff.samediff.analysis.ExecutionTrace;
import org.nd4j.autodiff.samediff.analysis.ControlFlowEvent;

ExecutionTrace trace = sd.getExecutionAnalysis().getExecutionTrace();

for (ControlFlowEvent event : trace.getControlFlowEvents()) {
    System.out.printf("[%s] op=%s frame=%s decision=%s%n",
        event.getOpType(),       // ENTER, SWITCH, MERGE
        event.getOpName(),
        event.getFrameName(),
        event.getDecision());    // e.g. "branch=RIGHT", "alias=outerVar@frame0"
}
```

#### Diagnosing cross-frame bugs

If you see a `NullPointerException` or unexpectedly stale values inside a loop body, check the `Merge` resolution log for that variable:

```java
List<ControlFlowEvent> mergeEvents = trace.getControlFlowEvents().stream()
    .filter(e -> e.getOpType().equals("MERGE") && e.getOpName().equals("myVar/Merge"))
    .collect(Collectors.toList());

mergeEvents.forEach(e -> System.out.println(e.toDetailedString()));
```

A `Merge` that reports `resolved via alias` on the first iteration and then fails on later iterations typically indicates the `Enter` alias was not created, meaning the variable was not correctly wired through an `Enter` node into the loop frame.

***

### The initSubgraph Convergence Fix

The original `initSubgraph` method attempted to determine which subgraph to execute by iteratively expanding a set of required nodes. This process had convergence problems: on graphs with certain topologies (particularly those involving `Merge` nodes or back-edges), the expansion would either diverge (include too many nodes) or fail to include all required nodes.

The new engine replaces this entirely with `ForwardExecutionDAGBuilder`, which performs a single deterministic traversal:

1. Starting from the requested output nodes, walk backwards through the graph following dependencies.
2. Assign each node to one of four typed `ExecutionNode` categories:

   | Type              | Purpose                                       |
   | ----------------- | --------------------------------------------- |
   | `VARIABLE_INIT`   | Initialize constants and persistent variables |
   | `PLACEHOLDER_SET` | Inject placeholder values                     |
   | `OPERATION_EXEC`  | Execute a mathematical or control-flow op     |
   | `CONTROL_FLOW`    | Manage frame entry, exit, and merging         |
3. Cache the resulting DAG keyed on the set of requested outputs. Subsequent calls with the same output set reuse the cached plan.
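
The caching step can be pictured as a map from the set of requested outputs to a previously built plan. The following plain-Java sketch assumes the key is order-insensitive, as the word "set" suggests; `PlanCache`, its `builder` function, and the use of a list of op names as the "plan" are hypothetical simplifications rather than the real `DAGCache`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical plan cache keyed on the set of requested output names. */
final class PlanCache {
    private final Map<Set<String>, List<String>> plans = new HashMap<>();
    private final Function<Set<String>, List<String>> builder;
    /** Number of times the (expensive) builder has actually run. */
    int builds = 0;

    PlanCache(Function<Set<String>, List<String>> builder) {
        this.builder = builder;
    }

    List<String> planFor(String... outputs) {
        // A set makes ("softmax", "loss") and ("loss", "softmax") the same key.
        Set<String> key = new HashSet<>(Arrays.asList(outputs));
        return plans.computeIfAbsent(key, k -> {
            builds++;
            return builder.apply(k);
        });
    }

    public static void main(String[] args) {
        // Stand-in builder: the "plan" is just the requested names as a list.
        PlanCache cache = new PlanCache(k -> new ArrayList<String>(k));
        cache.planFor("softmax", "loss");
        cache.planFor("loss", "softmax"); // same set, served from the cache
        cache.planFor("softmax");         // different set, built separately
        System.out.println(cache.builds); // 2
    }
}
```

The one-time cost is paid once per distinct output set, so a service that always requests the same outputs builds its plan only on the first call.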

No user action is required to benefit from this fix. If your graph previously failed with an error like:

```
IllegalStateException: initSubgraph did not converge after N iterations
```

or produced silently wrong results in loop-heavy graphs, re-running against the updated library should resolve the problem without any model changes.

***

### Execution Visualization

When `AnalysisLevel.FULL` is enabled, SameDiff records a detailed execution trace that can be inspected programmatically or printed for offline analysis.

#### Stepping through the execution log

```java
import org.nd4j.autodiff.samediff.analysis.ExecutionTrace;
import org.nd4j.autodiff.samediff.analysis.ExecutionStep;

ExecutionTrace trace = sd.getExecutionAnalysis().getExecutionTrace();

for (ExecutionStep step : trace.getSteps()) {
    System.out.printf("type=%-20s  op=%-30s  frame=%s  iter=%d%n",
        step.getExecType(),     // OPERATION_EXEC, CONTROL_FLOW, VARIABLE_INIT, etc.
        step.getOpName(),
        step.getFrame(),
        step.getIteration());
}
```

#### Focused control-flow debugging

For graphs that mix loops with conditionals, filter the trace to control-flow steps only:

```java
trace.getSteps().stream()
    .filter(s -> s.getExecType() == ExecType.CONTROL_FLOW)
    .forEach(s -> System.out.println(s.toDetailedString()));
```

#### Exporting the trace

For large graphs, export the trace to a file and analyze it separately:

```java
trace.exportToJson(new File("execution_trace.json"));
```

The JSON format records, for each step: op name, frame, iteration counter, input array shapes, output array shapes, and (for control-flow ops) the branch or alias decision made.

***

### Summary of Debugging APIs

| Class / Method                                                | What it tells you                                                                 |
| ------------------------------------------------------------- | --------------------------------------------------------------------------------- |
| `sd.enableExecutionAnalysis(AnalysisLevel.FULL)`              | Activates all analysis features                                                   |
| `VariableEvolutionAnalysis.getPattern(name)`                  | Whether a loop variable is converging, diverging, oscillating, stable, or chaotic |
| `VariableEvolutionAnalysis.getHistory(name)`                  | Full value history across iterations                                              |
| `VariableEvolutionAnalysis.estimateRemainingIterations(name)` | Estimated number of iterations remaining before convergence, if one is available  |
| `LoopTerminationAnalyzer.analyzeLoop(name)`                   | Termination prediction with confidence score                                      |
| `LoopTerminationAnalyzer.diagnoseInfiniteLoop(name)`          | Root-cause analysis of non-terminating loops                                      |
| `LoopTerminationAnalyzer.generateReport(name)`                | Human-readable termination report                                                 |
| `ExecutionTrace.getControlFlowEvents()`                       | Per-op record of every `Enter`/`Switch`/`Merge` decision                          |
| `ExecutionTrace.exportToJson(file)`                           | Full trace for offline analysis                                                   |