Class InferenceSession

java.lang.Object
smile.onnx.InferenceSession
All Implemented Interfaces:
AutoCloseable

public class InferenceSession extends Object implements AutoCloseable
Represents an ONNX Runtime inference session for a single model.

An InferenceSession loads an ONNX model, compiles it (with optional graph optimisations), and exposes the run(Map) method for executing inference.

Typical usage

try (var session = InferenceSession.create("resnet50.onnx")) {
    // Inspect the model
    session.inputNames().forEach(System.out::println);

    // Build the input tensor
    float[] pixels = ...; // 1 × 3 × 224 × 224
    OrtValue input = OrtValue.fromFloatArray(pixels, new long[]{1, 3, 224, 224});
    try {
        // Run inference and read the output
        OrtValue[] outputs = session.run(Map.of("input", input));
        float[] scores = outputs[0].toFloatArray();
        // Close outputs when done
        for (var v : outputs) v.close();
    } finally {
        // Close the input value as well
        input.close();
    }
}
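
A tensor of shape 1 × 3 × 224 × 224 holds 150528 values, so the flat array passed to OrtValue.fromFloatArray presumably needs exactly that many elements. The following is a minimal sketch of a check that could run before building the input. TensorShapes and its methods are hypothetical helpers for illustration, not part of smile.onnx.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of smile.onnx) for checking tensor shapes. */
public class TensorShapes {
    /** Returns the number of elements described by a fully specified shape. */
    static long elementCount(long[] shape) {
        long count = 1;
        for (long dim : shape) {
            if (dim < 0) {
                throw new IllegalArgumentException(
                    "Unresolved dimension in " + Arrays.toString(shape));
            }
            // multiplyExact throws on overflow instead of wrapping silently
            count = Math.multiplyExact(count, dim);
        }
        return count;
    }

    /** Throws if the flat array length does not match the shape. */
    static void requireMatch(float[] data, long[] shape) {
        long expected = elementCount(shape);
        if (data.length != expected) {
            throw new IllegalArgumentException(
                "Expected " + expected + " elements for shape "
                + Arrays.toString(shape) + " but got " + data.length);
        }
    }
}
```

Calling TensorShapes.requireMatch(pixels, new long[]{1, 3, 224, 224}) before creating the OrtValue turns a mismatched buffer into an early, readable error.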

Thread safety

A single InferenceSession may be used concurrently from multiple threads. Each call to run(Map) is independent.
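
Because one session can serve several threads, a common pattern is to share it across a thread pool rather than create one session per thread. The sketch below shows only the fan-out logic. ParallelRunner is a hypothetical helper, not part of smile.onnx, and the infer function stands in for a call on the shared session such as inputs -> session.run(inputs).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper (not part of smile.onnx) that fans requests out over a thread pool. */
public class ParallelRunner {
    /**
     * Applies infer to every request on a fixed-size pool and returns the
     * results in request order. In real use, infer would wrap a call on one
     * shared session.
     */
    static <I, O> List<O> runAll(List<I> requests, Function<I, O> infer, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<O>> futures = new ArrayList<>();
            for (I request : requests) {
                Callable<O> task = () -> infer.apply(request);
                futures.add(pool.submit(task));
            }
            List<O> results = new ArrayList<>();
            for (Future<O> future : futures) {
                results.add(future.get()); // blocks until that request finishes
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for inference", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Inference task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each task would still own its input and output OrtValues and close them itself. Only the session is shared.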

  • Method Details

    • create

      public static InferenceSession create(String modelPath)
      Creates an InferenceSession from a model file path using default session options.
      Parameters:
      modelPath - path to the .onnx model file.
      Returns:
      the loaded session.
    • create

      public static InferenceSession create(String modelPath, SessionOptions sessionOptions)
      Creates an InferenceSession from a model file path using the supplied session options.
      Parameters:
      modelPath - path to the .onnx model file.
      sessionOptions - session configuration.
      Returns:
      the loaded session.
    • create

      public static InferenceSession create(byte[] modelBytes, SessionOptions sessionOptions)
      Creates an InferenceSession from a model already loaded into a byte array (e.g. from a JAR resource).
      Parameters:
      modelBytes - the serialized ONNX model bytes.
      sessionOptions - session configuration.
      Returns:
      the loaded session.
    • create

      public static InferenceSession create(byte[] modelBytes)
      Creates an InferenceSession from a model byte array using default session options.
      Parameters:
      modelBytes - the serialized ONNX model bytes.
      Returns:
      the loaded session.
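
The byte-array overloads are useful when the model ships inside the application JAR. The sketch below shows one way to obtain those bytes from the classpath using only the JDK. ModelBytes and readResource are hypothetical names, not part of smile.onnx.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper (not part of smile.onnx) for loading model bytes from the classpath. */
public class ModelBytes {
    /**
     * Reads a classpath resource fully into memory.
     * The name is resolved relative to the anchor class, or from the
     * classpath root when it starts with "/".
     */
    static byte[] readResource(Class<?> anchor, String resourceName) throws IOException {
        try (InputStream in = anchor.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new FileNotFoundException("Resource not found: " + resourceName);
            }
            return in.readAllBytes(); // requires Java 9 or later
        }
    }
}
```

The result can then be passed to create(byte[]), for example InferenceSession.create(ModelBytes.readResource(MyApp.class, "/models/resnet50.onnx")), where MyApp and the resource path are placeholders.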
    • run

      public OrtValue[] run(Map<String,OrtValue> inputs)
      Runs inference using all model inputs and all model outputs with default run options.
      Parameters:
      inputs - a map of input name → OrtValue.
      Returns:
      an array of output OrtValues in the order returned by outputNames().
    • run

      public OrtValue[] run(Map<String,OrtValue> inputs, String[] outputNames)
      Runs inference for a selected set of outputs with default run options.
      Parameters:
      inputs - a map of input name → OrtValue.
      outputNames - the names of the outputs to compute.
      Returns:
      the requested outputs in the supplied order.
    • run

      public OrtValue[] run(Map<String,OrtValue> inputs, String[] outputNames, RunOptions runOptions)
      Runs inference with explicit run options.
      Parameters:
      inputs - a map of input name → OrtValue.
      outputNames - the names of the outputs to compute.
      runOptions - per-run options, or null for defaults.
      Returns:
      the requested outputs in the supplied order.
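
All run overloads return a positional array, so callers that handle several outputs may prefer to look them up by name. The sketch below pairs each name with the value at the same index. NamedOutputs is a hypothetical helper, not part of smile.onnx, and it is written generically so it does not depend on the OrtValue type.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of smile.onnx) that indexes outputs by name. */
public class NamedOutputs {
    /** Pairs each name with the value at the same position, preserving order. */
    static <V> Map<String, V> byName(List<String> names, V[] values) {
        if (names.size() != values.length) {
            throw new IllegalArgumentException(
                "Got " + names.size() + " names but " + values.length + " values");
        }
        Map<String, V> result = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            result.put(names.get(i), values[i]);
        }
        return result;
    }
}
```

With run(Map) the names would come from session.outputNames(). With the overloads that take outputNames, the names would be List.of(outputNames), since results follow the supplied order.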
    • inputCount

      public int inputCount()
      Returns the number of model inputs.
      Returns:
      input count.
    • outputCount

      public int outputCount()
      Returns the number of model outputs.
      Returns:
      output count.
    • inputInfos

      public List<NodeInfo> inputInfos()
      Returns the input node information list.
      Returns:
      list of NodeInfo for each input.
    • outputInfos

      public List<NodeInfo> outputInfos()
      Returns the output node information list.
      Returns:
      list of NodeInfo for each output.
    • inputNames

      public List<String> inputNames()
      Returns the ordered list of input names.
      Returns:
      input names.
    • outputNames

      public List<String> outputNames()
      Returns the ordered list of output names.
      Returns:
      output names.
    • metadata

      public ModelMetadata metadata()
      Returns metadata associated with the model (producer name, version, etc.).
      Returns:
      the ModelMetadata.
    • close

      public void close()
      Specified by:
      close in interface AutoCloseable
    • toString

      public String toString()
      Overrides:
      toString in class Object