Namespace Microsoft.ML.OnnxRuntime
Classes
CheckpointState
Holds the state of the training session. This class holds the entire training session state that includes model parameters, their gradients, optimizer parameters, and user properties. The TrainingSession leverages the CheckpointState by accessing and updating the contained training state.
note
Note that a training session created with a checkpoint state uses that state to store the entire training state (including model parameters, their gradients, the optimizer state, and user properties). The TrainingSession does not hold a copy of the CheckpointState, so the checkpoint state must outlive the training session.
DisposableNamedOnnxValue
This is a legacy class that is kept for backward compatibility. Use OrtValue based API.
This class serves as a container for model run output values, including tensors, sequences of tensors, sequences, and maps. The class must be disposed of: it disposes of _ortValueHolder, which owns the underlying Ort output value, along with anything else the instance needs to release. Use the factory method CreateFromOrtValue to obtain an instance of the class.
FixedBufferOnnxValue
This is a legacy class that is kept for backward compatibility. Use OrtValue based API.
Represents an OrtValue with its underlying buffer pinned
InferenceSession
Represents an inference session over an ONNX model. This is an IDisposable class and must be disposed of, either with an explicit call to the Dispose() method or with a using() block. If it is a member of another class, that class must also implement IDisposable and dispose of the InferenceSession in its own Dispose() method.
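A minimal usage sketch under the OrtValue based API. The model path and the node names "input"/"output" are placeholders; query session.InputMetadata and session.OutputMetadata for the real names, and verify the Run overload against your package version.

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

// Dispose of the session deterministically with a using declaration.
using var session = new InferenceSession("model.onnx"); // placeholder path

// Create an input tensor over a managed float array (shape [1, 3]).
float[] data = { 1f, 2f, 3f };
using var input = OrtValue.CreateTensorValueFromMemory(data, new long[] { 1, 3 });

using var runOptions = new RunOptions();
// "input" and "output" are placeholder node names.
using var results = session.Run(runOptions,
    new[] { "input" }, new[] { input }, new[] { "output" });

// Read the first output tensor without copying.
ReadOnlySpan<float> output = results[0].GetTensorDataAsSpan<float>();
```

Because the results collection is an IDisposableReadOnlyCollection<OrtValue>, disposing it releases every contained output value at once.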
MapMetadata
Represents map metadata. The key is always a tensor of a primitive element type; the value is a recursive structure that may contain other maps, sequences, or tensors.
ModelMetadata
A class that queries and caches model metadata and exposes it as properties
NamedOnnxValue
A container that associates a name with an input or output value. This is a legacy class kept for backward compatibility; prefer the OrtValue based API.
NodeMetadata
Represents the type and shape information of session-graph nodes, used for communicating the shape/type of input/output nodes
OnnxRuntimeException
The exception that is thrown for errors related to OnnxRuntime
OptionalMetadata
The class contains metadata for an optional input/output
OrtAllocator
The class exposes native internal allocator for Onnxruntime. This allocator enables you to allocate memory from the internal memory pools including device allocations. Useful for binding.
OrtArenaCfg
This class encapsulates arena configuration information that will be used to define the behavior of an arena-based allocator. See docs/C_API.md for more details.
OrtCUDAProviderOptions
Holds the options for configuring a CUDA Execution Provider instance
OrtEnv
The singleton class OrtEnv contains the process-global ONNX Runtime environment. It sets up logging, creates system wide thread-pools (if Thread Pool options are provided) and other necessary things for OnnxRuntime to function.
Create or access OrtEnv by calling the Instance() method. Instance() can be called multiple times and always returns the same instance.
CreateInstanceWithOptions() provides a way to create the environment with options. It must be called before the first call to Instance(); otherwise it has no effect.
If the environment is not explicitly created, it will be created as needed, e.g., when creating a SessionOptions instance.
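A hedged sketch of explicit environment creation. The available fields on EnvironmentCreationOptions vary by package version, so none are set here; check the struct's definition for what can be configured.

```csharp
using Microsoft.ML.OnnxRuntime;

// Must run before anything else (e.g. creating a SessionOptions instance)
// causes the environment to be created implicitly.
var options = new EnvironmentCreationOptions();
OrtEnv env = OrtEnv.CreateInstanceWithOptions(ref options);

// Subsequent calls return the same singleton instance.
OrtEnv same = OrtEnv.Instance();
```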
OrtExternalAllocation
This class represents an arbitrary buffer of memory allocated and owned by the user. It can be either a CPU, GPU or other device memory that can be suitably represented by IntPtr. This is just a composite of the buffer related information. The memory is assumed to be pinned if necessary and usable immediately in the native code.
OrtIoBinding
This class enables binding of inputs and/or outputs to pre-allocated memory. This enables interesting scenarios. For example, if your input already resides in some pre-allocated memory, such as GPU memory, you can bind that piece of memory to an input name and shape, and onnxruntime will use it as the input. Traditional inputs that already exist as Tensors can also be bound.
Note that this arrangement is designed to minimize data copies, so your memory allocations must match what the model expects, whether you run on CPU or GPU. A data copy is still made if your pre-allocated memory location does not match the one expected by the model. However, with OrtIoBinding such copies happen only once, at binding time, not at run time. This means that if your input data required a copy, further modifications to that buffer will not be seen by onnxruntime unless you rebind it, even if it is the same buffer. If your scenario requires data to be copied on every run, OrtIoBinding may not be the best match for your use case. The fact that no copies are made at run time also has performance implications.
Making OrtValue a first-class citizen in the ORT C# API practically obsoletes all of the existing overloads, because an OrtValue can be created on top of all other types of memory. There is no need to designate it as an external or Ort allocation, or to wrap it in FixedBufferOnnxValue; the latter does not support rebinding or non-CPU memory anyway.
In fact, one can now create OrtValues over arbitrary pieces of memory (managed, native, stack, and device/GPU) and feed them to the model, achieving the same effect without using the IOBinding class.
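A sketch of binding a pre-allocated input. Paths and node names are placeholders, and the OrtValue overload of BindInput is assumed to be present in your package version.

```csharp
using Microsoft.ML.OnnxRuntime;

using var session = new InferenceSession("model.onnx");   // placeholder path
using var binding = session.CreateIoBinding();

// The input buffer; any copy needed happens at bind time, not at run time.
float[] inputData = { 1f, 2f, 3f, 4f };
using var inputValue = OrtValue.CreateTensorValueFromMemory(inputData, new long[] { 1, 4 });

binding.BindInput("input", inputValue);                              // placeholder name
binding.BindOutputToDevice("output", OrtMemoryInfo.DefaultInstance); // placeholder name

using var runOptions = new RunOptions();
session.RunWithBinding(runOptions, binding);

// Retrieve the bound outputs; disposing the collection releases them all.
using var outputs = binding.GetOutputValues();
```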
OrtMemoryAllocation
This class represents memory allocation made by a specific onnxruntime allocator. Use OrtAllocator.Allocate() to obtain an instance of this class. It implements IDisposable and makes use of the original allocator used to allocate the memory. The lifespan of the allocator instance must eclipse the lifespan of the allocation. Or, if you prefer, all OrtMemoryAllocation instances must be disposed of before the corresponding allocator instances are disposed of.
OrtMemoryInfo
This class encapsulates, and most of the time owns, the underlying native OrtMemoryInfo instance. An instance returned from OrtAllocator does not own the OrtMemoryInfo, but the class must be disposed of regardless.
Use this class to query and create OrtAllocator instances so you can pre-allocate memory for model inputs/outputs and use it for binding. Instances of the class can also be used to create OrtValues bound to pre-allocated memory. In that case, the instance of OrtMemoryInfo contains the information about the allocator used to allocate the underlying memory.
OrtROCMProviderOptions
Holds the options for configuring a ROCm Execution Provider instance
OrtTensorRTProviderOptions
Holds the options for configuring a TensorRT Execution Provider instance
OrtThreadingOptions
This class allows you to specify global thread pool options when instantiating the ONNX Runtime environment for the first time.
OrtTypeInfo
This class retrieves Type Information for input/outputs of the model.
OrtValue
Represents a disposable OrtValue. This class exposes a native instance of OrtValue. The class implements IDisposable and must be disposed of, otherwise native resources will leak and will eventually cause the application to slow down or crash.
If the OrtValue instance is constructed over managed memory and is not disposed of properly, the pinned memory remains pinned and interferes with GC operation.
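Two hedged creation sketches. The first pins a managed array until Dispose() runs; the second lets onnxruntime allocate and own the buffer. CreateAllocatedTensorValue is assumed to be available in your package version.

```csharp
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// Over managed memory: the array stays pinned until Dispose() runs.
float[] data = { 1f, 2f, 3f, 4f };
using (var value = OrtValue.CreateTensorValueFromMemory(data, new long[] { 2, 2 }))
{
    // ... feed to a session ...
} // Dispose() releases the pin here; without it, GC is impeded.

// Alternatively, have onnxruntime allocate and own the underlying buffer.
using var allocated = OrtValue.CreateAllocatedTensorValue(
    OrtAllocator.DefaultInstance, TensorElementType.Float, new long[] { 2, 2 });
```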
PrePackedWeightsContainer
This class holds pre-packed weights of shared initializers so they can be shared across the sessions that use them, providing memory savings through a single pre-packed version of each shared initializer.
ProviderOptionsValueHelper
This helper class contains methods to handle values of provider options
RunOptions
Holds options that control the behavior of individual Run() calls.
SequenceMetadata
Represents sequence metadata
SessionOptions
Holds the options for creating an InferenceSession. It forces the instantiation of the OrtEnv singleton.
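A brief sketch. The model path is a placeholder, and the CUDA line requires the GPU package, so it is commented out.

```csharp
using Microsoft.ML.OnnxRuntime;

using var options = new SessionOptions();
options.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;
// options.AppendExecutionProvider_CUDA(0); // only with the GPU package installed

// Creating the session forces OrtEnv instantiation if it does not exist yet.
using var session = new InferenceSession("model.onnx", options); // placeholder path
```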
SessionOptionsContainer
Helper to allow the creation/addition of session options based on pre-defined named entries.
TensorTypeAndShape
Represents a tensor's element type and shape
TrainingSession
Trainer class that provides training, evaluation and optimizer methods for training an ONNX model.
The training session requires four training artifacts:
- The training onnx model
- The evaluation onnx model (optional)
- The optimizer onnx model
- The checkpoint directory
These artifacts can be generated using the onnxruntime-training python utility.
This is an IDisposable class and must be disposed of, either with an explicit call to the Dispose() method or with a using() block. If it is a member of another class, that class must also implement IDisposable and dispose of the TrainingSession in its own Dispose() method.
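A hedged sketch of a training loop. The artifact paths are placeholders for files produced by the onnxruntime-training python utility, and building the per-batch input OrtValues is elided.

```csharp
using Microsoft.ML.OnnxRuntime;

// The checkpoint state must outlive the training session (see CheckpointState).
using var state = CheckpointState.LoadCheckpoint("checkpoint");        // placeholder path
using var trainer = new TrainingSession(state,
    "training_model.onnx", "eval_model.onnx", "optimizer_model.onnx"); // placeholders

for (int step = 0; step < 100; step++)
{
    // using var loss = trainer.TrainStep(batchInputs); // batchInputs: IReadOnlyCollection<OrtValue>
    trainer.OptimizerStep();   // apply the accumulated gradients
    trainer.LazyResetGrad();   // reset gradients before the next TrainStep
}
```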
TrainingUtils
This class defines utility methods for training.
Structs
BFloat16
This value type represents a BFloat16 value. See https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus for details. It is blittable as defined in https://docs.microsoft.com/en-us/dotnet/framework/interop/blittable-and-non-blittable-types and, as such, is represented the same way in managed and native memory. This means that arrays of this type do not have to be copied to be passed to native code; they are simply pinned and read by it. Thus, one can create a Tensor on top of an array of these structures and feed it directly to the Onnxruntime library. Binary-wise, it is the same as ushort[] (uint16_t in C++); however, we would like a separate type for type dispatching.
EnvironmentCreationOptions
Options you might want to supply when creating the environment. Everything is optional.
Float16
This value type represents a Float16 value. It is blittable as defined in https://docs.microsoft.com/en-us/dotnet/framework/interop/blittable-and-non-blittable-types and, as such, is represented the same way in managed and native memory. This means that arrays of this type do not have to be copied to be passed to native code; they are simply pinned and read by it. Thus, one can create a Tensor on top of an array of these structures and feed it directly to the Onnxruntime library. Binary-wise, it is the same as ushort[] (uint16_t in C++); however, we would like a separate type for type dispatching.
The implementation is derived from https://source.dot.net/#System.Private.CoreLib/src/libraries/System.Private.CoreLib/src/System/Half.cs,7895d5942d33f974
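Because the type is blittable, a managed Float16[] can back a tensor directly. A sketch, assuming the explicit conversion operator from float is defined (as it is on System.Half, from which the implementation derives):

```csharp
using Microsoft.ML.OnnxRuntime;

// No copy: the array is pinned and read in place by native code.
Float16[] halves = { (Float16)1.0f, (Float16)0.5f };
using var tensor = OrtValue.CreateTensorValueFromMemory(halves, new long[] { 2 });
```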
MarshaledString
This struct converts a string to a UTF-8 encoded byte array and then copies it to an unmanaged buffer. This is done so we can pass it to native code and avoid pinning.
MarshaledStringArray
Keeps a list of MarshaledString instances and provides a way to dispose of them all at once. It is a ref struct, so it cannot be IDisposable.
OrtMapTypeInfo
Represents Map input/output information.
At run time, map keys are represented by a tensor of a primitive type, and values are represented either by a Tensor/Sequence/Optional or by another map.
OrtSequenceOrOptionalTypeInfo
Represents Sequence type information.
OrtTensorTypeAndShapeInfo
This struct represents type and shape information for a tensor. It may describe a tensor type that is a model input or output, or information extracted from a tensor contained in an OrtValue.
Interfaces
IDisposableReadOnlyCollection<T>
Represents an immutable collection of results that can be disposed of as a whole
Enums
CoreMLFlags
CoreML flags for use with SessionOptions
ExecutionMode
Controls whether you want to execute operators in the graph sequentially or in parallel. Usually when the model has many branches, setting this option to ExecutionMode.ORT_PARALLEL will give you better performance. See [ONNX_Runtime_Perf_Tuning.md] for more details.
GraphOptimizationLevel
Graph optimization level to use with SessionOptions [https://github.com/microsoft/onnxruntime/blob/main/docs/ONNX_Runtime_Graph_Optimizations.md]
NnapiFlags
NNAPI flags for use with SessionOptions
OnnxValueType
A type of data that OrtValue encapsulates.
OrtAllocatorType
See documentation for OrtAllocatorType in C API
OrtLoggingLevel
Log severity levels
Must be kept in sync with OrtLoggingLevel in onnxruntime_c_api.h
OrtMemType
See documentation for OrtMemType in C API
Delegates
DOrtLoggingFunction
Delegate for logging function callback. Supply your function and register it with the environment to receive logging callbacks via EnvironmentCreationOptions
OrtValue.MapVisitor
A public delegate that is invoked once with map keys and values. The delegate spares you from managing the lifespan of intermediate OrtValues. Typically, when one uses the GetValue() API, it creates a copy of an OrtValue that points to the same buffer as the keys or values. This API helps to deal with those temporary instances and avoid leaks.
According to ONNX standard map keys can be unmanaged types only (or strings). Those keys are contained in a single tensor within OrtValue keys. So you can query those directly from keys argument.
Map values, on the other hand, can be composite types. The values parameter can either contain a single tensor with unmanaged map values with the same number of elements as the keys, or it can be a sequence of OrtValues, each of those can be a composite type (tensor, sequence, map). If it is a sequence, then the number of elements must match the number of elements in keys.
Depending on the structure of the values, one will either directly query a single tensor from values, or will have to iterate over the sequence of OrtValues and visit each of those resulting in a recursive visitation.
OrtValue.SequenceElementVisitor
A delegate type that is expected to process each OrtValue in a sequence.