Plugin Execution Provider Libraries
Background
An ONNX Runtime Execution Provider (EP) executes model operations on one or more hardware accelerators (e.g., GPU, NPU, etc.). ONNX Runtime provides a variety of built-in EPs, such as the default CPU EP. To enable further extensibility, ONNX Runtime supports user-defined plugin EP libraries that an application can register with ONNX Runtime for use in an ONNX Runtime inference session.
This page provides a reference for the APIs necessary to develop and use plugin EP libraries with ONNX Runtime.
Creating a plugin EP library
A plugin EP is built as a dynamic/shared library that exports the functions CreateEpFactories() and ReleaseEpFactory(). ONNX Runtime calls CreateEpFactories() to obtain one or more instances of OrtEpFactory. An OrtEpFactory creates OrtEp instances and specifies the hardware devices supported by the EPs it creates.
The ONNX Runtime repository includes a sample plugin EP library, which is referenced in the following sections.
Defining an OrtEp
An OrtEp represents an instance of an EP that is used by an ONNX Runtime session to identify and execute the model operations supported by the EP.
The following table lists the required variables and functions that an implementer must define for an OrtEp.
Field | Summary | Example implementation |
---|---|---|
ort_version_supported | The ONNX Runtime version with which the EP was compiled. Implementation should set this to ORT_API_VERSION. | ExampleEp() |
GetName | Get the execution provider name. | ExampleEp::GetNameImpl() |
GetCapability | Get information about the nodes/subgraphs supported by the OrtEp instance. | ExampleEp::GetCapabilityImpl() |
Compile | Compile OrtGraph instances assigned to the OrtEp. Implementation must set an OrtNodeComputeInfo instance for each OrtGraph in order to define its computation function. If the session is configured to generate a pre-compiled model, the execution provider must return count number of EPContext nodes. | ExampleEp::CompileImpl() |
ReleaseNodeComputeInfos | Release OrtNodeComputeInfo instances. | ExampleEp::ReleaseNodeComputeInfosImpl() |
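The example implementations listed above appear to follow a common pattern for C-style interfaces: the interface is a struct of function pointers, and the implementation derives from it and assigns static member functions in its constructor. The following is a minimal sketch of that pattern only. MockEp, SketchEp, kMockApiVersion, and QueryName are hypothetical stand-ins; the real OrtEp declaration in onnxruntime_ep_c_api.h has more fields and different signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a C-style EP interface (NOT the real OrtEp).
struct MockEp {
  uint32_t ort_version_supported;
  const char* (*GetName)(const MockEp* this_ptr);
};

// Stands in for ORT_API_VERSION in this sketch.
constexpr uint32_t kMockApiVersion = 1;

// The implementation derives from the interface struct and wires up the
// function pointers in its constructor.
class SketchEp : public MockEp {
 public:
  explicit SketchEp(std::string name) : MockEp{}, name_(std::move(name)) {
    ort_version_supported = kMockApiVersion;
    GetName = GetNameImpl;
  }

 private:
  // Static "Impl" function: recovers the derived object from the interface pointer.
  static const char* GetNameImpl(const MockEp* this_ptr) {
    const auto* ep = static_cast<const SketchEp*>(this_ptr);
    return ep->name_.c_str();
  }

  std::string name_;
};

// How a caller that only sees the interface struct would query the name.
inline std::string QueryName(const MockEp* ep) {
  assert(ep != nullptr && ep->GetName != nullptr);
  return ep->GetName(ep);
}
```

Because the caller only ever sees the base struct, the implementation can keep arbitrary private state (here, name_) without exposing it across the library boundary.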
The following table lists the optional functions that an implementer may define for an OrtEp. If an optional OrtEp function is not defined, ONNX Runtime uses a default implementation.
Field | Summary | Example implementation |
---|---|---|
GetPreferredDataLayout | Get the EP's preferred data layout. If this function is not implemented, ORT assumes that the EP prefers the data layout OrtEpDataLayout::NCHW. | |
ShouldConvertDataLayoutForOp | Given an op with domain domain and type op_type, determine whether an associated node's data layout should be converted to target_data_layout. If the EP prefers a non-default data layout, this function is called during layout transformation with target_data_layout set to the EP's preferred data layout. Implementation of this function is optional. If an EP prefers a non-default data layout, it may implement this function to customize data layout preferences for specific ops at a finer granularity. | |
SetDynamicOptions | Set dynamic options on this EP. Dynamic options can be set by the application at any time after session creation with OrtApi::SetEpDynamicOptions(). Implementation of this function is optional. An EP should only implement this function if it needs to handle any dynamic options. | |
OnRunStart | Called by ORT to notify the EP of the start of a run. Implementation of this function is optional. An EP should only implement this function if it needs to handle application-provided options at the start of a run. | |
OnRunEnd | Called by ORT to notify the EP of the end of a run. Implementation of this function is optional. An EP should only implement this function if it needs to handle application-provided options at the end of a run. | |
CreateAllocator | Create an OrtAllocator for the given OrtMemoryInfo for an OrtSession. The OrtMemoryInfo instance will match one of the values set in the OrtEpDevice using EpDevice_AddAllocatorInfo. Any allocator-specific options should be read from the session options. Implementation of this function is optional. If not provided, ORT will use `OrtEpFactory::CreateAllocator()`. | |
CreateSyncStreamForDevice | Create a synchronization stream for the given memory device for an OrtSession. The stream is used to synchronize the execution provider's operations on the device during model execution. Any stream-specific options should be read from the session options. Implementation of this function is optional. If not provided, ORT will use `OrtEpFactory::CreateSyncStreamForDevice()`. | |
GetCompiledModelCompatibilityInfo | Get a string with details about the EP stack used to produce a compiled model. The compatibility information string can be used with OrtEpFactory::ValidateCompiledModelCompatibilityInfo to determine if a compiled model is compatible with the EP. | |
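The default-when-undefined behavior described above can be pictured as a null check on the function pointer. The sketch below is a hypothetical illustration and not ONNX Runtime's internal code: LayoutEp, MockLayout, ResolvePreferredLayout, and PreferNhwc are invented names, and the real GetPreferredDataLayout signature differs.

```cpp
#include <cassert>

// Hypothetical stand-in for OrtEpDataLayout.
enum class MockLayout { NCHW, NHWC };

// Hypothetical stand-in for an EP interface with one optional function.
struct LayoutEp {
  // Optional: an EP that does not care about layout leaves this as nullptr.
  MockLayout (*GetPreferredDataLayout)(const LayoutEp* this_ptr) = nullptr;
};

// Caller side: use the EP's answer if the function is defined,
// otherwise fall back to the documented default (NCHW).
inline MockLayout ResolvePreferredLayout(const LayoutEp* ep) {
  assert(ep != nullptr);
  if (ep->GetPreferredDataLayout == nullptr) {
    return MockLayout::NCHW;
  }
  return ep->GetPreferredDataLayout(ep);
}

// Example optional implementation for an EP that prefers channel-last data.
inline MockLayout PreferNhwc(const LayoutEp* /*this_ptr*/) {
  return MockLayout::NHWC;
}
```

An EP author therefore only pays the cost of implementing an optional function when the default does not fit the hardware.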
Defining an OrtEpFactory
An OrtEpFactory represents an instance of an EP factory that is used by an ONNX Runtime session to query device support, create allocators, create data transfer objects, and create instances of an EP (i.e., an OrtEp).
The following table lists the required variables and functions that an implementer must define for an OrtEpFactory.
Field | Summary | Example implementation |
---|---|---|
ort_version_supported | The ONNX Runtime version with which the EP was compiled. Implementation should set this to ORT_API_VERSION. | ExampleEpFactory() |
GetName | Get the name of the EP that the factory creates. Must match OrtEp::GetName(). | ExampleEpFactory::GetNameImpl() |
GetVendor | Get the name of the vendor that owns the EP that the factory creates. | ExampleEpFactory::GetVendor() |
GetVendorId | Get the vendor ID of the vendor that owns the EP that the factory creates. This is typically the PCI vendor ID. | ExampleEpFactory::GetVendorId() |
GetVersion | Get the version of the EP that the factory creates. The version string should adhere to the Semantic Versioning 2.0 specification. | ExampleEpFactory::GetVersionImpl() |
GetSupportedDevices | Get information about the OrtHardwareDevice instances supported by an EP created by the factory. | ExampleEpFactory::GetSupportedDevicesImpl() |
CreateEp | Creates an OrtEp instance for use in an ONNX Runtime session. ORT calls OrtEpFactory::ReleaseEp() to release the instance. | ExampleEpFactory::CreateEpImpl() |
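Since GetVersion is expected to return a Semantic Versioning 2.0 string, an EP author may want a quick sanity check on the value they return. The sketch below is a simplified, hypothetical helper (HasSemVerCore and IsNumericIdentifier are invented names, not part of ONNX Runtime). It validates only the MAJOR.MINOR.PATCH core and does not validate the contents of any pre-release or build metadata suffix.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// True if `id` is a non-empty run of digits with no leading zero
// (a lone "0" is allowed), as SemVer requires for numeric identifiers.
inline bool IsNumericIdentifier(const std::string& id) {
  if (id.empty()) return false;
  for (unsigned char c : id) {
    if (!std::isdigit(c)) return false;
  }
  return id.size() == 1 || id[0] != '0';
}

// Simplified check: the text before the first '-' or '+' must be exactly
// three dot-separated numeric identifiers. The suffix itself is NOT validated.
inline bool HasSemVerCore(const char* version) {
  assert(version != nullptr);
  const std::string full(version);
  const std::string core = full.substr(0, full.find_first_of("-+"));

  int parts = 0;
  std::string::size_type start = 0;
  while (true) {
    const std::string::size_type dot = core.find('.', start);
    const std::string id = core.substr(
        start, dot == std::string::npos ? std::string::npos : dot - start);
    if (!IsNumericIdentifier(id)) return false;
    ++parts;
    if (dot == std::string::npos) break;
    start = dot + 1;
  }
  return parts == 3;
}
```

A check like this fits naturally in a unit test for the factory rather than in the factory itself.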
The following table lists the optional functions that an implementer may define for an OrtEpFactory.
Field | Summary | Example implementation |
---|---|---|
ValidateCompiledModelCompatibilityInfo | Validate the compatibility of a compiled model with the EP. This function validates whether a model produced with the supplied compatibility information string is supported by the underlying EP. The implementation should check if a compiled model is compatible with the EP and return the appropriate OrtCompiledModelCompatibility value. | |
CreateAllocator | Create an OrtAllocator that can be shared across sessions for the given OrtMemoryInfo. The factory that creates the EP is responsible for providing the allocators required by the EP. The OrtMemoryInfo instance will match one of the values set in the OrtEpDevice using EpDevice_AddAllocatorInfo. | ExampleEpFactory::CreateAllocatorImpl() |
ReleaseAllocator | Releases an OrtAllocator instance created by the factory. | ExampleEpFactory::ReleaseAllocatorImpl() |
CreateDataTransfer | Creates an OrtDataTransferImpl instance for the factory. An OrtDataTransferImpl can be used to copy data between devices that the EP supports. | ExampleEpFactory::CreateDataTransferImpl() |
IsStreamAware | Returns true if the EPs created by the factory are stream-aware. | ExampleEpFactory::IsStreamAwareImpl() |
CreateSyncStreamForDevice | Creates a synchronization stream for the given OrtMemoryDevice. This is used to create a synchronization stream for the OrtMemoryDevice that can be used for operations outside of a session. | ExampleEpFactory::CreateSyncStreamForDeviceImpl() |
Exporting functions to create and release factories
ONNX Runtime expects a plugin EP library to export certain functions/symbols. The following table lists the functions that have to be exported from the plugin EP library.
Function | Description | Example implementation |
---|---|---|
CreateEpFactories | ONNX Runtime calls this function to create OrtEpFactory instances. | ExampleEp: CreateEpFactories |
ReleaseEpFactory | ONNX Runtime calls this function to release an OrtEpFactory instance. | ExampleEp: ReleaseEpFactory |
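The create/release pairing of the two exported functions can be sketched as follows. This is a simplified illustration under stated assumptions: MockEpFactory, MockStatus, and the function names with the Sketch suffix are hypothetical, and the parameter lists are reduced. The real exported functions are declared in onnxruntime_ep_c_api.h and take ONNX Runtime types; consult that header for the exact signatures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for OrtEpFactory.
struct MockEpFactory {
  const char* ep_name;
};

// Hypothetical status code for this sketch: 0 means success.
using MockStatus = int;

extern "C" {

// Fills `factories` with up to `max_factories` newly created factories and
// reports how many were written via `num_factories`.
MockStatus CreateEpFactoriesSketch(MockEpFactory** factories,
                                   std::size_t max_factories,
                                   std::size_t* num_factories) {
  assert(factories != nullptr && num_factories != nullptr);
  if (max_factories < 1) {
    *num_factories = 0;
    return 1;  // Not enough room for the single factory this library provides.
  }
  factories[0] = new MockEpFactory{"my_ep"};
  *num_factories = 1;
  return 0;
}

// Releases a factory previously returned by CreateEpFactoriesSketch.
// The library that allocated the factory is also the one that frees it.
MockStatus ReleaseEpFactorySketch(MockEpFactory* factory) {
  delete factory;
  return 0;
}

}  // extern "C"
```

Keeping allocation and deallocation inside the same library avoids mismatched allocators across the shared-library boundary, which is presumably why the release function is exported alongside the create function.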
Using a plugin EP library
Plugin EP library registration
The sample application code below uses the RegisterExecutionProviderLibrary and UnregisterExecutionProviderLibrary API functions to register and unregister a plugin EP library.
const char* lib_registration_name = "ep_lib_name";
Ort::Env env;

// Register plugin EP library with ONNX Runtime.
env.RegisterExecutionProviderLibrary(
    lib_registration_name,   // Registration name can be anything the application chooses.
    ORT_TSTR("ep_path.dll")  // Path to the plugin EP library.
);

{
  Ort::Session session(env, /*...*/);
  // Run a model ...
}

// Unregister the library using the application-specified registration name.
// Must only unregister a library after all sessions that use the library have been released.
env.UnregisterExecutionProviderLibrary(lib_registration_name);
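One way to make the "unregister only after all sessions are released" ordering harder to get wrong is a small RAII guard. The sketch below is an assumption-laden illustration, not part of the ONNX Runtime API: ScopedEpLibrary and RecordingEnv are hypothetical names. The guard is templated on the environment type so that it only relies on the two member functions used in the sample above; RecordingEnv is a stand-in used to exercise the guard without ONNX Runtime.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical RAII guard: registers the library on construction and
// unregisters it on destruction. EnvT must provide
// RegisterExecutionProviderLibrary(name, path) and
// UnregisterExecutionProviderLibrary(name).
template <typename EnvT, typename PathT>
class ScopedEpLibrary {
 public:
  ScopedEpLibrary(EnvT& env, std::string registration_name, const PathT& library_path)
      : env_(env), name_(std::move(registration_name)) {
    assert(!name_.empty());
    env_.RegisterExecutionProviderLibrary(name_.c_str(), library_path);
  }

  // Note: if the unregister call can throw, a real implementation would
  // need to catch here, because destructors are noexcept by default.
  ~ScopedEpLibrary() { env_.UnregisterExecutionProviderLibrary(name_.c_str()); }

  ScopedEpLibrary(const ScopedEpLibrary&) = delete;
  ScopedEpLibrary& operator=(const ScopedEpLibrary&) = delete;

 private:
  EnvT& env_;
  std::string name_;
};

// Stand-in environment that records calls, for demonstration only.
struct RecordingEnv {
  std::vector<std::string> calls;

  void RegisterExecutionProviderLibrary(const char* name, const std::string& path) {
    calls.push_back(std::string("register:") + name + ":" + path);
  }
  void UnregisterExecutionProviderLibrary(const char* name) {
    calls.push_back(std::string("unregister:") + name);
  }
};
```

If the guard is declared before any session in the same scope, C++ destroys the sessions first (reverse order of construction), so the library is unregistered only after they are gone.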
As shown in the following sequence diagram, registering a plugin EP library causes ONNX Runtime to load the library and call the library’s CreateEpFactories() function. During the call to CreateEpFactories(), ONNX Runtime determines the subset of hardware devices supported by each factory by calling OrtEpFactory::GetSupportedDevices() with all hardware devices that ONNX Runtime discovered during initialization.
The factory returns OrtEpDevice instances from OrtEpFactory::GetSupportedDevices(). Each OrtEpDevice instance pairs a factory with a hardware device that the factory supports. For example, if a single factory instance supports both CPU and NPU, then the call to OrtEpFactory::GetSupportedDevices() returns two OrtEpDevice instances:
- ep_device_0: (factory_0, CPU)
- ep_device_1: (factory_0, NPU)
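The pairing described above can be modeled as filtering the discovered devices by what a factory supports. The following sketch uses hypothetical stand-in types (DeviceType, MockHardwareDevice, MockFactory, MockEpDevice) and a hypothetical function PairSupportedDevices; it illustrates the idea only and is not how OrtEpFactory::GetSupportedDevices() is declared.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the ONNX Runtime types discussed above.
enum class DeviceType { CPU, GPU, NPU };

struct MockHardwareDevice {
  DeviceType type;
};

struct MockFactory {
  std::string ep_name;
  std::vector<DeviceType> supported;  // Device classes this factory can target.
};

// Pairs a factory with one hardware device it supports.
struct MockEpDevice {
  const MockFactory* factory;
  const MockHardwareDevice* device;
};

// Given every device discovered at startup, return one pairing per device
// that the factory supports. The returned pointers refer to `factory` and
// to elements of `discovered`, so both must outlive the result.
inline std::vector<MockEpDevice> PairSupportedDevices(
    const MockFactory& factory, const std::vector<MockHardwareDevice>& discovered) {
  std::vector<MockEpDevice> pairs;
  for (const MockHardwareDevice& device : discovered) {
    const bool supported =
        std::find(factory.supported.begin(), factory.supported.end(), device.type) !=
        factory.supported.end();
    if (supported) {
      pairs.push_back(MockEpDevice{&factory, &device});
    }
  }
  // Each discovered device is paired with this factory at most once.
  assert(pairs.size() <= discovered.size());
  return pairs;
}
```

With a CPU, a GPU, and an NPU discovered and a factory that supports CPU and NPU, the function yields two pairings, matching the ep_device_0 and ep_device_1 example above.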
Session creation with explicit OrtEpDevice(s)
The application code below uses the API function SessionOptionsAppendExecutionProvider_V2 to add an EP from a library to an ONNX Runtime session.
The application first calls GetEpDevices to get a list of OrtEpDevices available to the application. Each OrtEpDevice represents a hardware device supported by an OrtEpFactory. The SessionOptionsAppendExecutionProvider_V2 function takes an array of OrtEpDevice instances as input, where all OrtEpDevice instances refer to the same OrtEpFactory.
#include <cstring>
#include <vector>
#include "onnxruntime_cxx_api.h"

Ort::Env env;
env.RegisterExecutionProviderLibrary(/*...*/);

{
  std::vector<Ort::ConstEpDevice> ep_devices = env.GetEpDevices();

  // Find the Ort::ConstEpDevice for "my_ep".
  std::vector<Ort::ConstEpDevice> selected_ep_devices;
  for (Ort::ConstEpDevice ep_device : ep_devices) {
    if (std::strcmp(ep_device.EpName(), "my_ep") == 0) {
      selected_ep_devices.push_back(ep_device);
      break;
    }
  }

  if (selected_ep_devices.empty()) {
    // Did not find EP. Report application error ...
  }

  Ort::KeyValuePairs ep_options(/*...*/);  // Optional EP options.
  Ort::SessionOptions session_options;
  session_options.AppendExecutionProvider_V2(env, selected_ep_devices, ep_options);

  Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);
  // Run model ...
}

env.UnregisterExecutionProviderLibrary(/*...*/);
As shown in the following sequence diagram, ONNX Runtime calls OrtEpFactory::CreateEp() during session creation in order to create an instance of the plugin EP.
Session creation with automatic EP selection
The application code below uses the API function SessionOptionsSetEpSelectionPolicy to have ONNX Runtime automatically select an EP based on the user’s policy (e.g., PREFER_NPU). If the plugin EP library registered with ONNX Runtime has a factory that supports NPU, then ONNX Runtime may select an EP from that factory to run the model.
Ort::Env env;
env.RegisterExecutionProviderLibrary(/*...*/);

{
  Ort::SessionOptions session_options;
  session_options.SetEpSelectionPolicy(OrtExecutionProviderDevicePolicy_PREFER_NPU);

  Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);
  // Run model ...
}

env.UnregisterExecutionProviderLibrary(/*...*/);
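To make the idea of a device preference concrete, the sketch below shows one possible "prefer a device class, otherwise fall back to CPU" rule. This is an assumption for illustration only and does not describe ONNX Runtime's actual selection algorithm; Candidate, DeviceType, and SelectPreferring are hypothetical names. It is the kind of logic an application might express in a custom selection policy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins: a candidate pairs an EP name with a device class.
enum class DeviceType { CPU, GPU, NPU };

struct Candidate {
  std::string ep_name;
  DeviceType device;
};

// Returns the first candidate on the preferred device class. If none exists,
// returns the first CPU candidate, or nullptr if there is no CPU candidate.
inline const Candidate* SelectPreferring(DeviceType preferred,
                                         const Candidate* candidates,
                                         std::size_t count) {
  assert(candidates != nullptr || count == 0);
  const Candidate* cpu_fallback = nullptr;
  for (std::size_t i = 0; i < count; ++i) {
    const Candidate& candidate = candidates[i];
    if (candidate.device == preferred) {
      return &candidate;
    }
    if (candidate.device == DeviceType::CPU && cpu_fallback == nullptr) {
      cpu_fallback = &candidate;
    }
  }
  return cpu_fallback;
}
```

Under this rule, a plugin EP whose factory supports NPU is chosen when NPU is preferred, and a CPU candidate is used when no NPU candidate is available.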
API reference
API header files:
- onnxruntime_ep_c_api.h
  - Defines interfaces implemented by plugin EP and EP factory instances.
  - Provides APIs utilized by plugin EP and EP factory instances.
- onnxruntime_c_api.h
  - Provides APIs used to traverse an input model graph.
Data Types
Type | Description |
---|---|
OrtHardwareDeviceType | Enumerates classes of hardware devices (e.g., CPU, GPU, NPU). |
OrtHardwareDevice | Opaque type that represents a physical hardware device. |
OrtExecutionProviderDevicePolicy | Enumerates the default EP selection policies available to users of ORT's automatic EP selection. |
OrtEpDevice | Opaque type that represents a pairing of an EP and hardware device that can run a model or model subgraph. |
OrtNodeFusionOptions | Struct that contains options for fusing nodes supported by an EP. |
OrtNodeComputeContext | Opaque type that contains a compiled/fused node's name and host memory allocation functions. ONNX Runtime provides an instance of OrtNodeComputeContext as an argument to OrtNodeComputeInfo::CreateState(). |
OrtNodeComputeInfo | Struct that contains the computation function for a compiled OrtGraph instance. Initialized by an OrtEp instance. |
OrtEpGraphSupportInfo | Opaque type that contains information on the nodes supported by an EP. An instance of OrtEpGraphSupportInfo is passed to OrtEp::GetCapability() and the EP populates the OrtEpGraphSupportInfo instance with information on the nodes that it supports. |
OrtEpDataLayout | Enumerates the operator data layouts that could be preferred by an EP. By default, ONNX models use a "channel-first" layout (e.g., NCHW) but some EPs may prefer a "channel-last" layout (e.g., NHWC). |
OrtMemoryDevice | Opaque type that represents a combination of a physical device and memory type. A memory allocation and allocator are associated with a specific OrtMemoryDevice, and this information is used to determine when data transfer is required. |
OrtDataTransferImpl | Struct of functions that an EP implements to copy data between the devices that the EP uses and CPU. |
OrtSyncNotificationImpl | Struct of functions that an EP implements for Stream notifications. |
OrtSyncStreamImpl | Struct of functions that an EP implements if it needs to support Streams. |
OrtEpFactory | A plugin EP library provides ORT with one or more instances of OrtEpFactory. An OrtEpFactory implements functions that are used by ORT to query device support, create allocators, create data transfer objects, and create instances of an EP (i.e., an OrtEp instance). An OrtEpFactory may support more than one hardware device (OrtHardwareDevice). If more than one hardware device is supported by the factory, an EP instance created by the factory is expected to internally partition any graph nodes assigned to the EP among its supported hardware devices. Alternatively, if an EP library author needs ONNX Runtime to partition the graph nodes among different hardware devices supported by the EP library, then the EP library must provide multiple OrtEpFactory instances. Each OrtEpFactory instance must support one hardware device and must create an EP instance with a unique name (e.g., MyEP_CPU, MyEP_GPU, MyEP_NPU). |
OrtEp | An instance of an EP that can execute model nodes on one or more hardware devices (OrtHardwareDevice). An OrtEp implements functions that are used by ORT to query graph node support, compile supported nodes, query preferred data layout, set run options, etc. An OrtEpFactory creates an OrtEp instance via the OrtEpFactory::CreateEp() function. |
OrtRunOptions | Opaque object containing options passed to the OrtApi::Run() function, which runs a model. |
OrtGraph | Opaque type that represents a graph. Provided to OrtEp instances in calls to OrtEp::GetCapability() and OrtEp::Compile(). |
OrtValueInfo | Opaque type that contains information for a value in a graph. A graph value can be a graph input, graph output, graph initializer, node input, or node output. An OrtValueInfo instance has the following information. |
OrtExternalInitializerInfo | Opaque type that contains information for an initializer stored in an external file. An OrtExternalInitializerInfo instance contains the file path, file offset, and byte size for the initializer. Can be obtained from an OrtValueInfo via the function ValueInfo_GetExternalInitializerInfo(). |
OrtTypeInfo | Opaque type that contains the element type and shape information for ONNX tensors, sequences, maps, sparse tensors, etc. |
OrtTensorTypeAndShapeInfo | Opaque type that contains the element type and shape information for an ONNX tensor. |
OrtNode | Opaque type that represents a node in a graph. |
OrtOpAttrType | Enumerates attribute types. |
OrtOpAttr | Opaque type that represents an ONNX operator attribute. |
Plugin EP Library Registration APIs
The following table lists the API functions used for registration of a plugin EP library.
Function | Description |
---|---|
RegisterExecutionProviderLibrary | Register an EP library with ORT. The library must export the CreateEpFactories and ReleaseEpFactory functions. |
UnregisterExecutionProviderLibrary | Unregister an EP library with ORT. Caller MUST ensure there are no OrtSession instances using the EPs created by the library before calling this function. |
GetEpDevices | Get the list of available OrtEpDevice instances. Each OrtEpDevice instance contains details of the execution provider and the device it will use. |
SessionOptionsAppendExecutionProvider_V2 | Append the execution provider that is responsible for the provided OrtEpDevice instances to the session options. |
SessionOptionsSetEpSelectionPolicy | Set the execution provider selection policy for the session. Allows users to specify a device selection policy for automatic EP selection. If custom selection is required, use SessionOptionsSetEpSelectionPolicyDelegate instead. |
SessionOptionsSetEpSelectionPolicyDelegate | Set the execution provider selection policy delegate for the session. Allows users to provide a custom device selection policy for automatic EP selection. |