Plugin Execution Provider Libraries

Background

An ONNX Runtime Execution Provider (EP) executes model operations on one or more hardware accelerators (e.g., a GPU or NPU). ONNX Runtime provides a variety of built-in EPs, such as the default CPU EP. For further extensibility, ONNX Runtime supports user-defined plugin EP libraries that an application can register with ONNX Runtime for use in an inference session.

This page provides a reference for the APIs necessary to develop and use plugin EP libraries with ONNX Runtime.

Creating a plugin EP library

A plugin EP is built as a dynamic/shared library that exports the functions CreateEpFactories() and ReleaseEpFactory(). ONNX Runtime calls CreateEpFactories() to obtain one or more instances of OrtEpFactory. An OrtEpFactory creates OrtEp instances and specifies the hardware devices supported by the EPs it creates.

The ONNX Runtime repository includes a sample plugin EP library, which is referenced in the following sections.

Defining an OrtEp

An OrtEp represents an instance of an EP that is used by an ONNX Runtime session to identify and execute the model operations supported by the EP.

The following table lists the required variables and functions that an implementer must define for an OrtEp.

| Field | Summary | Example implementation |
| --- | --- | --- |
| ort_version_supported | The ONNX Runtime version with which the EP was compiled. Implementations should set this to ORT_API_VERSION. | ExampleEp() |
| GetName | Get the execution provider name. | ExampleEp::GetNameImpl() |
| GetCapability | Get information about the nodes/subgraphs supported by the OrtEp instance. | ExampleEp::GetCapabilityImpl() |
| Compile | Compile OrtGraph instances assigned to the OrtEp. The implementation must set an OrtNodeComputeInfo instance for each OrtGraph in order to define its computation function. If the session is configured to generate a pre-compiled model, the execution provider must return `count` EPContext nodes. | ExampleEp::CompileImpl() |
| ReleaseNodeComputeInfos | Release OrtNodeComputeInfo instances. | ExampleEp::ReleaseNodeComputeInfosImpl() |
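
The `...Impl()` names in the example column suggest a common way to implement a C struct of function pointers from C++: derive a class from the struct, assign static member functions to the function-pointer fields in the constructor, and recover the derived object from the `this_ptr` argument. The sketch below illustrates only that pattern. `DemoEpInterface`, `DemoEp`, and `kDemoApiVersion` are hypothetical stand-ins, not the real `OrtEp` or `ORT_API_VERSION`; the actual fields and signatures are defined in onnxruntime_ep_c_api.h.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a C interface struct such as OrtEp.
// The real OrtEp has more fields and different signatures.
struct DemoEpInterface {
  uint32_t version_supported;
  const char* (*GetName)(const DemoEpInterface* this_ptr);
};

// Assumed placeholder for a version macro such as ORT_API_VERSION.
constexpr uint32_t kDemoApiVersion = 1;

// Deriving from the C struct lets a DemoEp* be handed to code that expects
// a DemoEpInterface*.
class DemoEp : public DemoEpInterface {
 public:
  explicit DemoEp(std::string name) : DemoEpInterface{}, name_(std::move(name)) {
    version_supported = kDemoApiVersion;  // Required variable.
    GetName = GetNameImpl;                // Required function.
  }

 private:
  // Static, so it can be stored in a plain function pointer.
  static const char* GetNameImpl(const DemoEpInterface* this_ptr) {
    assert(this_ptr != nullptr);
    // Valid because GetNameImpl is only ever installed on DemoEp instances.
    return static_cast<const DemoEp*>(this_ptr)->name_.c_str();
  }

  std::string name_;
};
```

A caller that only knows the C struct would invoke `iface->GetName(iface)`; the downcast inside `GetNameImpl` is what gives the function access to the C++ object's state.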

The following table lists the optional functions that an implementer may define for an OrtEp. If an optional OrtEp function is not defined, ONNX Runtime uses a default implementation.

| Field | Summary | Example implementation |
| --- | --- | --- |
| GetPreferredDataLayout | Get the EP's preferred data layout. If this function is not implemented, ORT assumes that the EP prefers the data layout OrtEpDataLayout::NCHW. | |
| ShouldConvertDataLayoutForOp | Given an op with domain `domain` and type `op_type`, determine whether an associated node's data layout should be converted to `target_data_layout`. If the EP prefers a non-default data layout, this function is called during layout transformation with `target_data_layout` set to the EP's preferred data layout. Implementation of this function is optional. If an EP prefers a non-default data layout, it may implement this function to customize data layout preferences at the granularity of individual ops. | |
| SetDynamicOptions | Set dynamic options on this EP. Dynamic options can be set by the application at any time after session creation with OrtApi::SetEpDynamicOptions(). Implementation of this function is optional. An EP should only implement this function if it needs to handle any dynamic options. | |
| OnRunStart | Called by ORT to notify the EP of the start of a run. Implementation of this function is optional. An EP should only implement this function if it needs to handle application-provided options at the start of a run. | |
| OnRunEnd | Called by ORT to notify the EP of the end of a run. Implementation of this function is optional. An EP should only implement this function if it needs to handle application-provided options at the end of a run. | |
| CreateAllocator | Create an OrtAllocator for the given OrtMemoryInfo for an OrtSession. The OrtMemoryInfo instance will match one of the values set in the OrtEpDevice using EpDevice_AddAllocatorInfo. Any allocator-specific options should be read from the session options. Implementation of this function is optional. If not provided, ORT will use `OrtEpFactory::CreateAllocator()`. | |
| CreateSyncStreamForDevice | Create a synchronization stream for the given memory device for an OrtSession. The stream is used to synchronize operations on the device during model execution. Any stream-specific options should be read from the session options. Implementation of this function is optional. If not provided, ORT will use `OrtEpFactory::CreateSyncStreamForDevice()`. | |
| GetCompiledModelCompatibilityInfo | Get a string with details about the EP stack used to produce a compiled model. The compatibility information string can be used with OrtEpFactory::ValidateCompiledModelCompatibilityInfo to determine if a compiled model is compatible with the EP. | |
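
To make the per-op layout decision described for ShouldConvertDataLayoutForOp more concrete, here is a minimal sketch of the decision logic alone. Everything in it is an assumption made for illustration: `DemoLayout` stands in for `OrtEpDataLayout`, the three-way `LayoutDecision` result is a hypothetical way to express "convert", "keep", or "leave it to the runtime", and the op lists are invented. The real function signature and the way it reports its decision are defined in onnxruntime_ep_c_api.h.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-ins; the real enum is OrtEpDataLayout.
enum class DemoLayout { NCHW, NHWC };
enum class LayoutDecision { kConvert, kKeep, kLetRuntimeDecide };

// Decide whether a node with the given domain and op type should be converted
// to `target`. This hypothetical EP prefers NHWC but only has NHWC kernels for
// a few ops in the default ONNX domain.
inline LayoutDecision ShouldConvertLayout(const std::string& domain,
                                          const std::string& op_type,
                                          DemoLayout target) {
  assert(!op_type.empty());
  const bool is_onnx_domain = domain.empty() || domain == "ai.onnx";
  if (target != DemoLayout::NHWC || !is_onnx_domain) {
    return LayoutDecision::kLetRuntimeDecide;
  }
  static const std::unordered_set<std::string> kNhwcOps = {
      "Conv", "MaxPool", "AveragePool"};
  if (kNhwcOps.count(op_type) > 0) {
    return LayoutDecision::kConvert;
  }
  // Assumption for illustration: this EP's Resize kernel only handles NCHW.
  if (op_type == "Resize") {
    return LayoutDecision::kKeep;
  }
  return LayoutDecision::kLetRuntimeDecide;
}
```

Keeping the decision in a small pure function like this makes it easy to unit test separately from the EP callback that wraps it.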

Defining an OrtEpFactory

An OrtEpFactory represents an instance of an EP factory that is used by an ONNX Runtime session to query device support, create allocators, create data transfer objects, and create instances of an EP (i.e., an OrtEp).

The following table lists the required variables and functions that an implementer must define for an OrtEpFactory.

| Field | Summary | Example implementation |
| --- | --- | --- |
| ort_version_supported | The ONNX Runtime version with which the EP was compiled. Implementations should set this to ORT_API_VERSION. | ExampleEpFactory() |
| GetName | Get the name of the EP that the factory creates. Must match OrtEp::GetName(). | ExampleEpFactory::GetNameImpl() |
| GetVendor | Get the name of the vendor that owns the EP that the factory creates. | ExampleEpFactory::GetVendor() |
| GetVendorId | Get the vendor ID of the vendor that owns the EP that the factory creates. This is typically the PCI vendor ID. | ExampleEpFactory::GetVendorId() |
| GetVersion | Get the version of the EP that the factory creates. The version string should adhere to the Semantic Versioning 2.0 specification. | ExampleEpFactory::GetVersionImpl() |
| GetSupportedDevices | Get information about the OrtHardwareDevice instances supported by an EP created by the factory. | ExampleEpFactory::GetSupportedDevicesImpl() |
| CreateEp | Creates an OrtEp instance for use in an ONNX Runtime session. ORT calls OrtEpFactory::ReleaseEp() to release the instance. | ExampleEpFactory::CreateEpImpl() |
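
Because GetVersion is expected to return a Semantic Versioning 2.0 string, an EP author may want to check the version string in a unit test. The following is a minimal sketch of such a check, assembled from the grammar in the Semantic Versioning specification. The helper names `IsSemVer2` and `SemVerMajor` are hypothetical and are not part of ONNX Runtime.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if `version` is a valid Semantic Versioning 2.0.0 string:
// MAJOR.MINOR.PATCH with optional "-prerelease" and "+build" suffixes.
inline bool IsSemVer2(const std::string& version) {
  // Numeric identifier without leading zeros.
  static const std::string num = R"((?:0|[1-9]\d*))";
  // Pre-release identifier: numeric, or alphanumeric with at least one non-digit.
  static const std::string pre_id =
      R"((?:0|[1-9]\d*|\d*[A-Za-z-][0-9A-Za-z-]*))";
  // Build metadata identifier.
  static const std::string build_id = R"([0-9A-Za-z-]+)";
  static const std::regex pattern(
      num + R"(\.)" + num + R"(\.)" + num +
      "(?:-" + pre_id + R"((?:\.)" + pre_id + ")*)?" +
      R"((?:\+)" + build_id + R"((?:\.)" + build_id + ")*)?");
  return std::regex_match(version, pattern);
}

// Returns the MAJOR component of a valid version string.
inline int SemVerMajor(const std::string& version) {
  assert(IsSemVer2(version));
  return std::stoi(version);  // std::stoi stops parsing at the first '.'.
}
```

For example, `IsSemVer2("1.0.0-rc.1+build.5")` accepts a pre-release with build metadata, while `"1.2"`, `"01.2.3"`, and `"v1.2.3"` are rejected.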

The following table lists the optional functions that an implementer may define for an OrtEpFactory.

| Field | Summary | Example implementation |
| --- | --- | --- |
| ValidateCompiledModelCompatibilityInfo | Validate the compatibility of a compiled model with the EP. This function validates whether a model produced with the supplied compatibility information string is supported by the underlying EP. The implementation should check if a compiled model is compatible with the EP and return the appropriate OrtCompiledModelCompatibility value. | |
| CreateAllocator | Create an OrtAllocator that can be shared across sessions for the given OrtMemoryInfo. The factory that creates the EP is responsible for providing the allocators required by the EP. The OrtMemoryInfo instance will match one of the values set in the OrtEpDevice using EpDevice_AddAllocatorInfo. | ExampleEpFactory::CreateAllocatorImpl() |
| ReleaseAllocator | Releases an OrtAllocator instance created by the factory. | ExampleEpFactory::ReleaseAllocatorImpl() |
| CreateDataTransfer | Creates an OrtDataTransferImpl instance for the factory. An OrtDataTransferImpl can be used to copy data between devices that the EP supports. | ExampleEpFactory::CreateDataTransferImpl() |
| IsStreamAware | Returns true if the EPs created by the factory are stream-aware. | ExampleEpFactory::IsStreamAwareImpl() |
| CreateSyncStreamForDevice | Creates a synchronization stream for the given OrtMemoryDevice. This is used to create a synchronization stream for the OrtMemoryDevice that can be used for operations outside of a session. | ExampleEpFactory::CreateSyncStreamForDeviceImpl() |
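
The pairing of OrtEp::GetCompiledModelCompatibilityInfo (which produces a string) and OrtEpFactory::ValidateCompiledModelCompatibilityInfo (which later interprets it) can be sketched as a round trip. The string format `<ep_name>;<major>.<minor>`, the compatibility rules, and all names below are assumptions chosen for illustration; `DemoCompatibility` is a hypothetical stand-in for OrtCompiledModelCompatibility, and a real EP defines its own format and policy.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for OrtCompiledModelCompatibility.
enum class DemoCompatibility {
  kNotApplicable,
  kSupportedOptimal,
  kSupportedPreferRecompilation,
  kUnsupported
};

// Hypothetical description of the EP stack that compiled a model.
struct CompilerStack {
  std::string ep_name;
  int major;
  int minor;
};

// Producer side: serialize the stack as "<ep_name>;<major>.<minor>".
inline std::string MakeCompatibilityInfo(const CompilerStack& stack) {
  assert(stack.ep_name.find(';') == std::string::npos);
  return stack.ep_name + ";" + std::to_string(stack.major) + "." +
         std::to_string(stack.minor);
}

// Parse the format written by MakeCompatibilityInfo. Returns false on failure.
inline bool ParseCompatibilityInfo(const std::string& info, CompilerStack& out) {
  const std::size_t sep = info.find(';');
  if (sep == std::string::npos) return false;
  out.ep_name = info.substr(0, sep);
  std::istringstream rest(info.substr(sep + 1));
  char dot = '\0';
  return (rest >> out.major >> dot >> out.minor) && dot == '.';
}

// Consumer side: compare the compiled model's stack with the current one.
inline DemoCompatibility ValidateCompatibility(const CompilerStack& current,
                                               const std::string& compiled_info) {
  CompilerStack compiled{};
  if (!ParseCompatibilityInfo(compiled_info, compiled) ||
      compiled.ep_name != current.ep_name) {
    return DemoCompatibility::kNotApplicable;  // Not produced by this EP.
  }
  if (compiled.major != current.major || compiled.minor > current.minor) {
    return DemoCompatibility::kUnsupported;
  }
  if (compiled.minor < current.minor) {
    return DemoCompatibility::kSupportedPreferRecompilation;
  }
  return DemoCompatibility::kSupportedOptimal;
}
```

Under these assumed rules, a model compiled by an older minor version still loads but is flagged for recompilation, while a model from a different major version or a newer minor version is rejected.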

Exporting functions to create and release factories

ONNX Runtime expects a plugin EP library to export certain functions/symbols. The following table lists the functions that have to be exported from the plugin EP library.

| Function | Description | Example implementation |
| --- | --- | --- |
| CreateEpFactories | ONNX Runtime calls this function to create OrtEpFactory instances. | ExampleEp: CreateEpFactories |
| ReleaseEpFactory | ONNX Runtime calls this function to release an OrtEpFactory instance. | ExampleEp: ReleaseEpFactory |
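
The ownership contract between the two exported functions can be sketched as follows: the create function fills a caller-provided array with factory pointers and reports how many it wrote, and the release function destroys one factory. This is a simplified illustration, not the real export signature. `DemoEpFactory`, `DemoCreateEpFactories`, `DemoReleaseEpFactory`, the integer return codes, and the use of the registration name as the EP name are all assumptions; the real functions use ONNX Runtime types and are declared in onnxruntime_ep_c_api.h.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for OrtEpFactory.
struct DemoEpFactory {
  std::string ep_name;
};

// C linkage keeps the symbol name unmangled so a loader can find it by name.
// Returns 0 on success and non-zero on failure (an assumption of this sketch).
extern "C" int DemoCreateEpFactories(const char* registration_name,
                                     DemoEpFactory** factories,
                                     std::size_t max_factories,
                                     std::size_t* num_factories) {
  assert(factories != nullptr && num_factories != nullptr);
  if (max_factories < 1) {
    return 1;  // Not enough room for the single factory this library creates.
  }
  auto factory = std::make_unique<DemoEpFactory>();
  factory->ep_name = registration_name != nullptr ? registration_name : "demo_ep";
  factories[0] = factory.release();  // The caller owns it until release is called.
  *num_factories = 1;
  return 0;
}

// Destroys a factory previously returned by DemoCreateEpFactories.
extern "C" int DemoReleaseEpFactory(DemoEpFactory* factory) {
  delete factory;
  return 0;
}
```

Allocating and freeing the factory inside the same library avoids mixing allocators across the library boundary, which is one reason a dedicated release function is exported.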

Using a plugin EP library

Plugin EP library registration

The sample application code below uses the API functions RegisterExecutionProviderLibrary and UnregisterExecutionProviderLibrary to register and unregister a plugin EP library.

const char* lib_registration_name = "ep_lib_name";
Ort::Env env;

// Register plugin EP library with ONNX Runtime.
env.RegisterExecutionProviderLibrary(
  lib_registration_name,   // Registration name can be anything the application chooses.
  ORT_TSTR("ep_path.dll")  // Path to the plugin EP library.
);

{
  Ort::Session session(env, /*...*/);
  // Run a model ...
}

// Unregister the library using the application-specified registration name.
// Must only unregister a library after all sessions that use the library have been released.
env.UnregisterExecutionProviderLibrary(lib_registration_name);
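
Since the library must only be unregistered after every session that uses it has been released, an application may prefer to tie registration to a scope. The sketch below is one possible RAII wrapper, not part of ONNX Runtime. `ScopedEpLibrary` is a hypothetical helper, templated on the environment type so that it only assumes the two member functions used in the sample above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Registers an EP library on construction and unregisters it on destruction.
// EnvT is assumed to provide RegisterExecutionProviderLibrary(name, path) and
// UnregisterExecutionProviderLibrary(name), as Ort::Env does in the sample above.
template <typename EnvT, typename PathT>
class ScopedEpLibrary {
 public:
  ScopedEpLibrary(EnvT& env, std::string registration_name, const PathT& library_path)
      : env_(env), name_(std::move(registration_name)) {
    assert(!name_.empty());
    env_.RegisterExecutionProviderLibrary(name_.c_str(), library_path);
  }

  ~ScopedEpLibrary() {
    try {
      env_.UnregisterExecutionProviderLibrary(name_.c_str());
    } catch (...) {
      // Destructors must not throw; a real application might log here.
    }
  }

  // Non-copyable: exactly one owner is responsible for unregistering.
  ScopedEpLibrary(const ScopedEpLibrary&) = delete;
  ScopedEpLibrary& operator=(const ScopedEpLibrary&) = delete;

 private:
  EnvT& env_;
  std::string name_;
};
```

C++ destroys local objects in reverse order of declaration, so declaring the `ScopedEpLibrary` before any session in the same scope means the sessions are released first and the library is unregistered afterwards.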

As shown in the following sequence diagram, registering a plugin EP library causes ONNX Runtime to load the library and call the library’s CreateEpFactories() function. During the call to CreateEpFactories(), ONNX Runtime determines the subset of hardware devices supported by each factory by calling OrtEpFactory::GetSupportedDevices() with all hardware devices that ONNX Runtime discovered during initialization.

The factory returns OrtEpDevice instances from OrtEpFactory::GetSupportedDevices(). Each OrtEpDevice instance pairs a factory with a hardware device that the factory supports. For example, if a single factory instance supports both CPU and NPU, then the call to OrtEpFactory::GetSupportedDevices() returns two OrtEpDevice instances:

  • ep_device_0: (factory_0, CPU)
  • ep_device_1: (factory_0, NPU)
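
The pairing in the example above can be modeled with a few plain structs. This is a conceptual sketch of the filtering step only; the `Pair*` types and `PairSupportedDevices` are hypothetical stand-ins for OrtHardwareDevice, OrtEpFactory, OrtEpDevice, and OrtEpFactory::GetSupportedDevices(), and they do not reflect the real signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the ONNX Runtime types named in the text.
enum class PairDeviceType { CPU, GPU, NPU };

struct PairHardwareDevice {
  PairDeviceType type;
};

struct PairFactory {
  std::string ep_name;
  std::vector<PairDeviceType> supported_types;
};

// One (factory, hardware device) pairing, analogous to an OrtEpDevice.
// Both pointers refer to objects owned by the caller.
struct PairEpDevice {
  const PairFactory* factory;
  const PairHardwareDevice* device;
};

// From all discovered devices, keep those the factory supports and pair each
// one with the factory.
inline std::vector<PairEpDevice> PairSupportedDevices(
    const PairFactory& factory,
    const std::vector<PairHardwareDevice>& discovered) {
  assert(!factory.ep_name.empty());
  std::vector<PairEpDevice> ep_devices;
  const auto& types = factory.supported_types;
  for (const PairHardwareDevice& device : discovered) {
    if (std::find(types.begin(), types.end(), device.type) != types.end()) {
      ep_devices.push_back(PairEpDevice{&factory, &device});
    }
  }
  return ep_devices;
}
```

With a factory that supports CPU and NPU and a discovered set of CPU, GPU, and NPU, the function returns two pairings, matching ep_device_0 and ep_device_1 in the list above.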


Sequence diagram showing registration and unregistration of a plugin EP library

Session creation with explicit OrtEpDevice(s)

The application code below uses the API function SessionOptionsAppendExecutionProvider_V2 to add an EP from a library to an ONNX Runtime session.

The application first calls GetEpDevices to get a list of OrtEpDevices available to the application. Each OrtEpDevice represents a hardware device supported by an OrtEpFactory. The SessionOptionsAppendExecutionProvider_V2 function takes an array of OrtEpDevice instances as input, where all OrtEpDevice instances refer to the same OrtEpFactory.

Ort::Env env;
env.RegisterExecutionProviderLibrary(/*...*/);

{
  std::vector<Ort::ConstEpDevice> ep_devices = env.GetEpDevices();

  // Find the Ort::EpDevice for "my_ep".
  std::array<Ort::ConstEpDevice, 1> selected_ep_devices = { nullptr };
  for (Ort::ConstEpDevice ep_device : ep_devices) {
    if (std::strcmp(ep_device.EpName(), "my_ep") == 0) {
      selected_ep_devices[0] = ep_device;
      break;
    }
  }

  if (selected_ep_devices[0] == nullptr) {
    // Did not find EP. Report application error ...
  }

  Ort::KeyValuePairs ep_options(/*...*/);  // Optional EP options.
  Ort::SessionOptions session_options;
  session_options.AppendExecutionProvider_V2(env, selected_ep_devices, ep_options);

  Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);

  // Run model ...
}

env.UnregisterExecutionProviderLibrary(/*...*/);

As shown in the following sequence diagram, ONNX Runtime calls OrtEpFactory::CreateEp() during session creation in order to create an instance of the plugin EP.


Sequence diagram showing session creation with explicit ep devices

Session creation with automatic EP selection

The application code below uses the API function SessionOptionsSetEpSelectionPolicy to have ONNX Runtime automatically select an EP based on the user’s policy (e.g., PREFER_NPU). If the plugin EP library registered with ONNX Runtime has a factory that supports NPU, then ONNX Runtime may select an EP from that factory to run the model.

Ort::Env env;
env.RegisterExecutionProviderLibrary(/*...*/);

{
  Ort::SessionOptions session_options;
  session_options.SetEpSelectionPolicy(OrtExecutionProviderDevicePolicy::PREFER_NPU);

  Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);

  // Run model ...
}

env.UnregisterExecutionProviderLibrary(/*...*/);


Sequence diagram showing session creation with automatic EP selection

API reference

API header files:

  • onnxruntime_ep_c_api.h
    • Defines interfaces implemented by plugin EP and EP factory instances.
    • Provides APIs utilized by plugin EP and EP factory instances.
  • onnxruntime_c_api.h
    • Provides APIs used to traverse an input model graph.

Data Types

| Type | Description |
| --- | --- |
| OrtHardwareDeviceType | Enumerates classes of hardware devices: OrtHardwareDeviceType_CPU, OrtHardwareDeviceType_GPU, OrtHardwareDeviceType_NPU. |
| OrtHardwareDevice | Opaque type that represents a physical hardware device. |
| OrtExecutionProviderDevicePolicy | Enumerates the default EP selection policies available to users of ORT's automatic EP selection. |
| OrtEpDevice | Opaque type that represents a pairing of an EP and hardware device that can run a model or model subgraph. |
| OrtNodeFusionOptions | Struct that contains options for fusing nodes supported by an EP. |
| OrtNodeComputeContext | Opaque type that contains a compiled/fused node's name and host memory allocation functions. ONNX Runtime provides an instance of OrtNodeComputeContext as an argument to OrtNodeComputeInfo::CreateState(). |
| OrtNodeComputeInfo | Struct that contains the computation function for a compiled OrtGraph instance. Initialized by an OrtEp instance. |
| OrtEpGraphSupportInfo | Opaque type that contains information on the nodes supported by an EP. An instance of OrtEpGraphSupportInfo is passed to OrtEp::GetCapability() and the EP populates the OrtEpGraphSupportInfo instance with information on the nodes that it supports. |
| OrtEpDataLayout | Enumerates the operator data layouts that could be preferred by an EP. By default, ONNX models use a "channel-first" layout (e.g., NCHW) but some EPs may prefer a "channel-last" layout (e.g., NHWC). |
| OrtMemoryDevice | Opaque type that represents a combination of a physical device and memory type. A memory allocation and allocator are associated with a specific OrtMemoryDevice, and this information is used to determine when data transfer is required. |
| OrtDataTransferImpl | Struct of functions that an EP implements to copy data between the devices that the EP uses and CPU. |
| OrtSyncNotificationImpl | Struct of functions that an EP implements for Stream notifications. |
| OrtSyncStreamImpl | Struct of functions that an EP implements if it needs to support Streams. |
| OrtEpFactory | A plugin EP library provides ORT with one or more instances of OrtEpFactory. An OrtEpFactory implements functions that are used by ORT to query device support, create allocators, create data transfer objects, and create instances of an EP (i.e., an OrtEp instance). An OrtEpFactory may support more than one hardware device (OrtHardwareDevice). If more than one hardware device is supported by the factory, an EP instance created by the factory is expected to internally partition any graph nodes assigned to the EP among its supported hardware devices. Alternatively, if an EP library author needs ONNX Runtime to partition the graph nodes among different hardware devices supported by the EP library, then the EP library must provide multiple OrtEpFactory instances. Each OrtEpFactory instance must support one hardware device and must create an EP instance with a unique name (e.g., MyEP_CPU, MyEP_GPU, MyEP_NPU). |
| OrtEp | An instance of an EP that can execute model nodes on one or more hardware devices (OrtHardwareDevice). An OrtEp implements functions that are used by ORT to query graph node support, compile supported nodes, query preferred data layout, set run options, etc. An OrtEpFactory creates an OrtEp instance via the OrtEpFactory::CreateEp() function. |
| OrtRunOptions | Opaque object containing options passed to the OrtApi::Run() function, which runs a model. |
| OrtGraph | Opaque type that represents a graph. Provided to OrtEp instances in calls to OrtEp::GetCapability() and OrtEp::Compile(). |
| OrtValueInfo | Opaque type that contains information for a value in a graph. A graph value can be a graph input, graph output, graph initializer, node input, or node output. An OrtValueInfo instance has the following information: type and shape (e.g., OrtTypeInfo); OrtNode consumers; OrtNode producer; information that classifies the value as a graph input, graph output, initializer, etc. |
| OrtExternalInitializerInfo | Opaque type that contains information for an initializer stored in an external file. An OrtExternalInitializerInfo instance contains the file path, file offset, and byte size for the initializer. Can be obtained from an OrtValueInfo via the function ValueInfo_GetExternalInitializerInfo(). |
| OrtTypeInfo | Opaque type that contains the element type and shape information for ONNX tensors, sequences, maps, sparse tensors, etc. |
| OrtTensorTypeAndShapeInfo | Opaque type that contains the element type and shape information for an ONNX tensor. |
| OrtNode | Opaque type that represents a node in a graph. |
| OrtOpAttrType | Enumerates attribute types. |
| OrtOpAttr | Opaque type that represents an ONNX operator attribute. |

Plugin EP Library Registration APIs

The following table lists the API functions used for registration of a plugin EP library.

| Function | Description |
| --- | --- |
| RegisterExecutionProviderLibrary | Register an EP library with ORT. The library must export the CreateEpFactories and ReleaseEpFactory functions. |
| UnregisterExecutionProviderLibrary | Unregister an EP library from ORT. The caller MUST ensure there are no OrtSession instances using the EPs created by the library before calling this function. |
| GetEpDevices | Get the list of available OrtEpDevice instances. Each OrtEpDevice instance contains details of the execution provider and the device it will use. |
| SessionOptionsAppendExecutionProvider_V2 | Append the execution provider that is responsible for the provided OrtEpDevice instances to the session options. |
| SessionOptionsSetEpSelectionPolicy | Set the execution provider selection policy for the session. Allows users to specify a device selection policy for automatic EP selection. If custom selection is required, use SessionOptionsSetEpSelectionPolicyDelegate instead. |
| SessionOptionsSetEpSelectionPolicyDelegate | Set the execution provider selection policy delegate for the session. Allows users to provide a custom device selection policy for automatic EP selection. |
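
As an illustration of what a custom selection policy might do, the sketch below picks devices from a candidate list in a fixed preference order (NPU first, then CPU) and writes them into a caller-provided output array. It is a conceptual model only. `SelEpDevice`, `SelDeviceType`, `PreferNpuThenCpu`, the parameter list, and the integer return code are all assumptions; the real delegate type passed to SessionOptionsSetEpSelectionPolicyDelegate is declared in onnxruntime_c_api.h and operates on OrtEpDevice pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>

// Hypothetical stand-ins for OrtHardwareDeviceType and OrtEpDevice.
enum class SelDeviceType { CPU, GPU, NPU };

struct SelEpDevice {
  std::string ep_name;
  SelDeviceType type;
};

// Writes up to `max_selected` candidates into `selected`, NPU devices first and
// CPU devices second, so that earlier entries have higher priority.
// Returns 0 if at least one device was selected, non-zero otherwise.
inline int PreferNpuThenCpu(const SelEpDevice* const* candidates,
                            std::size_t num_candidates,
                            const SelEpDevice** selected,
                            std::size_t max_selected,
                            std::size_t* num_selected) {
  assert(selected != nullptr && num_selected != nullptr);
  *num_selected = 0;
  for (SelDeviceType wanted : {SelDeviceType::NPU, SelDeviceType::CPU}) {
    for (std::size_t i = 0; i < num_candidates && *num_selected < max_selected; ++i) {
      if (candidates[i]->type == wanted) {
        selected[(*num_selected)++] = candidates[i];
      }
    }
  }
  return *num_selected > 0 ? 0 : 1;
}
```

Including a CPU device after the preferred NPU device gives the runtime a fallback for nodes that the preferred device's EP does not support, under the assumptions of this sketch.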