Contains functions that an OrtEp implements to specify the computation for an operator kernel.

#include <onnxruntime_ep_c_api.h>

Public Member Functions
OrtStatus *	Compute (OrtKernelImpl *this_ptr, OrtKernelContext *context)
	Computation function called to execute the kernel on an EP.

OrtStatus *	PrePackWeight (OrtKernelImpl *this_ptr, const OrtValue *tensor, int input_index, OrtAllocator *allocator, OrtSharedPrePackedWeightCache *prepacked_weight_cache, bool *is_packed)
	Optional function to pre-pack a constant tensor (i.e., a weight) to the kernel's preferred data layout.

OrtStatus *	SetSharedPrePackedWeight (OrtKernelImpl *this_ptr, const void *const *buffer_data_ptrs, const size_t *buffer_data_sizes, size_t num_buffers, int input_index)
	Optional function that receives data for a shared pre-packed weight from ORT.

Public Attributes
uint32_t	ort_version_supported
	Must be initialized to ORT_API_VERSION.

uint32_t	flags
	EP must initialize to 0. Used internally by ORT.

void(*	Release )(OrtKernelImpl *this_ptr)
	Called by ORT to release the OrtKernelImpl instance and its resources.

Detailed Description

Contains functions that an OrtEp implements to specify the computation for an operator kernel.

Since: Version 1.24.

Member Function Documentation

◆ Compute()

OrtStatus * OrtKernelImpl::Compute	(	OrtKernelImpl *	this_ptr,
		OrtKernelContext *	context
	)

Computation function called to execute the kernel on an EP.

Note: Implementation of this function is required.

Parameters

[in]	this_ptr	The OrtKernelImpl instance.
[in]	context	The OrtKernelContext instance that provides access to the inputs and outputs.

Returns: If no error, nullptr will be returned. If there is an error, a pointer to an OrtStatus that contains error details will be returned. Use OrtApi::ReleaseStatus to free this pointer.

Since: Version 1.24.
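
As an illustration of the contract above, the following is a minimal sketch of a kernel that derives from OrtKernelImpl and fills in the two required callbacks. It is not taken from the ORT sources. The declarations at the top are simplified stand-ins so that the snippet compiles on its own; a real EP would include <onnxruntime_ep_c_api.h> and use ORT_API_VERSION. The names MyKernel, ComputeImpl, ReleaseImpl, and kStandInApiVersion are hypothetical.

```cpp
#include <cstdint>

// Stand-in declarations (assumption): reduced copies of the ORT types so the
// sketch is self-contained. The member order is illustrative only.
struct OrtStatus;
struct OrtKernelContext;
struct OrtKernelImpl {
  uint32_t ort_version_supported;
  uint32_t flags;
  OrtStatus* (*Compute)(OrtKernelImpl* this_ptr, OrtKernelContext* context);
  void (*Release)(OrtKernelImpl* this_ptr);
};
constexpr uint32_t kStandInApiVersion = 24;  // placeholder for ORT_API_VERSION

// Hypothetical kernel. Deriving from OrtKernelImpl lets the callbacks cast
// the this_ptr they receive back to the concrete kernel type.
struct MyKernel : OrtKernelImpl {
  int compute_calls = 0;  // illustrative state only

  MyKernel() : OrtKernelImpl{} {
    ort_version_supported = kStandInApiVersion;
    flags = 0;  // EP must initialize to 0
    Compute = ComputeImpl;
    Release = ReleaseImpl;
  }

  static OrtStatus* ComputeImpl(OrtKernelImpl* this_ptr, OrtKernelContext* /*context*/) {
    auto* self = static_cast<MyKernel*>(this_ptr);
    ++self->compute_calls;  // a real kernel reads inputs and writes outputs via context
    return nullptr;         // nullptr signals success
  }

  static void ReleaseImpl(OrtKernelImpl* this_ptr) {
    delete static_cast<MyKernel*>(this_ptr);  // frees the instance and its resources
  }
};
```

In this sketch the instance is created with new and destroyed only through Release, which mirrors the rule that ORT, not the EP, decides when the kernel is released.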

◆ PrePackWeight()

OrtStatus * OrtKernelImpl::PrePackWeight	(	OrtKernelImpl *	this_ptr,
		const OrtValue *	tensor,
		int	input_index,
		OrtAllocator *	allocator,
		OrtSharedPrePackedWeightCache *	prepacked_weight_cache,
		bool *	is_packed
	)

Optional function to pre-pack a constant tensor (i.e., a weight) to the kernel's preferred data layout.

For example, a Conv kernel can define this function to pack input W to the channel-last data layout before inference.

Pre-packing can operate in three different modes: no pre-packing mode, sharing mode, and non-sharing mode.

1) No pre-packing mode: The kernel can forgo any weight pre-packing for the given input_index by setting is_packed to false and returning a successful OrtStatus. In this mode, the kernel's OrtKernelImpl::SetSharedPrePackedWeight() function is not called for that specific input_index.

2) Sharing mode: Sharing is allowed if the prepacked_weight_cache argument is not NULL and the EP stores weight data in CPU-accessible memory. In this case, the kernel can optionally choose to share the packed weight with other kernels that use the same weight (compared by content hash). To do so, the kernel must allocate the packed weight with the provided allocator, store the packed weight data into prepacked_weight_cache via SharedPrePackedWeightCache_StoreWeightData(), set is_packed to true, and return a successful OrtStatus. ORT will subsequently call OrtKernelImpl::SetSharedPrePackedWeight() to provide this kernel with the actual shared weight data, whose memory location could differ (e.g., if the shared data was allocated by a previously processed kernel).

3) Non-sharing mode: In non-sharing mode, the prepacked_weight_cache argument is ignored. The implementation allocates the packed data with the provided allocator, sets is_packed to true, and returns a successful OrtStatus. The kernel is ultimately responsible for releasing the packed data for the weight with allocator. ORT may release the original (unpacked) weight, which must not be accessed in OrtKernelImpl::Compute(). Note that in this mode, the kernel's OrtKernelImpl::SetSharedPrePackedWeight() function is not called by ORT for that specific input_index.

Note: This function is based on the internal OpKernel::PrePack() virtual function used within ORT.

Parameters

[in]	this_ptr	The OrtKernelImpl instance.
[in]	tensor	The OrtValue instance representing the constant tensor (weight). Do not cache in the kernel.
[in]	input_index	The input index of the tensor in this kernel.
[in]	allocator	Allocator for allocating the pre-packed data. Its use is required in sharing mode and recommended, but not required, in the non-sharing mode. This will be an allocator set by the application for the session/environment (e.g., via CreateAndRegisterAllocator[V2] or RegisterAllocator), or an allocator on the OrtEpDevice (read-only or default) otherwise. The allocator remains valid throughout the lifetime of the OrtKernelImpl instance.
[in]	prepacked_weight_cache	May be NULL. If not NULL, the kernel may choose to share a packed weight by first storing it in the OrtSharedPrePackedWeightCache instance and then receiving the actual shared weight data in the call to OrtKernelImpl::SetSharedPrePackedWeight(). See the above description for "sharing mode".
[out]	is_packed	Output parameter that the implementation sets to true if the kernel packed the tensor data.

Returns: If no error, nullptr will be returned. If there is an error, a pointer to an OrtStatus that contains error details will be returned. Use OrtApi::ReleaseStatus to free this pointer.

Note: Implementation of this function is optional. If not implemented (set to NULL), ORT assumes the kernel does not pre-pack weight data (i.e., is_packed defaults to false).

Since: Version 1.24.

◆ SetSharedPrePackedWeight()

OrtStatus * OrtKernelImpl::SetSharedPrePackedWeight	(	OrtKernelImpl *	this_ptr,
		const void *const *	buffer_data_ptrs,
		const size_t *	buffer_data_sizes,
		size_t	num_buffers,
		int	input_index
	)

Optional function that receives data for a shared pre-packed weight from ORT.

ORT calls this function after calling OrtKernelImpl::PrePackWeight for a specific input_index if:

OrtKernelImpl::PrePackWeight set the output parameter is_packed to true.
OrtKernelImpl::PrePackWeight stored weight data to share into the provided OrtSharedPrePackedWeightCache parameter (prepacked_weight_cache) via the API SharedPrePackedWeightCache_StoreWeightData.

Refer to the description of "sharing mode" in the documentation for OrtKernelImpl::PrePackWeight().

Note: ORT will not call this function for an input_index that a previous call to OrtKernelImpl::PrePackWeight() did not elect to pre-pack and share.

Note: This function is based on the internal OpKernel::UseSharedPrePackedBuffers() virtual function used within ORT.

Parameters

[in]	this_ptr	The OrtKernelImpl instance.
[in]	buffer_data_ptrs	An array of buffer data pointers that collectively hold the pre-packed data for a single shared weight. The buffers are provided in the same order and with the same contents (in a potentially different memory location) as the buffers passed into SharedPrePackedWeightCache_StoreWeightData() within the OrtKernelImpl::PrePackWeight() call for the same input_index.
[in]	buffer_data_sizes	An array of buffer byte sizes, one per element in buffer_data_ptrs.
[in]	num_buffers	The number of buffers used to store the data for the shared pre-packed weight. Specifies the number of elements in the buffer_data_ptrs and buffer_data_sizes arrays.
[in]	input_index	The input index of the tensor in this kernel. This index identifies the weight.

Returns: If no error, nullptr will be returned. If there is an error, a pointer to an OrtStatus that contains error details will be returned. Use OrtApi::ReleaseStatus to free this pointer.

Note: Implementation of this function is generally optional. It is only required if OrtKernelImpl::PrePackWeight() elects to share pre-packed weights.

Since: Version 1.24.
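
A minimal sketch of the receiving side follows, assuming the kernel stored exactly one buffer for its weight during PrePackWeight(). The types are stand-ins so the snippet compiles on its own. MakeStatusStandIn is a hypothetical substitute for creating an error with OrtApi::CreateStatus, and SharedWeightKernel and kWeightInputIndex are invented names. The sketch only records the pointer and size, and neither copies nor frees the data, which assumes ORT keeps the shared buffers alive while the kernel uses them.

```cpp
#include <cstddef>

// Stand-ins (assumption): OrtStatus is opaque in ORT, and a real EP would
// create errors with OrtApi::CreateStatus rather than this helper.
struct OrtStatus { const char* message; };
inline OrtStatus* MakeStatusStandIn(const char* msg) { return new OrtStatus{msg}; }
struct OrtKernelImpl {};

constexpr int kWeightInputIndex = 1;  // hypothetical: input 1 is the shared weight

struct SharedWeightKernel : OrtKernelImpl {
  const void* shared_w = nullptr;  // not owned by the kernel in this sketch
  size_t shared_w_bytes = 0;
};

OrtStatus* SetSharedPrePackedWeightImpl(OrtKernelImpl* this_ptr,
                                        const void* const* buffer_data_ptrs,
                                        const size_t* buffer_data_sizes,
                                        size_t num_buffers, int input_index) {
  auto* self = static_cast<SharedWeightKernel*>(this_ptr);

  // The buffers arrive in the order they were stored, so the kernel can
  // check them against the layout it produced in PrePackWeight().
  if (input_index != kWeightInputIndex || num_buffers != 1) {
    return MakeStatusStandIn("unexpected shared pre-packed weight layout");
  }

  // The address may differ from the buffer this kernel packed itself,
  // so Compute() should use these pointers from now on.
  self->shared_w = buffer_data_ptrs[0];
  self->shared_w_bytes = buffer_data_sizes[0];
  return nullptr;
}
```

Validating num_buffers and input_index before touching the arrays is a design choice of this sketch; it turns a mismatch between the packing and receiving code into an error status instead of an out-of-bounds read.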

Member Data Documentation

◆ flags

uint32_t OrtKernelImpl::flags

EP must initialize to 0. Used internally by ORT.

◆ ort_version_supported

uint32_t OrtKernelImpl::ort_version_supported

Must be initialized to ORT_API_VERSION.

◆ Release

void( * OrtKernelImpl::Release) (OrtKernelImpl *this_ptr)

Called by ORT to release the OrtKernelImpl instance and its resources.

Note: Implementation of this function is required.

Parameters

[in] this_ptr The OrtKernelImpl instance.

Since: Version 1.24.
