ONNX Runtime
OrtKernelImpl Struct Reference

Contains functions that an OrtEp implements to specify the computation for an operator kernel.

#include <onnxruntime_ep_c_api.h>

Public Member Functions

OrtStatus * Compute (OrtKernelImpl *this_ptr, OrtKernelContext *context)
 Computation function called to execute the kernel on an EP.
 
OrtStatus * PrePackWeight (OrtKernelImpl *this_ptr, const OrtValue *tensor, int input_index, OrtAllocator *allocator, OrtSharedPrePackedWeightCache *prepacked_weight_cache, bool *is_packed)
 Optional function to pre-pack a constant tensor (i.e., a weight) to the kernel's preferred data layout.
 
OrtStatus * SetSharedPrePackedWeight (OrtKernelImpl *this_ptr, const void *const *buffer_data_ptrs, const size_t *buffer_data_sizes, size_t num_buffers, int input_index)
 Optional function that receives data for a shared pre-packed weight from ORT.
 

Public Attributes

uint32_t ort_version_supported
 Must be initialized to ORT_API_VERSION.
 
uint32_t flags
 EP must initialize to 0. Used internally by ORT.
 
void(* Release )(OrtKernelImpl *this_ptr)
 Called by ORT to release the OrtKernelImpl instance and its resources.
 

Detailed Description

Contains functions that an OrtEp implements to specify the computation for an operator kernel.

Since
Version 1.24.

Member Function Documentation

◆ Compute()

OrtStatus * OrtKernelImpl::Compute ( OrtKernelImpl *  this_ptr,
OrtKernelContext *  context
)

Computation function called to execute the kernel on an EP.

Note
Implementation of this function is required.
Parameters
[in] this_ptr: The OrtKernelImpl instance.
[in] context: The OrtKernelContext instance that provides access to the inputs and outputs.

Returns
If no error, nullptr will be returned. If there is an error, a pointer to an OrtStatus that contains error details will be returned. Use OrtApi::ReleaseStatus to free this pointer.

Since
Version 1.24.
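
The following is a minimal sketch of the callback shape described above: the kernel recovers its own state from this_ptr, does its work through the context, and returns nullptr on success or a status object on failure. To compile without the ORT headers it uses stand-in types; DemoStatus, DemoKernelContext, DemoKernelImpl, and ScaleKernel are hypothetical names and are not the definitions from onnxruntime_ep_c_api.h. A real kernel would read inputs and write outputs through the OrtKernelContext accessors instead of the vectors used here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins so the sketch compiles on its own. These are NOT the real ORT types.
struct DemoStatus { const char* message; };
struct DemoKernelContext {
  std::vector<float> input;   // plays the role of the kernel's input tensor
  std::vector<float> output;  // plays the role of the kernel's output tensor
};
struct DemoKernelImpl {
  DemoStatus* (*Compute)(DemoKernelImpl* this_ptr, DemoKernelContext* context);
};

// A kernel that extends the function table with its own state.
struct ScaleKernel : DemoKernelImpl {
  float scale;

  explicit ScaleKernel(float s) : scale(s) {
    Compute = ComputeImpl;  // required callback, so always set it
  }

  static DemoStatus* ComputeImpl(DemoKernelImpl* this_ptr, DemoKernelContext* context) {
    assert(this_ptr != nullptr);
    // Recover the concrete kernel from the function-table pointer.
    auto* self = static_cast<ScaleKernel*>(this_ptr);
    if (context == nullptr) {
      // Error path: return a status object that the caller must release.
      return new DemoStatus{"context must not be null"};
    }
    context->output.resize(context->input.size());
    for (std::size_t i = 0; i < context->input.size(); ++i) {
      context->output[i] = context->input[i] * self->scale;
    }
    return nullptr;  // nullptr signals success
  }
};
```

The static member function keeps a C-compatible signature, so it can be stored in a plain function pointer while the derived struct carries per-kernel state.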

◆ PrePackWeight()

OrtStatus * OrtKernelImpl::PrePackWeight ( OrtKernelImpl *  this_ptr,
const OrtValue *  tensor,
int  input_index,
OrtAllocator *  allocator,
OrtSharedPrePackedWeightCache *  prepacked_weight_cache,
bool *  is_packed 
)

Optional function to pre-pack a constant tensor (i.e., a weight) to the kernel's preferred data layout.

For example, a Conv kernel can define this function to pack input W to the channel-last data layout before inference.

Pre-packing can operate in three different modes: no pre-packing mode, sharing mode, and non-sharing mode.

1) No pre-packing mode: The kernel can forgo any weight pre-packing for the given input_index by setting is_packed to false and returning a successful OrtStatus. In this mode, the kernel's OrtKernelImpl::SetSharedPrePackedWeight() function is not called for that specific input_index.

2) Sharing mode: Sharing is allowed if the prepacked_weight_cache argument is not NULL and the EP stores weight data in CPU-accessible memory. In this case, the kernel can optionally choose to share the packed weight with other kernels that use the same weight (compared by content hash). To do so, the kernel must allocate the packed weight with the provided allocator, then it stores the packed weight data into prepacked_weight_cache via SharedPrePackedWeightCache_StoreWeightData(), sets is_packed to true, and returns a successful OrtStatus. ORT will subsequently call OrtKernelImpl::SetSharedPrePackedWeight() to provide this kernel with the actual shared weight data, whose memory location could differ (i.e., if shared data was allocated by a previously processed kernel).

3) Non-sharing mode: In non-sharing mode, the prepacked_weight_cache argument is ignored. In this mode, the implementation allocates the packed data with the provided allocator, sets is_packed to true, and returns a successful OrtStatus. The kernel is ultimately responsible for releasing the packed data for the weight with allocator. ORT may release the original (unpacked) weight, which must not be accessed in OrtKernelImpl::Compute(). Note that in this mode, the kernel's OrtKernelImpl::SetSharedPrePackedWeight() function is not called by ORT for that specific input_index.

Note
This function is based on the internal OpKernel::PrePack() virtual function used within ORT.
Parameters
[in] this_ptr: The OrtKernelImpl instance.
[in] tensor: The OrtValue instance representing the constant tensor (weight). Do not cache in the kernel.
[in] input_index: The input index of the tensor in this kernel.
[in] allocator: Allocator for allocating the pre-packed data. Its use is required in sharing mode and recommended, but not required, in the non-sharing mode. This will be an allocator set by the application for the session/environment (e.g., via CreateAndRegisterAllocator[V2] or RegisterAllocator), or an allocator on the OrtEpDevice (read-only or default) otherwise. The allocator remains valid throughout the lifetime of the OrtKernelImpl instance.
[in] prepacked_weight_cache: May be NULL. If not NULL, the kernel may choose to share a packed weight by first storing it in the OrtSharedPrePackedWeightCache instance and then receiving the actual shared weight data in the call to OrtKernelImpl::SetSharedPrePackedWeight(). See the above description for "sharing mode".
[out] is_packed: Output parameter that the implementation sets to true if the kernel packed the tensor data.

Returns
If no error, nullptr will be returned. If there is an error, a pointer to an OrtStatus that contains error details will be returned. Use OrtApi::ReleaseStatus to free this pointer.

Note
Implementation of this function is optional. If not implemented (set to NULL), ORT assumes the kernel does not pre-pack weight data (i.e., is_packed defaults to false).
Since
Version 1.24.
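
The three modes can be sketched as a single decision inside the callback. This is an illustration under stated assumptions, not the ORT implementation: DemoAllocator, DemoSharedCache, DemoTensor, DemoKernel, and ReleasePacked are hypothetical stand-ins, the transpose stands in for whatever layout the kernel prefers, and writing to DemoSharedCache stands in for the call to SharedPrePackedWeightCache_StoreWeightData(). The sketch also assumes the cache takes ownership of a stored buffer; consult the documentation of that function for the actual contract.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Stand-ins so the sketch compiles on its own. These are NOT the real ORT types.
struct DemoStatus { const char* message; };
struct DemoAllocator {
  int live = 0;  // number of outstanding allocations, for demonstration only
  void* Alloc(std::size_t n) { void* p = std::malloc(n); if (p) ++live; return p; }
  void Free(void* p) { if (p) { --live; std::free(p); } }
};
struct DemoSharedCache { void* data = nullptr; std::size_t size = 0; };
struct DemoTensor { const float* data; std::size_t rows; std::size_t cols; };

struct DemoKernel {
  int weight_input_index = 1;          // the only input this kernel pre-packs
  bool want_sharing = true;            // kernel's own policy choice
  DemoAllocator* allocator = nullptr;  // remembered for release in non-sharing mode
  float* packed = nullptr;             // owned by the kernel in non-sharing mode only
};

DemoStatus* PrePackWeight(DemoKernel* self, const DemoTensor* tensor, int input_index,
                          DemoAllocator* allocator, DemoSharedCache* cache, bool* is_packed) {
  assert(is_packed != nullptr);
  *is_packed = false;

  // Mode 1 (no pre-packing): leave is_packed false and report success.
  if (input_index != self->weight_input_index) return nullptr;

  const std::size_t count = tensor->rows * tensor->cols;
  auto* packed = static_cast<float*>(allocator->Alloc(count * sizeof(float)));
  if (packed == nullptr) return new DemoStatus{"allocation failed"};

  // Example layout change: store the transpose of the row-major weight.
  for (std::size_t r = 0; r < tensor->rows; ++r)
    for (std::size_t c = 0; c < tensor->cols; ++c)
      packed[c * tensor->rows + r] = tensor->data[r * tensor->cols + c];

  if (cache != nullptr && self->want_sharing) {
    // Mode 2 (sharing): hand the buffer to the cache and wait for the
    // shared data to arrive through SetSharedPrePackedWeight.
    cache->data = packed;
    cache->size = count * sizeof(float);
  } else {
    // Mode 3 (non-sharing): the kernel keeps the buffer and must free it later.
    self->packed = packed;
    self->allocator = allocator;
  }
  *is_packed = true;
  return nullptr;
}

// Non-sharing mode cleanup: release with the same allocator that produced the buffer.
void ReleasePacked(DemoKernel* self) {
  if (self->packed != nullptr) {
    self->allocator->Free(self->packed);
    self->packed = nullptr;
  }
}
```

Note that the sketch never keeps a pointer to the original tensor, matching the "Do not cache in the kernel" requirement on the tensor parameter.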

◆ SetSharedPrePackedWeight()

OrtStatus * OrtKernelImpl::SetSharedPrePackedWeight ( OrtKernelImpl *  this_ptr,
const void *const *  buffer_data_ptrs,
const size_t *  buffer_data_sizes,
size_t  num_buffers,
int  input_index 
)

Optional function that receives data for a shared pre-packed weight from ORT.

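ORT calls this function after calling OrtKernelImpl::PrePackWeight() for a specific input_index if:

- OrtKernelImpl::PrePackWeight() set the output parameter is_packed to true.
- OrtKernelImpl::PrePackWeight() stored the weight data to share into the provided prepacked_weight_cache via SharedPrePackedWeightCache_StoreWeightData().

Refer to the description of "sharing mode" in the documentation for OrtKernelImpl::PrePackWeight().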

Note
ORT will not call this function for an input_index that a previous call to OrtKernelImpl::PrePackWeight() did not elect to pre-pack and share.
This function is based on the internal OpKernel::UseSharedPrePackedBuffers() virtual function used within ORT.
Parameters
[in] this_ptr: The OrtKernelImpl instance.
[in] buffer_data_ptrs: An array of buffer data pointers that collectively hold the pre-packed data for a single shared weight. The buffers are provided in the same order and with the same contents (in a potentially different memory location) as the buffers passed into SharedPrePackedWeightCache_StoreWeightData() within the OrtKernelImpl::PrePackWeight() call for the same input_index.
[in] buffer_data_sizes: An array of buffer byte sizes, one per element in buffer_data_ptrs.
[in] num_buffers: The number of buffers used to store the data for the shared pre-packed weight. Specifies the number of elements in the buffer_data_ptrs and buffer_data_sizes arrays.
[in] input_index: The input index of the tensor in this kernel. This index identifies the weight.

Returns
If no error, nullptr will be returned. If there is an error, a pointer to an OrtStatus that contains error details will be returned. Use OrtApi::ReleaseStatus to free this pointer.

Note
Implementation of this function is generally optional. It is only required if OrtKernelImpl::PrePackWeight() elects to share pre-packed weights.
Since
Version 1.24.
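
A receiving callback might look like the sketch below. It assumes the kernel stored exactly one buffer of floats for its weight during pre-packing, so it validates num_buffers and input_index before recording the pointer. DemoStatus and DemoKernel are hypothetical stand-ins, not ORT types, and the sketch treats the received buffer as borrowed memory that the kernel does not free.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins so the sketch compiles on its own. These are NOT the real ORT types.
struct DemoStatus { const char* message; };

struct DemoKernel {
  int weight_input_index = 1;            // input this kernel chose to share
  const float* shared_weight = nullptr;  // borrowed pointer, not owned by the kernel
  std::size_t shared_weight_count = 0;   // number of float elements in the buffer
};

DemoStatus* SetSharedPrePackedWeight(DemoKernel* self,
                                     const void* const* buffer_data_ptrs,
                                     const std::size_t* buffer_data_sizes,
                                     std::size_t num_buffers,
                                     int input_index) {
  assert(self != nullptr);
  if (input_index != self->weight_input_index) {
    return new DemoStatus{"unexpected input_index"};
  }
  // This demo kernel stored a single buffer, so it expects a single buffer back.
  if (buffer_data_ptrs == nullptr || buffer_data_sizes == nullptr || num_buffers != 1) {
    return new DemoStatus{"expected exactly one shared buffer"};
  }
  // The contents match what was stored, but the address may differ, so always
  // use the pointer received here rather than one kept from pre-packing.
  self->shared_weight = static_cast<const float*>(buffer_data_ptrs[0]);
  self->shared_weight_count = buffer_data_sizes[0] / sizeof(float);
  return nullptr;
}
```

A kernel that splits its packed weight across several buffers would index buffer_data_ptrs and buffer_data_sizes in the same order it used when storing them.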

Member Data Documentation

◆ flags

uint32_t OrtKernelImpl::flags

EP must initialize to 0. Used internally by ORT.

◆ ort_version_supported

uint32_t OrtKernelImpl::ort_version_supported

Must be initialized to ORT_API_VERSION.

◆ Release

void( * OrtKernelImpl::Release) (OrtKernelImpl *this_ptr)

Called by ORT to release the OrtKernelImpl instance and its resources.

Note
Implementation of this function is required.
Parameters
[in] this_ptr: The OrtKernelImpl instance.
Since
Version 1.24.
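
The member data requirements above can be sketched together: set the version field, zero the flags, and install a Release callback that destroys the concrete kernel. DemoKernelImpl mirrors only the three documented data members and is not the real struct; kDemoApiVersion is a placeholder for ORT_API_VERSION, and MyKernel, CreateKernel, and the release counter are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder for ORT_API_VERSION; the real value comes from the ORT headers.
constexpr std::uint32_t kDemoApiVersion = 24;

// Stand-in that mirrors the documented data members. NOT the real OrtKernelImpl.
struct DemoKernelImpl {
  std::uint32_t ort_version_supported;
  std::uint32_t flags;
  void (*Release)(DemoKernelImpl* this_ptr);
};

struct MyKernel : DemoKernelImpl {
  int* release_counter;  // lets a caller observe that Release ran

  explicit MyKernel(int* counter) : release_counter(counter) {
    ort_version_supported = kDemoApiVersion;  // must match the API version built against
    flags = 0;                                // reserved for internal use, so start at 0
    Release = ReleaseImpl;                    // required callback
  }

  static void ReleaseImpl(DemoKernelImpl* this_ptr) {
    assert(this_ptr != nullptr);
    auto* self = static_cast<MyKernel*>(this_ptr);
    ++*self->release_counter;  // record the call before the object goes away
    delete self;               // frees the kernel and anything it owns
  }
};

// Factory that hands ownership to the caller through the base pointer.
DemoKernelImpl* CreateKernel(int* counter) { return new MyKernel(counter); }
```

Because the caller only sees the base pointer, Release is the single place where the concrete type is known and the instance can be destroyed safely.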