ONNX Runtime + Windows Dev Kit 2023 = NPU powered AI

Delivering NPU powered AI capabilities in your apps

Windows Dev Kit 2023, also known as Project Volterra, lets developers build apps that use the NPU hardware to accelerate AI/ML workloads, delivering AI-enhanced features and experiences without compromising app performance. You can get started now: the open source, cross-platform ONNX Runtime inference engine gives you access to the NPU and makes it easy to run AI/ML models from popular machine learning frameworks such as PyTorch and TensorFlow.

Get started on your Windows Dev Kit 2023 today

Follow these steps to set up your device to use ONNX Runtime (ORT) with the built-in NPU:

1. Download the Qualcomm AI Engine Direct SDK (QNN SDK).
2. Download and install the ONNX Runtime with QNN package.
3. Start using the ONNX Runtime API in your application.
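
As a rough illustration of the last step, the Python sketch below creates an ONNX Runtime session that prefers the NPU and falls back to the CPU. It assumes an ONNX Runtime build that includes the QNN execution provider and that the QNN HTP backend library is named QnnHtp.dll and can be found on the device; the helper names choose_providers and create_session are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List, Sequence, Tuple, Union

QNN_EP = "QNNExecutionProvider"
CPU_EP = "CPUExecutionProvider"

Provider = Union[str, Tuple[str, Dict[str, Any]]]


def choose_providers(
    available: Sequence[str],
    backend_path: str = "QnnHtp.dll",  # assumed name/location of the HTP (NPU) backend
) -> List[Provider]:
    """Build a provider list: QNN first when present, CPU always last as fallback."""
    providers: List[Provider] = []
    if QNN_EP in available:
        providers.append((QNN_EP, {"backend_path": backend_path}))
    providers.append(CPU_EP)
    return providers


def create_session(model_path: str):
    """Create an inference session, preferring the NPU when the QNN provider exists."""
    # Imported lazily so the helper above can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

With a model file in place (the name model.qdq.onnx is only a placeholder), create_session("model.qdq.onnx") returns a session whose run method takes a dictionary of input names to arrays. Listing the CPU provider last means any operator the NPU cannot handle still has somewhere to run.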

Optimizing models for the NPU

ONNX is a standard format for representing ML models authored in frameworks like PyTorch, TensorFlow, and others. ONNX Runtime can run any ONNX model; however, to make use of the NPU, you currently need to quantize the ONNX model to a QDQ (quantize-dequantize) model.
See our C# tutorial for an example of how this is done.
Many models can be optimized for the NPU using this process. Even if a model cannot be optimized for the NPU, it can still be run by ONNX Runtime on the CPU.
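
For readers working in Python rather than C#, the quantization step might look roughly like the sketch below, which uses the static quantization tooling in onnxruntime.quantization to write a QDQ model. The uint8 activation and weight types, the ListCalibrationReader class, and the quantize_to_qdq helper are assumptions made for this example; a real model may need different types or additional preprocessing, and the calibration samples must be representative inputs keyed by the model's input names.

```python
from typing import Any, Dict, Iterable, Optional

try:
    from onnxruntime.quantization import CalibrationDataReader
except ImportError:
    # Fallback so the reader class can be inspected without onnxruntime installed.
    CalibrationDataReader = object


class ListCalibrationReader(CalibrationDataReader):
    """Feeds pre-built calibration samples ({input_name: array}) one at a time."""

    def __init__(self, samples: Iterable[Dict[str, Any]]):
        self._iter = iter(samples)

    def get_next(self) -> Optional[Dict[str, Any]]:
        # Returning None tells the calibrator that the data is exhausted.
        return next(self._iter, None)


def quantize_to_qdq(
    fp32_model: str, qdq_model: str, samples: Iterable[Dict[str, Any]]
) -> None:
    """Statically quantize an ONNX model and save it in QDQ format."""
    from onnxruntime.quantization import QuantFormat, QuantType, quantize_static

    quantize_static(
        fp32_model,
        qdq_model,
        ListCalibrationReader(samples),
        quant_format=QuantFormat.QDQ,      # insert QuantizeLinear/DequantizeLinear pairs
        activation_type=QuantType.QUInt8,  # assumed: 8-bit unsigned activations
        weight_type=QuantType.QUInt8,      # assumed: 8-bit unsigned weights
    )
```

Calling quantize_to_qdq("model.onnx", "model.qdq.onnx", samples) with placeholder file names would produce the quantized model that the session then loads. Static quantization is used here because it fixes the quantization ranges ahead of time from the calibration data.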

Getting Help

For help with ONNX Runtime, you can start a discussion on GitHub or file an issue.