ONNX Runtime + Windows Dev Kit 2023 = NPU-powered AI

Delivering NPU-powered AI capabilities in your apps

Windows Dev Kit 2023, also known as Project Volterra, lets developers build apps that use the NPU hardware to accelerate AI/ML workloads, delivering AI-enhanced features and experiences without compromising app performance.

You can get started now: the open source, cross-platform ONNX Runtime inference engine gives you access to the NPU and makes it easy to run AI/ML models from popular machine learning frameworks such as PyTorch and TensorFlow.

Get started on your Windows Dev Kit 2023 today

Follow these steps to set up your device to use ONNX Runtime (ORT) with the built-in NPU:

  1. Request access to the Neural Processing SDK for Windows on Snapdragon. Qualcomm may reach out to you via email with further registration instructions for approval.
  2. Once approved, you will receive an email with links to download the SDK, referred to as SNPE (Snapdragon Neural Processing Engine).
    1. Select the SNPE link, which takes you to a Qualcomm login and download page.
    2. Select the Snapdragon_NPE_SDK.WIN.1.0 Installer link, then download and install the SDK.
  3. Download and install the ONNX Runtime with SNPE package.
  4. Start using the ONNX Runtime API in your application.
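
As a rough illustration of step 4, the following Python sketch creates an inference session that prefers the NPU and keeps the CPU as a fallback. It assumes the `onnxruntime` Python package from an SNPE-enabled build and assumes that build registers its provider under the name `SNPEExecutionProvider`; `pick_providers` and `create_session` are hypothetical helper names, not part of ONNX Runtime. Check the documentation for the package you installed before relying on any of these names.

```python
from typing import List, Sequence

# Assumed provider names; confirm them with onnxruntime.get_available_providers().
SNPE_EP = "SNPEExecutionProvider"
CPU_EP = "CPUExecutionProvider"


def pick_providers(available: Sequence[str]) -> List[str]:
    """Return the providers to request, in priority order.

    The SNPE provider comes first when the installed build exposes it,
    and the CPU provider is kept as a fallback.
    """
    preferred = [SNPE_EP, CPU_EP]
    return [name for name in preferred if name in available]


def create_session(model_path: str):
    """Create an InferenceSession for model_path (requires onnxruntime)."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Calling `create_session("model.onnx")` (the file name is a placeholder) returns a session you can run with `session.run(None, {input_name: input_array})`. On a build without SNPE support the same code still works and simply runs on the CPU.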

Optimizing models for the NPU

ONNX is a standard format for representing ML models authored in frameworks like PyTorch, TensorFlow, and others. ONNX Runtime can run any ONNX model; however, to make use of the NPU, you currently need to follow these steps:

  • Run the tools provided in the SNPE SDK on your model to generate a binary file.
  • Include the contents of the binary file as a node in the ONNX graph.

See our C# tutorial for an example of how this is done.
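
For readers who prefer Python, the second step above might look roughly like the sketch below, which embeds the bytes of the SNPE binary as an attribute of a single node and saves the result as an ONNX model. Everything specific here is an assumption to verify against the tutorial and the ONNX Runtime contrib operator documentation: the op type `Snpe`, the `com.microsoft` domain, the attribute names `DLC`, `target_device`, `snpe_version`, and `notes`, and the tensor names `input` and `output`. The helper names are hypothetical, and the sketch requires the `onnx` package.

```python
from pathlib import Path
from typing import Dict, Sequence, Union


def snpe_node_attributes(payload: bytes, target_device: str = "DSP",
                         snpe_version: str = "",
                         notes: str = "") -> Dict[str, Union[bytes, str]]:
    """Collect attributes for the wrapper node, dropping empty optional ones."""
    if not payload:
        raise ValueError("the SNPE binary is empty")
    attrs = {"DLC": payload, "target_device": target_device,
             "snpe_version": snpe_version, "notes": notes}
    return {key: value for key, value in attrs.items() if value}


def wrap_snpe_binary(binary_path: str, input_shape: Sequence[int],
                     output_shape: Sequence[int], out_path: str) -> None:
    """Write an ONNX model whose only node carries the SNPE binary."""
    # Imported lazily so snpe_node_attributes works without onnx installed.
    from onnx import TensorProto, helper, save

    attrs = snpe_node_attributes(Path(binary_path).read_bytes())
    # Assumed op type and domain; tensor names must match the converted model.
    node = helper.make_node("Snpe", inputs=["input"], outputs=["output"],
                            domain="com.microsoft", **attrs)
    graph = helper.make_graph(
        [node], "snpe_wrapper",
        [helper.make_tensor_value_info("input", TensorProto.FLOAT, input_shape)],
        [helper.make_tensor_value_info("output", TensorProto.FLOAT, output_shape)],
    )
    model = helper.make_model(graph, opset_imports=[
        helper.make_opsetid("", 13),
        helper.make_opsetid("com.microsoft", 1),
    ])
    save(model, out_path)
```

The resulting file is an ordinary ONNX model from the application's point of view, so it can be loaded through the same ONNX Runtime session API as any other model.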

Many models can be optimized for the NPU using this process. Even if the SNPE SDK cannot optimize a model for the NPU, ONNX Runtime can still run it on the CPU.


Getting help

For help with ONNX Runtime, you can start a discussion on GitHub or file an issue.