ONNX Runtime Release Roadmap

ONNX Runtime is released on a quarterly basis. Patch releases are published between major releases as necessary.

  • Previous release: 1.20.0 (release date: 11/1/2024)
  • In-progress release: 1.20.1 (release date: 11/20/2024)
  • Next release: 1.21 (release date: Feb. 2025)

Announcements

  • The onnxruntime-gpu v1.10.0 package will be removed from PyPI. We have hit the PyPI project size limit for onnxruntime-gpu, so we are removing our oldest package version to free up the necessary space.
  • ONNX Runtime v1.20.0 is now officially released. For release notes, assets, and more, visit our GitHub Releases page.

Versioning Updates

We plan to upgrade ONNX Runtime support for the following dependencies. Where an entry shows two versions, the first is the highest version previously supported and the second is the version for which support will be added in ORT 1.20.1:

  • QNN SDK 2.27 --> 2.28
  • DirectML 1.15.2 --> 1.16
  • ONNX 1.17 support will be included in a future release.

Major Updates

In addition to various bug fixes and performance improvements, ORT 1.20.1 will include the following updates:

  • CPU FP16 implementation fixes for the following kernels: LayerNormalization, SimplifiedLayerNormalization, SkipLayerNormalization, SkipSimplifiedLayerNormalization.
  • Python quantization tool updates.
  • New QNN SDK version support.

Feature Requests

To request a new ONNX Runtime feature for a future release, please submit a feature request through GitHub Issues or GitHub Discussions.

To ensure that your request is addressed as quickly as possible, please:

  • Include a detailed title.
  • Provide as much detail as possible in the body of your request (e.g., the use case for the feature and the platform(s) or EP(s) it is needed for).
  • Apply a label corresponding to the appropriate ONNX Runtime area (e.g., "platform:mobile", "platform:web", "ep:CUDA") if you know it.

Note: All timelines and features listed on this page are subject to change.

ONNX Runtime 1.20.1

Tentative release date: 11/20/2024

Announcements

  • The onnxruntime-gpu v1.10.0 package will be removed from PyPI. We have hit the PyPI project size limit for onnxruntime-gpu, so we are removing our oldest package version to free up the necessary space.

Build System & Packages

No features planned for 1.20.1. Stay tuned for 1.21 features.

Core

No features planned for 1.20.1. Stay tuned for 1.21 features.

Performance

No features planned for 1.20.1. Stay tuned for 1.21 features.

Quantization

  • Introduce get_int_qdq_config() helper to get QDQ configurations (#22677).
  • Update QDQ Pad, Slice, Softmax (#22676).
  • Handle input models with pre-quantized weights (#22633).
  • Prevent int32 quantized bias from clipping by adjusting the weight's scale (#22020).
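
To illustrate the last item above: a bias is commonly quantized to int32 with a scale equal to the input scale multiplied by the weight scale, so very small scales can push the quantized bias outside the int32 range. The following is a minimal Python sketch of the general idea of enlarging the weight scale until the bias fits. It is not the ONNX Runtime implementation from #22020; the function names quantize_bias and adjust_weight_scale and the sample values are hypothetical.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def quantize_bias(bias, input_scale, weight_scale):
    """Quantize float bias values to int32 with scale = input_scale * weight_scale."""
    bias_scale = input_scale * weight_scale
    quantized = []
    for b in bias:
        q = round(b / bias_scale)
        # Saturate to the int32 range; this is where clipping loses information.
        quantized.append(max(INT32_MIN, min(INT32_MAX, q)))
    return quantized, bias_scale


def adjust_weight_scale(bias, input_scale, weight_scale):
    """Return a weight scale large enough that no bias value clips (hypothetical helper)."""
    max_abs_bias = max(abs(b) for b in bias)
    # Smallest bias scale for which max_abs_bias still maps inside int32.
    min_bias_scale = max_abs_bias / INT32_MAX
    if input_scale * weight_scale >= min_bias_scale:
        return weight_scale  # already safe, leave the weight scale unchanged
    return min_bias_scale / input_scale


# Usage with made-up values: tiny scales make the first bias value clip.
bias = [4.0, -0.5]
clipped, clipped_scale = quantize_bias(bias, 1e-6, 1e-6)
print("dequantized with clipping:", [q * clipped_scale for q in clipped])

new_weight_scale = adjust_weight_scale(bias, 1e-6, 1e-6)
fixed, fixed_scale = quantize_bias(bias, 1e-6, new_weight_scale)
print("dequantized after adjustment:", [q * fixed_scale for q in fixed])
```

The trade-off in this sketch is that a larger weight scale represents the weights themselves more coarsely, so the adjustment is applied only when the bias would otherwise clip.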

EPs

CPU

  • Fix CPU FP16 implementations for the following kernels: LayerNormalization, SimplifiedLayerNormalization, SkipLayerNormalization, SkipSimplifiedLayerNormalization.
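
For reference, the four kernels named above are closely related. The sketch below shows, in plain Python floats, the computation each one is generally understood to perform over a single row of values. It is an illustration only, not the ONNX Runtime kernel code: it ignores the optional inputs and extra outputs of the real operators, it does not model FP16 arithmetic, and the function names and the default epsilon are assumptions chosen for this example.

```python
import math


def layer_norm(x, scale, bias, eps=1e-5):
    """Normalize x to zero mean and unit variance, then apply scale and bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * s + b for v, s, b in zip(x, scale, bias)]


def simplified_layer_norm(x, scale, eps=1e-5):
    """RMS-style normalization: no mean subtraction and no bias term."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * s for v, s in zip(x, scale)]


def skip_layer_norm(x, skip, scale, bias, eps=1e-5):
    """Add the skip (residual) input first, then apply layer normalization."""
    summed = [a + b for a, b in zip(x, skip)]
    return layer_norm(summed, scale, bias, eps)


def skip_simplified_layer_norm(x, skip, scale, eps=1e-5):
    """Add the skip (residual) input first, then apply the simplified variant."""
    summed = [a + b for a, b in zip(x, skip)]
    return simplified_layer_norm(summed, scale, eps)


# Usage with made-up values.
row = [1.0, 2.0, 3.0, 4.0]
residual = [0.5, 0.5, 0.5, 0.5]
ones = [1.0] * len(row)
zeros = [0.0] * len(row)
print(layer_norm(row, ones, zeros))
print(skip_simplified_layer_norm(row, residual, ones))
```

All four variants accumulate sums over the row before taking a reciprocal square root. Those are the steps where the limited range and precision of a 16-bit float are most likely to affect the result.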

QNN

  • QNN SDK 2.28.x support.

DirectML

  • DirectML 1.16 support.

Mobile

No features planned for 1.20.1. Stay tuned for 1.21 features.

Web

No features planned for 1.20.1. Stay tuned for 1.21 features.

generate() API

No features planned for 1.20.1. Stay tuned for 1.21 features.

Extensions

No features planned for 1.20.1. Stay tuned for 1.21 features.

Olive

No features planned for 1.20.1. Stay tuned for 1.21 features.