Generative AI + ONNX Runtime



Integrate the power of generative AI into your apps and services with ONNX Runtime. Broad platform support and deep optimizations make it practical to run state-of-the-art models for image synthesis, text generation, and more.


Stable Diffusion + ONNX Runtime

Use ONNX Runtime to accelerate this popular image generation model.

Benefits

Run Stable Diffusion outside of a Python environment

Speed up inference of Stable Diffusion on NVIDIA and AMD GPUs
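The snippet below is a minimal sketch of the first benefit in practice: exporting Stable Diffusion v1.5 to ONNX with Hugging Face Optimum and running it on ONNX Runtime. It assumes `optimum[onnxruntime]` and `diffusers` are installed; the model ID, prompt, and output paths are illustrative, not prescribed by this page.

```python
from optimum.onnxruntime import ORTStableDiffusionPipeline

# Export the PyTorch checkpoint to ONNX (export=True) and run it with ONNX Runtime.
# The model ID and prompt are examples only.
model_id = "runwayml/stable-diffusion-v1-5"
pipeline = ORTStableDiffusionPipeline.from_pretrained(model_id, export=True)

image = pipeline("a photo of an astronaut riding a horse on the moon").images[0]
image.save("astronaut.png")

# Optionally save the exported ONNX model so later runs skip the export step.
pipeline.save_pretrained("./stable-diffusion-v1-5-onnx")
```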

Performance

Average latency in seconds for the Stable Diffusion v1.5 and v2.1 models:

Stable Diffusion v1.5 latency graphs
Stable Diffusion v2.1 latency graphs

Large Language Models + ONNX Runtime

ONNX Runtime supports many popular large language model (LLM) families from the Hugging Face Model Hub. These, along with thousands of other models, can be exported to ONNX with the Optimum API; a minimal example follows the model list below.

LLaMA, GPT Neo, BLOOM, OPT, GPT-J, FLAN-T5
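As a hedged sketch of the Optimum export path mentioned above, the example below loads a small OPT checkpoint (chosen here only to keep the example quick to run), exports it to ONNX on the fly, and generates text with ONNX Runtime. It assumes `optimum[onnxruntime]` and `transformers` are installed.

```python
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

# A small OPT checkpoint used purely for illustration; the same pattern
# applies to the other supported LLM families listed above.
model_id = "facebook/opt-125m"

# export=True converts the PyTorch weights to ONNX before loading them
# into an ONNX Runtime inference session.
model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("ONNX Runtime makes it easy to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If a command-line workflow is preferred, Optimum also provides an `optimum-cli export onnx` command that performs the same conversion and writes the ONNX model to a local directory.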