PyTorch + ONNX Runtime



PyTorch leads the deep learning landscape with its readily digestible and flexible API, the large number of ready-made models available (particularly in the natural language processing (NLP) domain), and its domain-specific libraries.


Deploy anywhere

Run PyTorch models on cloud, desktop, mobile, IoT, and even in the browser

Boost performance

Accelerate PyTorch models to improve user experience and reduce costs

Improve time to market

Used by Microsoft and many others for their production PyTorch workloads

Why PyTorch + ONNX Runtime?

Native support in PyTorch

PyTorch includes support for ONNX through the torch.onnx APIs to simplify exporting your PyTorch model to the portable ONNX format. The ONNX Runtime team maintains these exporter APIs to ensure a high level of compatibility with PyTorch models.

import torch

# model: any torch.nn.Module; inputs: example tensor(s) used to trace the graph
torch.onnx.export(
    model,
    inputs,
    "model.onnx")

Production Ready

Train and deploy models reliably and at scale using the built-in PyTorch environment within Azure Machine Learning. This lightweight, standalone environment keeps the latest PyTorch version fully supported and includes the components needed, such as ONNX Runtime for Training, to run optimized training for large models.
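
As a rough illustration, the sketch below shows how an existing PyTorch training step might be accelerated by wrapping the model with ORTModule from ONNX Runtime for Training. The torch_ort import path, the toy model, and the synthetic batch are assumptions made for demonstration, not part of the Azure Machine Learning environment itself.

import torch
from torch_ort import ORTModule  # assumes the torch-ort / onnxruntime-training packages are installed

# A toy classifier used purely for illustration
model = torch.nn.Sequential(
    torch.nn.Linear(784, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)
model = ORTModule(model)  # forward and backward passes now run through ONNX Runtime

optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = torch.nn.CrossEntropyLoss()

# Standard PyTorch training step, unchanged apart from the ORTModule wrapper
inputs = torch.randn(32, 784)            # synthetic batch for demonstration
labels = torch.randint(0, 10, (32,))
optimizer.zero_grad()
loss = loss_fn(model(inputs), labels)
loss.backward()
optimizer.step()

The rest of the training loop stays plain PyTorch; only the model wrapper changes.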

Lower latency, higher throughput

Better performance can help improve your user experience and lower your operating costs. A wide range of models, from computer vision (ResNet, MobileNet, Inception, YOLO, super-resolution, etc.) to speech and NLP (BERT, RoBERTa, GPT-2, T5, etc.), can benefit from ONNX Runtime's optimized performance. The ONNX Runtime team regularly benchmarks and optimizes top models for performance. ONNX Runtime also integrates with top hardware accelerator libraries like TensorRT and OpenVINO, so you can get the best performance on the available hardware while using the same common APIs across all your target platforms.
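
A minimal inference sketch is shown below. It assumes an exported model.onnx (such as the one produced above), the onnxruntime package for your hardware (for example onnxruntime-gpu), and an illustrative input shape; adapt the shape and dtype to your own model.

import numpy as np
import onnxruntime as ort

# Use whichever execution providers this onnxruntime build supports
# (e.g. TensorRT, OpenVINO, CUDA); CPUExecutionProvider is always available.
providers = ort.get_available_providers()

session = ort.InferenceSession("model.onnx", providers=providers)

# The input name comes from the exported graph; the example shape is an assumption.
input_name = session.get_inputs()[0].name
batch = np.random.randn(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: batch})

The same script runs unchanged across target platforms; only the installed execution providers differ.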

Get innovations into production faster

Development agility is a key factor in overall costs. ONNX Runtime was built on the experience of taking PyTorch models to production in high-scale services like Microsoft Office, Bing, and Azure. It used to take weeks or months to move a model from R&D to production; with ONNX Runtime, models can be ready to deploy at scale in hours or days.