Note

Go to the end to download the full example code.

Train, convert and predict with ONNX Runtime#

This example demonstrates an end to end scenario starting with the training of a machine learned model to its use in its converted from.

Train a logistic regression#

The first step consists in retrieving the iris dataset.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)

Then we fit a model.

clr = LogisticRegression()
clr.fit(X_train, y_train)

LogisticRegression()

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

We compute the prediction on the test set and we show the confusion matrix.

from sklearn.metrics import confusion_matrix  # noqa: E402

pred = clr.predict(X_test)
print(confusion_matrix(y_test, pred))

[[12  0  0]
 [ 0 13  2]
 [ 0  1 10]]

Conversion to ONNX format#

We use module sklearn-onnx to convert the model into ONNX format.

from skl2onnx import convert_sklearn  # noqa: E402
from skl2onnx.common.data_types import FloatTensorType  # noqa: E402

initial_type = [("float_input", FloatTensorType([None, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type)
with open("logreg_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())

We load the model with ONNX Runtime and look at its input and output.

import onnxruntime as rt  # noqa: E402

sess = rt.InferenceSession("logreg_iris.onnx", providers=rt.get_available_providers())

print(f"input name='{sess.get_inputs()[0].name}' and shape={sess.get_inputs()[0].shape}")
print(f"output name='{sess.get_outputs()[0].name}' and shape={sess.get_outputs()[0].shape}")

input name='float_input' and shape=[None, 4]
output name='output_label' and shape=[None]

We compute the predictions.

input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

import numpy  # noqa: E402

pred_onx = sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]
print(confusion_matrix(pred, pred_onx))

[[12  0  0]
 [ 0 14  0]
 [ 0  0 12]]

The prediction are perfectly identical.

Probabilities#

Probabilities are needed to compute other relevant metrics such as the ROC Curve. Let’s see how to get them first with scikit-learn.

prob_sklearn = clr.predict_proba(X_test)
print(prob_sklearn[:3])

[[4.31955320e-04 1.79410856e-01 8.20157189e-01]
 [2.42853428e-02 9.42734715e-01 3.29799421e-02]
 [8.55674397e-06 4.22854368e-02 9.57706006e-01]]

And then with ONNX Runtime. The probabilities appear to be

prob_name = sess.get_outputs()[1].name
prob_rt = sess.run([prob_name], {input_name: X_test.astype(numpy.float32)})[0]

import pprint  # noqa: E402

pprint.pprint(prob_rt[0:3])

[{0: 0.00043195491889491677, 1: 0.17941081523895264, 2: 0.8201572895050049},
 {0: 0.024285323917865753, 1: 0.9427347183227539, 2: 0.032979946583509445},
 {0: 8.556756256439257e-06, 1: 0.042285431176424026, 2: 0.9577060341835022}]

Let’s benchmark.

from timeit import Timer  # noqa: E402


def speed(inst, number=5, repeat=10):
    timer = Timer(inst, globals=globals())
    raw = numpy.array(timer.repeat(repeat, number=number))
    ave = raw.sum() / len(raw) / number
    mi, ma = raw.min() / number, raw.max() / number
    print(f"Average {ave:1.3g} min={mi:1.3g} max={ma:1.3g}")
    return ave


print("Execution time for clr.predict")
speed("clr.predict(X_test)")

print("Execution time for ONNX Runtime")
speed("sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]")

Execution time for clr.predict
Average 5.4e-05 min=4.53e-05 max=9.73e-05
Execution time for ONNX Runtime
Average 2.18e-05 min=1.73e-05 max=5.07e-05

2.1786320003229774e-05

Let’s benchmark a scenario similar to what a webservice experiences: the model has to do one prediction at a time as opposed to a batch of prediction.

def loop(X_test, fct, n=None):
    nrow = X_test.shape[0]
    if n is None:
        n = nrow
    for i in range(n):
        im = i % nrow
        fct(X_test[im : im + 1])


print("Execution time for clr.predict")
speed("loop(X_test, clr.predict, 50)")


def sess_predict(x):
    return sess.run([label_name], {input_name: x.astype(numpy.float32)})[0]


print("Execution time for sess_predict")
speed("loop(X_test, sess_predict, 50)")

Execution time for clr.predict
Average 0.00191 min=0.00164 max=0.00234
Execution time for sess_predict
Average 0.000329 min=0.000324 max=0.000357

0.0003285247999974672

Let’s do the same for the probabilities.

print("Execution time for predict_proba")
speed("loop(X_test, clr.predict_proba, 50)")


def sess_predict_proba(x):
    return sess.run([prob_name], {input_name: x.astype(numpy.float32)})[0]


print("Execution time for sess_predict_proba")
speed("loop(X_test, sess_predict_proba, 50)")

Execution time for predict_proba
Average 0.00223 min=0.0022 max=0.00232
Execution time for sess_predict_proba
Average 0.000332 min=0.000325 max=0.000355

0.0003315076600028988

This second comparison is better as ONNX Runtime, in this experience, computes the label and the probabilities in every case.

Benchmark with RandomForest#

We first train and save a model in ONNX format.

from sklearn.ensemble import RandomForestClassifier  # noqa: E402

rf = RandomForestClassifier(n_estimators=10)
rf.fit(X_train, y_train)

initial_type = [("float_input", FloatTensorType([1, 4]))]
onx = convert_sklearn(rf, initial_types=initial_type)
with open("rf_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())

We compare.

sess = rt.InferenceSession("rf_iris.onnx", providers=rt.get_available_providers())

def sess_predict_proba_rf(x):
    return sess.run([prob_name], {input_name: x.astype(numpy.float32)})[0]

print("Execution time for predict_proba")
speed("loop(X_test, rf.predict_proba, 50)")

print("Execution time for sess_predict_proba")
speed("loop(X_test, sess_predict_proba_rf, 50)")

Execution time for predict_proba
Average 0.0165 min=0.0163 max=0.0181
Execution time for sess_predict_proba
Average 0.000327 min=0.000319 max=0.00036

0.00032709664001231433

Let’s see with different number of trees.

measures = []

for n_trees in range(5, 51, 5):
    print(n_trees)
    rf = RandomForestClassifier(n_estimators=n_trees)
    rf.fit(X_train, y_train)
    initial_type = [("float_input", FloatTensorType([1, 4]))]
    onx = convert_sklearn(rf, initial_types=initial_type)
    with open(f"rf_iris_{n_trees}.onnx", "wb") as f:
        f.write(onx.SerializeToString())
    sess = rt.InferenceSession(f"rf_iris_{n_trees}.onnx", providers=rt.get_available_providers())

    def sess_predict_proba_loop(x):
        return sess.run([prob_name], {input_name: x.astype(numpy.float32)})[0]  # noqa: B023

    tsk = speed("loop(X_test, rf.predict_proba, 25)", number=5, repeat=4)
    trt = speed("loop(X_test, sess_predict_proba_loop, 25)", number=5, repeat=4)
    measures.append({"n_trees": n_trees, "sklearn": tsk, "rt": trt})

from pandas import DataFrame  # noqa: E402

df = DataFrame(measures)
ax = df.plot(x="n_trees", y="sklearn", label="scikit-learn", c="blue", logy=True)
df.plot(x="n_trees", y="rt", label="onnxruntime", ax=ax, c="green", logy=True)
ax.set_xlabel("Number of trees")
ax.set_ylabel("Prediction time (s)")
ax.set_title("Speed comparison between scikit-learn and ONNX Runtime\nFor a random forest on Iris dataset")
ax.legend()

Speed comparison between scikit-learn and ONNX Runtime For a random forest on Iris dataset

5
Average 0.00702 min=0.00608 max=0.00788
Average 0.000262 min=0.000249 max=0.000287
10
Average 0.00889 min=0.00818 max=0.0107
Average 0.000168 min=0.000159 max=0.000189
15
Average 0.0119 min=0.0109 max=0.0132
Average 0.000171 min=0.000159 max=0.000194
20
Average 0.0145 min=0.0133 max=0.0172
Average 0.000174 min=0.000167 max=0.000194
25
Average 0.0162 min=0.0154 max=0.0174
Average 0.000174 min=0.000164 max=0.000197
30
Average 0.0185 min=0.0179 max=0.0194
Average 0.000175 min=0.000166 max=0.000201
35
Average 0.0206 min=0.0201 max=0.0216
Average 0.000178 min=0.000167 max=0.000197
40
Average 0.023 min=0.0227 max=0.0238
Average 0.000176 min=0.000168 max=0.000198
45
Average 0.0256 min=0.0251 max=0.0264
Average 0.00018 min=0.000171 max=0.000203
50
Average 0.028 min=0.0276 max=0.0289
Average 0.00018 min=0.000173 max=0.000202

<matplotlib.legend.Legend object at 0x77314e9d3d60>

Total running time of the script: (0 minutes 5.199 seconds)

Gallery generated by Sphinx-Gallery

	penalty	'l2'
	dual	False
	tol	0.0001
	C	1.0
	fit_intercept	True
	intercept_scaling	1
	class_weight	None
	random_state	None
	solver	'lbfgs'
	max_iter	100
	multi_class	'deprecated'
	verbose	0
	warm_start	False
	n_jobs	None
	l1_ratio	None