1. Getting Started
Goals
At the end of this tutorial you will know:
- How to export an image model with torch_to_nnef
- The basic commands to check that your model is correct within tract
- How to create a minimal Python program that performs inference with tract
- (Bonus) How to create a minimal Rust binary that performs inference from the CLI with tract
Prerequisites
- PyTorch and Python basics
- Basic Rust knowledge (for the Bonus)
- 15 min to read this page
Step 1. Select an image classifier and an image
Let's start by selecting a model to export from the well-known torchvision library.
- Create a virtual env, install dependencies and assets:
mkdir getting_started_py
cd getting_started_py
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install torch==2.7.0 \
torchvision==0.22.0 \
torch_to_nnef
wget https://upload.wikimedia.org/wikipedia/commons/5/55/Grace_Hopper.jpg
touch export.py
Let's write the following in export.py to load the PyTorch model and the input data, and to run inference on the given image.
import torch
from torchvision import models as vision_mdl
from torchvision.io import read_image

my_image_model = vision_mdl.vit_b_16(pretrained=True)  # (1)!
img = read_image("./Grace_Hopper.jpg")
classification_task = vision_mdl.ViT_B_16_Weights.IMAGENET1K_V1  # (2)!
input_data_sample = classification_task.transforms()(
    img.unsqueeze(0)
)
with torch.no_grad():
    predicted_index = my_image_model(
        input_data_sample
    ).argmax(1).tolist()[0]
print(
    "class id:",
    predicted_index,
    "label: ",
    classification_task.meta["categories"][
        predicted_index
    ],
)
- Selected model is documented here
- The classification task is documented here
Running the file:
The class index predicted with PyTorch (652) must match the tract prediction we will produce in the following steps.
Step 2. Export to NNEF
Let's continue export.py by calling the main export function from this package:
from pathlib import Path
from torch_to_nnef import export_model_to_nnef, TractNNEF

file_path_export = Path("vit_b_16.nnef.tgz")
export_model_to_nnef(  # (1)!
    # any nn.Module
    model=my_image_model,
    # list of model arguments
    # (here simply an example of tensor image)
    args=input_data_sample,
    # filepath to dump NNEF archive
    file_path_export=file_path_export,
    # inference engine to target
    inference_target=TractNNEF(  # (2)!
        # tract version (to ensure compatible operators)
        version="0.21.13",
        # default False; when True, the tract cli is installed
        # on the machine on the fly and the tract output is checked
        # against PyTorch for the provided model and input
        check_io=True,
    ),
    input_names=["input"],
    output_names=["output"],
    # create a debug bundle in case the model export works
    # but NNEF fails in tract
    # (either due to load error or precision mismatch)
    debug_bundle_path=Path("./debug.tgz"),
)
print(f"exported {file_path_export.absolute()}")
And that's it. If we now run our little snippet (full code here), we should observe the following output:
.../site-packages/torch/__init__.py:2132: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert condition, message
aten::size replaced by constant traced value (follows NNEF spec).Keeping dynamism would require dynamic_axes specified.
exported ./vit_b_16.nnef.tgz
But wait, there are 2 tracing warnings here:
- The first is inherent to the tracing mechanism used inside torch_to_nnef: behind the scenes we use PyTorch jit.trace. It can only capture torch control flow, so all Python manipulations that do not happen in PyTorch internals are 'solidified' into a set of fixed values. This also happens if you export a PyTorch model to ONNX with PyTorch's internal tool.
- The second is interesting: it highlights a loss of model expressiveness, because we did not specify that one of the input dimensions is in fact the batch size, a parameter that may vary. We will show how to solve that in the next tutorial (spoiler: we use the same API as the ONNX export to specify dynamic_axes).
Finally, the last line indicates that the model has been correctly exported to disk at: .../vit_b_16.nnef.tgz.
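To make the first warning more concrete, here is a toy illustration in plain Python. It is not how PyTorch jit.trace is implemented; the `Traced` class and the `trace` and `replay` helpers are hypothetical names invented for this sketch. It only shows why a Python-level branch evaluated during tracing ends up frozen in the recorded program.

```python
class Traced:
    """Toy stand-in for a traced tensor: records each operation applied to it."""

    def __init__(self, value, ops):
        self.value = value  # concrete value seen during tracing
        self.ops = ops      # shared list of recorded operations

    def __mul__(self, factor):
        self.ops.append(("mul", factor))
        return Traced(self.value * factor, self.ops)

    def __add__(self, term):
        self.ops.append(("add", term))
        return Traced(self.value + term, self.ops)


def trace(fn, example):
    """Run fn once on an example input and return the recorded operations."""
    ops = []
    fn(Traced(example, ops))
    return ops


def replay(ops, value):
    """Re-execute the recorded operations on a new input."""
    for name, operand in ops:
        value = value * operand if name == "mul" else value + operand
    return value


def model(x):
    # Python-level branch: the tracer only sees the path taken for the example.
    if x.value > 0:
        return x * 2
    return x + 100


recorded = trace(model, example=3)
print(recorded)              # [('mul', 2)]
print(replay(recorded, 5))   # 10, same as the Python function on a positive input
print(replay(recorded, -1))  # -2, whereas the Python function would return 99
```

The recorded program contains only the multiplication, so the `x + 100` branch is lost for any input that differs from the traced example in sign.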
Step 3. tract cli checks
We will now check with the tract cli that everything is working as expected.
Let's first display the help of the command line we downloaded when we checked the IO between tract and PyTorch (in step 2).
If you skipped that step, you can always download the cli manually from the tract release page, or run cargo install tract (which will compile it for your system).
This command line is pretty dense so we will only use part of it today.
Let's first load and dump a profile of our model:
tract ./vit_b_16.nnef.tgz \
--nnef-tract-core \
-O \
dump \
--allow-random-input \
--profile
A lot is happening here:
- tract loads the NNEF registry for the core operators
- it then loads the model
- it declutters and optimizes it (thanks to the -O flag)
- the --allow-random-input flag spares us from providing a concrete input example
- the --profile flag tells the command line that we want to measure the speed of the model
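If you prefer to drive this check from a script, the command can be assembled in Python. The sketch below only uses the flags shown on this page; `tract_dump_command` is a hypothetical helper name, and actually running the command assumes the tract binary is available on your PATH.

```python
import shlex


def tract_dump_command(model_path, profile=True, metal=False):
    """Assemble the argument list for the tract cli call used in this step."""
    cmd = ["tract", model_path, "--nnef-tract-core", "-O"]
    if metal:
        # Per the GPU usage note on this page, --metal goes before the dump subcommand.
        cmd.append("--metal")
    cmd += ["dump", "--allow-random-input"]
    if profile:
        cmd.append("--profile")
    return cmd


cmd = tract_dump_command("./vit_b_16.nnef.tgz")
print(shlex.join(cmd))
# To actually run it (requires the tract binary on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Keeping the arguments as a list avoids shell quoting issues when the model path contains spaces.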
The output on stdout is composed of the following sections, in order:
The computation graph (after decluttering and optimization), with the speed of each operation:
0.000 ms/i 0.0% ┏ 0 Source input
┃ ━━━ 1,3,224,224,F32
....
0.000 ms/i 0.0% ┣ 686 OptMatMulPack output_linear_output.pack_b
┃ ━━━ 1,Opaque 🔍 DynPackedOpaqueFact { k: Val(768), mn: Val(1), packers: [PackedF32[1]@128+1] }
0.046 ms/i 0.0% ┣┻ 688 OptMatMul output_linear_output
━━━ 1,1000,F32
This already tells us how the network is composed and which specialized operator kernels were selected (this display is from an ARM CPU).
Then we have the list of custom properties that have been exported by torch_to_nnef:
* export_cmd: ,String docs/examples/getting_started.py
* export_date: ,String 2025-07-08 ...
* exported_py_class: ,String VisionTransformer
* hostname: ,String ...
* os: ,String ...arm64 Darwin
* py_version: ,String 3.12.9 ...
* torch_to_nnef_version: ,String 0.18.6
* torch_version: ,String 2.6.0
* tract_stage: ,String optimized
* tract_target_version: ,String 0.21.13
* transformers_version: ,String 4.49.0
* user: ,String tuto
This metadata is set automatically when exporting models, and it often comes in handy during debugging sessions.
You can set custom properties with the specific_properties parameter of the TractNNEF init.
Finally the aggregated per operator kind performance is shown:
* OptMatMul 74 nodes: 90.859 ms/i 65.6%
* OptMatMulPack 97 nodes: 12.779 ms/i 9.2%
* Softmax 12 nodes: 10.345 ms/i 7.5%
...
Along with the percentage of time spent (again aggregated per operator kind), you get the total time spent running the network:
On classical networks, matrix multiplication operations should dominate the compute time.
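To check that claim on your own profile, the aggregated lines can be parsed with a few lines of Python. This is a sketch that assumes the summary lines look like the excerpt above (real output may differ in spacing or contain terminal color codes); `parse_op_summary` is a hypothetical helper name.

```python
import re

# Matches lines such as: "* OptMatMul 74 nodes: 90.859 ms/i 65.6%"
LINE_RE = re.compile(
    r"^\*\s+(?P<op>\S+)\s+(?P<nodes>\d+)\s+nodes:\s+"
    r"(?P<ms>[\d.]+)\s+ms/i\s+(?P<pct>[\d.]+)%"
)


def parse_op_summary(text):
    """Return a list of (op, node_count, ms_per_inference, percent) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((
                match["op"],
                int(match["nodes"]),
                float(match["ms"]),
                float(match["pct"]),
            ))
    return rows


sample = """\
* OptMatMul 74 nodes: 90.859 ms/i 65.6%
* OptMatMulPack 97 nodes: 12.779 ms/i 9.2%
* Softmax 12 nodes: 10.345 ms/i 7.5%
"""
rows = parse_op_summary(sample)
# Sum the share of every operator kind whose name contains "MatMul".
matmul_share = sum(pct for op, _, _, pct in rows if "MatMul" in op)
print(rows[0])
print(f"MatMul-related share: {matmul_share:.1f}%")
```

On the excerpt above, the MatMul-related operators account for roughly three quarters of the listed time.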
Info
This command only displays time to run the inference (model load and optimization are not accounted for).
GPU usage
If you have a recent Apple Silicon device, try the same command with --metal added before dump, and observe the speed difference.
Step 5. tract inference with Python
We just created a great NNEF model, and the export checked that PyTorch and tract produce the same output for the same input (thanks to the check_io=True option). That said, you may now wish to interact with it to perform a full-fledged evaluation of the model (to ensure this new inference engine does not give imprecise results on some specific samples).
For this purpose we need to install a new package in our activated venv:
Let's now create a new Python file called run.py:
Let's read our example image again with torchvision and transform it into a NumPy feature matrix. This part is specific to image classification and could be done with any tool you wish (it is not related to tract or torch_to_nnef).
import tract
import numpy as np
from torchvision import models as vision_mdl
from torchvision.io import read_image

img = read_image("./Grace_Hopper.jpg")
classification_task = vision_mdl.ViT_B_16_Weights.IMAGENET1K_V1
input_data_sample = classification_task.transforms()(
    img.unsqueeze(0)
).numpy()
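As a rough idea of what such preprocessing involves, the sketch below applies per-channel normalization in plain Python, using the mean and standard deviation values that appear in the Rust code later on this page. It ignores resizing and cropping, which the torchvision transform also handles, and `normalize_pixel` and `to_chw` are hypothetical helper names.

```python
# Channel statistics (R, G, B), as used in the Rust section of this page.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map one 8-bit RGB pixel to normalized float channels."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, MEAN, STD)
    )


def to_chw(image_hwc):
    """Convert rows of RGB pixels (H x W x 3) to channel-first lists (3 x H x W)."""
    normalized = [[normalize_pixel(px) for px in row] for row in image_hwc]
    return [
        [[px[c] for px in row] for row in normalized]
        for c in range(3)
    ]


tiny_image = [[(0, 0, 0), (255, 255, 255)]]  # 1 x 2 image, for illustration only
chw = to_chw(tiny_image)
print(len(chw), len(chw[0]), len(chw[0][0]))  # 3 1 2
```

In practice you would use NumPy or an image library for this; nested lists are used here only to keep the sketch dependency-free.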
Now we can load the NNEF model with tract, declutter and optimize it:
model = (
    tract.nnef()  # (1)!
    .with_tract_core()
    .model_for_path("./vit_b_16.nnef.tgz")
    .into_optimized()
    .into_runnable()
)
- documentation available here
Finally we can run the inference for the provided input and extract predicted result:
result = model.run([input_data_sample])
confidences = result[0].to_numpy()
predicted_index = np.argmax(confidences)
print(
    "class id:",
    predicted_index,
    "label: ",
    classification_task.meta["categories"][
        predicted_index
    ],
)
And that's it, we can now run our little snippet (full code here).
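To move from this single-image check toward the fuller evaluation mentioned at the start of this step, one simple metric is the fraction of samples on which both engines predict the same class. The sketch below uses made-up score lists instead of real model outputs, and `agreement_rate` is a hypothetical helper name.

```python
def argmax(scores):
    """Index of the largest score in a flat sequence."""
    return max(range(len(scores)), key=scores.__getitem__)


def agreement_rate(reference_outputs, candidate_outputs):
    """Fraction of samples where both engines predict the same class index."""
    pairs = list(zip(reference_outputs, candidate_outputs))
    if not pairs:
        raise ValueError("no samples to compare")
    same = sum(argmax(ref) == argmax(cand) for ref, cand in pairs)
    return same / len(pairs)


# Made-up scores for 3 samples and 3 classes, for illustration only.
pytorch_scores = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.1], [0.0, 0.4, 0.9]]
tract_scores = [[0.1, 1.9, 0.3], [1.4, 0.2, 0.2], [0.0, 0.9, 0.4]]
rate = agreement_rate(pytorch_scores, tract_scores)
print(f"agreement: {rate:.2f}")
```

With real data, the two lists would hold the per-sample outputs of the PyTorch model and of the tract model for the same inputs.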
Congratulations!
Your first export with torch_to_nnef is now done, and you ran a successful standalone tract-based inference with it.
This is sufficient if you intend to use Python only.
Step 6. Making a minimal Rust program
OK, we have our model asset and we confirmed that it runs well from the tract cli and from Python. Now let's integrate it in a Rust program.
We will build it step by step, but note that the code is very similar to this tract example.
cd ..
cargo init getting_started_rs
cd getting_started_rs
cargo add tract-core tract-nnef image
cp ../getting_started_py/vit_b_16.nnef.tgz ./
cp ../getting_started_py/Grace_Hopper.jpg ./
Now compile the project:
We should observe Hello, world! in stdout.
Let's write the core interesting parts in src/main.rs:
Add the prelude from tract_nnef
Replace the type signature of the main (for simplicity of this example):
Inside the main replace println with:
let model = tract_nnef::nnef()
    .with_tract_core()
    .model_for_path("./vit_b_16.nnef.tgz")?
    // optimize the model
    .into_optimized()?
    // make the model runnable and fix its inputs and outputs
    .into_runnable()?;
This code is responsible for loading, decluttering and optimizing the model.
Prepare an image to be ingested by the neural network:
// open image, resize it and make a Tensor out of it
let image = image::open("Grace_Hopper.jpg")?.to_rgb8();
// scale to model input dimension
let resized = image::imageops::resize(
    &image,
    224,
    224,
    ::image::imageops::FilterType::Triangle
);
// normalization step; the tensor is laid out as (batch, channel, row, column),
// while the image crate indexes pixels as (x, y), i.e. (column, row)
let image = tract_ndarray::Array4::from_shape_fn(
    (1, 3, 224, 224),
    |(_, c, y, x)| {
        let mean = [0.485, 0.456, 0.406][c];
        let std = [0.229, 0.224, 0.225][c];
        (resized[(x as _, y as _)][c] as f32 / 255.0 - mean) / std
    }
)
.into_tensor();
Notice that tract uses ndarray to manipulate tensors in its user-facing API.
This tensor is now ready to be run with our tract model:
Let's now get the index of classified class for the image and print it:
// find and display the max value with its index
let best = result[0]
    .to_array_view::<f32>()?
    .iter()
    .cloned()
    .zip(0..)
    .max_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
println!("result: {best:?}");
That's it, our code is complete (your code should now look like this).
You can now rebuild and run the code with cargo:
This should display:
Congratulations, you made it!
You first exported the network with torch_to_nnef and then ran a successful standalone Rust cli command with tract-based inference in it.
Demo 1: Image classifier
Using the knowledge you acquired during this tutorial and a bit of extra work for WASM in Rust, we demo a small EfficientNet B0 image neural network running in your browser (smaller than ViT to ensure a fast download of the asset for you: 22 MB for the model).
Note
This model is not trained by SONOS, so prediction accuracy is the responsibility of the original torchvision authors. Inference performance is decent, but little to no effort was made to make tract WASM efficient (no SIMD WASM, no WebGPU kernels); this demo is for demonstration purposes only.
Curious to read the code behind it? Just look at our example directory here and this raw page content.
Demo 2: Yolo Human Pose Estimator
In the same spirit, here is a slightly more modern model.
Again, see the example directory here and this raw page content.