tract

torch_to_nnef.inference_target.tract

Tools to manipulate tract programmatically.

NOTE: interactions are designed with *nix tty systems in mind; Windows is not supported.

TractBinaryDownloader

TractBinaryDownloader(version: SemanticVersion, auto_download: bool = True)

Tract Downloader.

NOTE: The current version assumes you are using hardware officially supported by tract with pre-built binaries.

arch property
arch

Current OS architecture name, needed to download the tract CLI asset.

dl_tract
dl_tract()

Download the requested tract version into the cache directory.
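
The downloader resolves the host architecture and a cache location before fetching the binary. As an illustration only, a resolver might look like the sketch below; the target-triple format and cache path are assumptions for this example, not the library's actual values:

```python
import platform
from pathlib import Path

def guess_tract_asset_triple() -> str:
    """Map the current machine to a Rust-style target triple,
    the naming scheme pre-built tract release binaries follow."""
    machine = platform.machine().lower()
    arch = {"x86_64": "x86_64", "amd64": "x86_64",
            "arm64": "aarch64", "aarch64": "aarch64"}[machine]
    system = platform.system()
    if system == "Linux":
        return f"{arch}-unknown-linux-musl"  # assumption for this sketch
    if system == "Darwin":
        return f"{arch}-apple-darwin"
    raise RuntimeError(f"no pre-built tract binary assumed for: {system}")

def tract_cache_path(version: str) -> Path:
    # hypothetical cache layout, one directory per tract version
    return Path.home() / ".cache" / "torch_to_nnef" / f"tract-{version}"
```
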

TractCheckTolerance

Bases: str, Enum

Level of tolerated difference between output values of PyTorch and tract.

(those are defined in tract)
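
Subclassing both `str` and `Enum` means each tolerance member compares equal to its string value, so it can be passed directly where a plain string is expected (e.g. as a CLI argument). A minimal sketch of the pattern, with illustrative member names and values that may differ from the real enum:

```python
from enum import Enum

class CheckTolerance(str, Enum):
    # members behave both as enum members and as plain strings
    EXACT = "exact"
    APPROXIMATE = "approximate"

# string comparison works without calling .value
assert CheckTolerance.APPROXIMATE == "approximate"
# lookup by value recovers the member
assert CheckTolerance("exact") is CheckTolerance.EXACT
```
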

TractCli

TractCli(tract_path: Path)

tract invocations through its CLI.

Why not use the Python package that tract has provided for the last few releases?

  • we do not want to be coupled to a Python lib: since the requested version is declared in the API, this would require an automatic package download/import and then a rollback (the original environment may use another version)
assert_io_cmd_str
assert_io_cmd_str(nnef_path: Path, io_npz_path: Path, check_tolerance: TractCheckTolerance = TractCheckTolerance.EXACT)

Assert that an NNEF asset's outputs stay within the tolerance bound when run with tract.

download classmethod
download(version: SemanticVersion) -> TractCli

Download the requested tract version into the cache directory.

TractNNEF

TractNNEF(version: T.Union[str, SemanticVersion], feature_flags: T.Optional[T.Set[TractFeatureFlag]] = None, check_io: bool = True, dynamic_axes: T.Optional[T.Dict[str, T.Dict[int, str]]] = None, specific_tract_binary_path: T.Optional[Path] = None, check_io_tolerance: TractCheckTolerance = TractCheckTolerance.APPROXIMATE, specific_properties: T.Optional[T.Dict[str, str]] = None, dump_identity_properties: bool = True, force_attention_inner_in_f32: bool = False, force_linear_accumulation_in_f32: bool = False, force_norm_in_f32: bool = False, reify_sdpa_operator: bool = False, upsample_with_debox: bool = False)

Bases: InferenceTarget

Tract NNEF inference target.

Init.

Parameters:

Name Type Description Default
version Union[str, SemanticVersion]

tract version targeted for the export

required
feature_flags Optional[Set[TractFeatureFlag]]

set of optional feature flags from tract (for example, complex numbers)

None
check_io bool

check that, for the provided inputs, the tract CLI and the original PyTorch model produce similar outputs

True
dynamic_axes Optional[Dict[str, Dict[int, str]]]

Optional specification of dynamic dimensions. By default the exported model has the shapes of all input and output tensors set to exactly match those given in args. To mark tensor axes as dynamic (i.e. known only at runtime), set dynamic_axes to a dict with the schema: KEY (str): an input or output name; each name must also be provided in input_names or output_names. VALUE (dict or list): if a dict, keys are axis indices and values are axis names; if a list, each element is an axis index.

None
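
Following the schema above, a dynamic_axes value might look like this (the tensor names "input_ids" and "logits" are placeholders that would also have to appear in input_names / output_names; the axis labels "B" and "S" are free-form names chosen here for illustration):

```python
dynamic_axes = {
    "input_ids": {0: "B", 1: "S"},  # dict form: axis index -> axis name
    "logits": [0],                  # list form: axis 0 dynamic, name auto-assigned
}
```
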
specific_tract_binary_path Optional[Path]

filepath of the tract CLI, for custom unreleased builds of tract (for testing purposes)

None
check_io_tolerance TractCheckTolerance

level of tolerated difference between the original output values and those generated by tract (tolerance levels are defined by tract)

APPROXIMATE
specific_properties Optional[Dict[str, str]]

custom tract_properties you wish to add inside the NNEF asset (parsed by tract as metadata)

None
dump_identity_properties bool

add tract_properties describing the user identity (host, username, OS...), helpful for debugging

True
force_attention_inner_in_f32 bool

control whether attention is forced to run in f32 internally (even if all inputs are f16); useful for numerically unstable networks such as Qwen2.5

False
force_linear_accumulation_in_f32 bool

useful for f16 models to ensure that f16 × f16 matmul outputs accumulate in f32.

False
force_norm_in_f32 bool

ensure that all normalization layers run in f32, regardless of the original PyTorch modeling.

False
reify_sdpa_operator bool

enable conversion of scaled_dot_product_attention to a tract operator (instead of an NNEF fragment). Experimental feature.

False
upsample_with_debox bool

use the debox upsample operator instead of deconvolution; this should be faster (if the tract version supports it). Experimental feature.

False
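
Putting the parameters together, a typical export target could be configured as sketched below. The version string and axis names are placeholders, and the import and constructor call are shown commented out so the snippet stands on its own:

```python
# from torch_to_nnef.inference_target.tract import TractNNEF

tract_target_kwargs = dict(
    version="0.21.13",  # placeholder: any released tract version string
    check_io=True,      # compare tract outputs against PyTorch after export
    dynamic_axes={"input_ids": {0: "B", 1: "S"}},  # batch and sequence dynamic
    dump_identity_properties=True,
)
# inference_target = TractNNEF(**tract_target_kwargs)
```
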
post_export
post_export(model: nn.Module, nnef_graph: NGraph, args: T.List[T.Any], exported_filepath: Path, debug_bundle_path: T.Optional[Path] = None)

Perform the IO check and build a debug bundle on failure.

post_trace
post_trace(nnef_graph, active_custom_extensions)

Add dynamic axes in the NNEF graph.

pre_trace
pre_trace(model: nn.Module, input_names: T.Optional[T.List[str]], output_names: T.Optional[T.List[str]])

Check that dynamic_axes are correctly formatted.
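
The kind of validation involved can be sketched as follows (illustrative only, not the actual implementation): every key must be a declared input or output name, and every value must follow the dict-or-list schema described for dynamic_axes above:

```python
def check_dynamic_axes(dynamic_axes, input_names, output_names):
    """Validate the dynamic_axes schema: name -> {axis: label} or [axis, ...]."""
    known = set(input_names or []) | set(output_names or [])
    for name, axes in dynamic_axes.items():
        if name not in known:
            raise ValueError(f"{name!r} is not a declared input/output name")
        if isinstance(axes, dict):
            # dict form: axis indices mapped to axis names
            if not all(isinstance(k, int) and isinstance(v, str)
                       for k, v in axes.items()):
                raise TypeError("dict form must map int axis -> str name")
        elif isinstance(axes, list):
            # list form: bare axis indices
            if not all(isinstance(a, int) for a in axes):
                raise TypeError("list form must contain int axis indices")
        else:
            raise TypeError("dynamic_axes values must be dict or list")

check_dynamic_axes({"x": {0: "B"}}, ["x"], ["y"])  # valid: passes silently
```
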

specific_fragments
specific_fragments(model: nn.Module) -> T.Dict[str, str]

Optional custom fragments to pass.

assert_io

assert_io(model: nn.Module, test_input, nnef_file_path: Path, tract_cli: TractCli, io_npz_path: T.Optional[Path] = None, input_names: T.Optional[T.List[str]] = None, output_names: T.Optional[T.List[str]] = None, check_tolerance: TractCheckTolerance = TractCheckTolerance.EXACT)

Simple assertion without debug bundle.

The model is additionally garbage-collected once its output has been generated.
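
The "capture outputs, then free the model" pattern can be sketched like this (names are illustrative, not the library's actual code):

```python
import gc

def capture_then_free(model, run):
    """Run inference once, keep only the outputs, and reclaim model memory."""
    outputs = run(model)  # run inference, keeping only the outputs
    del model             # drop the local reference to the model...
    gc.collect()          # ...and reclaim its memory eagerly
    return outputs
```
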

assert_io_and_debug_bundle

assert_io_and_debug_bundle(model: nn.Module, test_input, nnef_file_path: Path, tract_cli: TractCli, io_npz_path: T.Optional[Path] = None, debug_bundle_path: T.Optional[Path] = None, input_names: T.Optional[T.List[str]] = None, output_names: T.Optional[T.List[str]] = None, check_tolerance: TractCheckTolerance = TractCheckTolerance.EXACT)

Core check to ensure tract gives the same output as PyTorch within tolerance bounds.

debug_dumper_pytorch_to_onnx_to_nnef

debug_dumper_pytorch_to_onnx_to_nnef(model: nn.Module, test_input, target_folder: Path, tract_cli: TractCli, raise_export_error: bool = True) -> bool

Try to export the model with ONNX and convert the ONNX to NNEF via tract.

Used when building the debug bundle (if it works, it gives a valuable reference for debugging T2N).