torch_to_nnef.tensor

NamedTensor

NamedTensor(fp_tensor: torch.Tensor, nnef_name: str)

Bases: Tensor

Tensor enriched with a name attribute.

data property writable
data

Overriding this property is very important to keep access to all special attributes of NamedTensor.
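The idea behind the writable `data` property can be sketched in plain Python (this is NOT the torch_to_nnef implementation, which subclasses torch.Tensor; `NamedArray` below is a hypothetical stand-in): a container carrying an extra `nnef_name` attribute, whose `data` setter replaces contents in place so the name survives.

```python
# Illustrative sketch only: a container subclass carrying a name,
# mirroring how NamedTensor enriches torch.Tensor.
class NamedArray(list):
    def __init__(self, values, nnef_name: str):
        super().__init__(values)
        self.nnef_name = nnef_name

    @property
    def data(self):
        # Returning `self` keeps special attributes such as
        # `nnef_name` reachable from the returned object.
        return self

    @data.setter
    def data(self, values):
        self[:] = values  # replace contents in place; the name survives


t = NamedArray([1.0, 2.0], "encoder.weight")
t.data = [3.0, 4.0]
assert t.nnef_name == "encoder.weight"
assert list(t) == [3.0, 4.0]
```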

OffloadedTensor

OffloadedTensor(elem, device, offload_dir: Path, name: str, offloaded_tensor_type: T.Type[torch.Tensor], force_gc_collect: bool = False)

Bases: OpaqueTensor

Tensor subclass that maintains data on disk.

It holds a permanent virtual internal storage (on disk) and creates a temporary instantiation on the targeted device each time an operation accesses it.

Warning

We recommend a PyTorch version > 1.12 for best compatibility.
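The offload/reload cycle can be sketched with the standard library alone (the real OffloadedTensor is a torch.Tensor subclass with far more machinery; `OffloadedValue` below is a hypothetical illustration): the payload lives permanently on disk and is materialized only when accessed.

```python
# Minimal sketch of disk offloading: permanent on-disk storage,
# temporary in-memory instantiation on access.
import pickle
import tempfile
from pathlib import Path


class OffloadedValue:
    def __init__(self, values, offload_dir: Path, name: str):
        self.path = offload_dir / f"{name}.pkl"
        with self.path.open("wb") as f:
            pickle.dump(values, f)  # permanent on-disk storage

    def load(self):
        # Temporary in-memory instantiation for the current operation.
        with self.path.open("rb") as f:
            return pickle.load(f)


offload_dir = Path(tempfile.mkdtemp())
weight = OffloadedValue([1.0, 2.0, 3.0], offload_dir, "weight")
assert weight.load() == [1.0, 2.0, 3.0]
```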

is_meta property
is_meta: bool

Whether the tensor is on the meta device.

Always False as the tensor is (off|re)loaded from disk.

from_original_tensor classmethod
from_original_tensor(tensor: torch.Tensor, name: str, offload_dir: T.Optional[Path] = None, suffix_log_msg: str = '')

Take a torch.Tensor or OpaqueTensor and offload it to disk.

Parameters:

Name Type Description Default
tensor Tensor

the torch.Tensor or torch_to_nnef.tensor.OpaqueTensor to dump on disk

required
name str

the name of the tensor, used to build the filename stored on disk

required
offload_dir Optional[Path]

The directory where this file will be stored (temporarily)

None
suffix_log_msg str

Message suffix added to logs for context

''
to
to(*args, **kwargs)

Change the target device when reloaded in memory.

update_values
update_values(values: torch.Tensor, strict_shape: bool = True, strict_dtype: bool = True)

Replace the offloaded tensor with the new 'values' tensor.

Parameters:

Name Type Description Default
values Tensor

The tensor that will replace it on disk. Assertions are made to ensure it has the same shape and dtype as the prior tensor.

required
strict_shape bool

if True (default) the shape of the new tensor must be the same as the prior one

True
strict_dtype bool

if True (default) the dtype of the new tensor must be the same as the prior one

True
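The strict-check behavior can be sketched as follows (plain Python; the real method compares torch shapes and dtypes before rewriting the file on disk, and `check_replacement` is a hypothetical name):

```python
# Sketch of the strict shape/dtype guards applied before replacing
# an offloaded tensor's values.
def check_replacement(old, new, strict_shape=True, strict_dtype=True):
    if strict_shape and len(old) != len(new):
        raise ValueError("new values must keep the prior shape")
    if strict_dtype and any(type(o) is not type(n) for o, n in zip(old, new)):
        raise ValueError("new values must keep the prior dtype")
    return new


assert check_replacement([1.0, 2.0], [3.0, 4.0]) == [3.0, 4.0]
try:
    check_replacement([1.0, 2.0], [3.0])  # shape mismatch
    raise AssertionError("should have raised")
except ValueError:
    pass
```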

OpaqueTensorRef

OpaqueTensorRef(meta_tensor: torch.Tensor, opaque_tensor: OpaqueTensor)

Bases: Tensor

Allows passing through 'tracing'.

QScalePerGroupF16

QScalePerGroupF16(group_size: int, scale: torch.Tensor, n_bits: int)

Bases: QScheme

Per-group quantization with an f16 scale only.

Aligned with tract, which uses negative scales.

QTensor

QTensor(fp_tensor: torch.Tensor, qscheme: QScheme, dequant_to_dtype=torch.float32, u8_compressors: T.Optional[T.List[U8Compressor]] = None)

Bases: OpaqueTensor

Common interface for all compressed storage.

to_device
to_device(new_device)

Specific device handling.

write_in_file
write_in_file(dirpath: T.Union[str, Path], label: str)

Called at NNEF write time.

Each specific inference engine format should implement its preferred file dump.

QTensorTractScaleOnly

QTensorTractScaleOnly(*args, specific_machine: T.Optional[str] = None, **kwargs)

Bases: QTensorTract

The tract data format it serializes to: Q4_0.

decompress
decompress()

Tract dequantization depends on hardware.

Typically, dequantization happens with f16 ops on ARM and in f32 (scale directly cast) on other hardware, so we override the function to stay consistent with tract.
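Why the two hardware paths diverge can be shown with stdlib Python (an illustration, not tract's code): rounding the scale through IEEE half precision, via `struct`'s `"e"` format, mimics the f16 path, while keeping it a Python float mimics the direct f32 cast.

```python
# Sketch of hardware-dependent dequantization of a scale-only scheme.
import struct


def to_f16(x: float) -> float:
    # Round-trip through IEEE 754 half precision.
    return struct.unpack("e", struct.pack("e", x))[0]


def dequant(qvals, scale, f16_ops: bool):
    s = to_f16(scale) if f16_ops else scale
    return [q * s for q in qvals]


# The two paths can differ slightly, hence the override so results
# match tract on every platform:
assert to_f16(0.1) != 0.1
assert dequant([10], 0.1, f16_ops=True) != dequant([10], 0.1, f16_ops=False)
```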

apply_name_to_tensor_in_module

apply_name_to_tensor_in_module(model: torch.nn.Module)

Transform torch.Tensor or Parameters into NamedTensor.

This is applied at torch_to_nnef export time, just before any tracing, and allows keeping variable naming identical to PyTorch's.

This consistent naming unlocks subsequent manipulations, such as applying LoRA at inference.
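The naming walk can be sketched in plain Python (the real function iterates torch.nn.Module submodules and parameters, wrapping each tensor in a NamedTensor; the nested-dict model and `collect_param_names` below are hypothetical stand-ins). Dotted paths mirror PyTorch's parameter naming:

```python
# Sketch of building PyTorch-style dotted names for every leaf
# parameter in a nested module structure.
def collect_param_names(module: dict, prefix: str = "") -> dict:
    names = {}
    for key, child in module.items():
        full = f"{prefix}.{key}" if prefix else key
        if isinstance(child, dict):  # sub-module: recurse
            names.update(collect_param_names(child, full))
        else:  # leaf parameter: record its dotted name
            names[full] = child
    return names


model = {"encoder": {"linear": {"weight": [1.0], "bias": [0.0]}}}
names = collect_param_names(model)
assert sorted(names) == ["encoder.linear.bias", "encoder.linear.weight"]
```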

set_opaque_tensor_in_params_as_ref

set_opaque_tensor_in_params_as_ref(model: torch.nn.Module)

Transform OpaqueTensor Parameters into OpaqueTensorRef.

This is applied at torch_to_nnef export time, just before any tracing.