torch_to_nnef.tensor.quant
Advanced QTensor (<= 8 bits) supporting complex quantization schemes that are not native to torch.
QScalePerGroupF16
QScheme
QTensor
QTensor(fp_tensor: torch.Tensor, qscheme: QScheme, dequant_to_dtype=torch.float32, u8_compressors: T.Optional[T.List[U8Compressor]] = None)
Bases: OpaqueTensor
Common interface for all compressed tensor storage.
QTensorTract
QTensorTractScaleOnly
Bases: QTensorTract
Serializes to the tract data format Q4_0.
U8Compressor
Abstract class for adding u8 compression methods.
This can be used to:

- bit-pack elements below 8 bits
- apply a classic compression algorithm

A subclassing sketch is given after the method reference below.

Warning: the .shape of the compressed u8_tensor must be the same as its .shape once decompressed.
compress
abstractmethod
Compress a u8 tensor (into u8).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| u8_tensor | Tensor | tensor to be compressed with dtype torch.uint8 | required |

Returns: compressed tensor with dtype torch.uint8
decompress
abstractmethod
Decompress a u8 torch tensor (into u8).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| u8_tensor | Tensor | compressed tensor with dtype torch.uint8 | required |

Returns: decompressed tensor with dtype torch.uint8
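For illustration, below is a minimal sketch of a custom U8Compressor subclass. It assumes U8Compressor is importable from torch_to_nnef.tensor.quant, that compress and decompress are its only abstract methods, and that both are instance methods taking a single torch.uint8 tensor. The "compression" here is just a reversible bitwise NOT used as a stand-in for a real algorithm, so the shape constraint above trivially holds.

```python
import torch

# Assumed import path, matching this module's documentation.
from torch_to_nnef.tensor.quant import U8Compressor


class BitwiseNotCompressor(U8Compressor):
    """Toy compressor: a reversible bitwise NOT standing in for real compression.

    Keeps dtype torch.uint8 and the exact same .shape, as required by the
    U8Compressor contract described above.
    """

    def compress(self, u8_tensor: torch.Tensor) -> torch.Tensor:
        assert u8_tensor.dtype == torch.uint8
        return u8_tensor ^ 0xFF  # invert every byte (shape unchanged)

    def decompress(self, u8_tensor: torch.Tensor) -> torch.Tensor:
        assert u8_tensor.dtype == torch.uint8
        return u8_tensor ^ 0xFF  # inverting again restores the original bytes


# Round-trip check on random uint8 data.
data = torch.randint(0, 256, (4, 8), dtype=torch.uint8)
comp = BitwiseNotCompressor()
assert torch.equal(comp.decompress(comp.compress(data)), data)
```

An instance of such a subclass could then be passed to QTensor through its u8_compressors argument, per the constructor signature shown above.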