offload
torch_to_nnef.tensor.offload
Offload Tensor.
Tensor subclass to work around memory limits on various devices by offloading to disk or to a different 'memory' than the final one.
It holds an internal memory storage (permanent) and a temporary instantiation, on the targeted device, at each operation that accesses it.
HuggingFace 'accelerate' difference
This differs from HuggingFace 'accelerate', which lays out your network across the available devices once but then prevents moving data to another device afterward.
Indeed, we use the torch Tensor subclass API instead of torch.device("meta"), which allows holding more information such as the final targeted device.
This avoids any need for the hooking system used in accelerate, and removes the need to align the data-flow graph with pre- and post-casting.
In short, it is transparent for the end user, who can use these like read-only, device-movable tensors (mutation support could be envisioned if needed).
OffloadedTensor
OffloadedTensor(elem, device, offload_dir: Path, name: str, offloaded_tensor_type: T.Type[torch.Tensor], force_gc_collect: bool = False)
Bases: OpaqueTensor
Tensor subclass that maintains data on disk.
It holds a virtual internal memory storage (permanent) and a temporary instantiation, on the targeted device, at each operation that accesses it.
Warning
We recommend a PyTorch version > 1.12 for best compatibility.
is_meta
property
Whether the tensor is on the meta device.
Always False as the tensor is (off|re)loaded from disk.
from_original_tensor
classmethod
from_original_tensor(tensor: torch.Tensor, name: str, offload_dir: T.Optional[Path] = None, suffix_log_msg: str = '')
Take a torch.Tensor or OpaqueTensor and offload it to disk.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tensor | Tensor | The torch.Tensor or torch_to_nnef.tensor.OpaqueTensor to dump on disk. | required |
name | str | The name of the tensor, used to build the filename stored on disk. | required |
offload_dir | Optional[Path] | The directory where this file will be stored (temporarily). | None |
suffix_log_msg | str | Message suffix added to logs for context. | '' |
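Example (a minimal sketch; the transparent reload-on-access behavior follows the class description above, and the offload directory is a placeholder):

```python
from pathlib import Path

import torch

from torch_to_nnef.tensor.offload import OffloadedTensor

weight = torch.randn(4096, 4096)

# Dump the tensor to disk; only a lightweight handle stays in RAM.
off_weight = OffloadedTensor.from_original_tensor(
    weight,
    name="decoder.layer0.weight",
    offload_dir=Path("/tmp/t2n_offload"),
)

# Per the class description, each operation temporarily reloads the data
# on the targeted device before computing.
result = off_weight @ torch.randn(4096, 8)
print(result.shape)  # torch.Size([4096, 8])
```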
update_values
Replace the offloaded tensor with a new 'values' tensor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
values | Tensor | The tensor that will replace the offloaded one on disk; assertions ensure it matches the prior shape and dtype. | required |
strict_shape | bool | If True (default), the shape of the new tensor must match the prior one. | True |
strict_dtype | bool | If True (default), the dtype of the new tensor must match the prior one. | True |
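Example (a sketch based on the parameter table above; tensor names and the offload directory are placeholders):

```python
from pathlib import Path

import torch

from torch_to_nnef.tensor.offload import OffloadedTensor

off_bias = OffloadedTensor.from_original_tensor(
    torch.ones(128),
    name="decoder.layer0.bias",
    offload_dir=Path("/tmp/t2n_offload"),
)

# Same shape and dtype: passes the default strict checks.
off_bias.update_values(torch.zeros(128))

# Different dtype: only allowed when the strict dtype check is relaxed.
off_bias.update_values(torch.zeros(128, dtype=torch.float16), strict_dtype=False)
```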
ctx_maybe_load_from_disk_as_offloaded
Context manager to force safetensors/torch.load to offload to disk.
Example:
Within this context, every loaded tensor is offloaded to disk as soon as possible.
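A hedged sketch of the intended usage (whether the context manager takes arguments is not documented here, so the no-argument form is an assumption):

```python
import torch

from torch_to_nnef.tensor.offload import ctx_maybe_load_from_disk_as_offloaded

# Inside the context, tensors read through safetensors/torch.load are
# offloaded to disk as soon as possible instead of staying in RAM.
with ctx_maybe_load_from_disk_as_offloaded():
    state_dict = torch.load("model.bin", map_location="cpu")
```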
load_state_dict
load_state_dict(checkpoint_file, device_map=None, offload_dir: T.Optional[Path] = None, apply_offload: bool = False)
Load a checkpoint from a given file.
If the checkpoint is in the safetensors format and a device map is passed, the weights can be fast-loaded directly on the GPU.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
checkpoint_file | `str` | The path to the checkpoint to load. | required |
device_map | `Dict[str, Union[int, str, torch.device]]`, *optional* | A map that specifies where each submodule should go. It doesn't need to be refined down to each parameter/buffer name; once a given module name is in the map, every submodule of it will be sent to the same device. | None |
offload_dir | Optional[Path] | Optional directory in which offloaded tensors are stored. | None |
apply_offload | bool | If activated, each loaded tensor is offloaded as soon as possible (disabled in most cases to let set_module_tensor_to_device cast the dtype directly in memory). | False |
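Example (a sketch following the signature above; the checkpoint path and device map are placeholders, and using "disk" as a device-map value is an assumption inferred from the offload_dir descriptions in this section):

```python
from pathlib import Path

from torch_to_nnef.tensor.offload import load_state_dict

state_dict = load_state_dict(
    "model.safetensors",
    device_map={"encoder": "cpu", "decoder": "disk"},
    offload_dir=Path("/tmp/t2n_offload"),
    apply_offload=False,  # keep tensors in memory so dtype casting can happen there
)
```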
safe_load_file
safe_load_file(filename: T.Union[str, os.PathLike], device: TDEVICE = 'cpu', offload_dir: T.Optional[Path] = None, apply_offload: bool = False) -> T.Dict[str, torch.Tensor]
Loads a safetensors file into torch format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filename | `str`, or `os.PathLike` | The name of the file which contains the tensors. | required |
device | `Union[str, int]`, *optional*, defaults to `cpu` | The device where the tensors need to be located after load. Available options are all regular torch device locations. | 'cpu' |
offload_dir | Optional[Path] | Location where tensors with device 'disk' will be offloaded. | None |
apply_offload | bool | Whether the offload is applied or the tensors are left on CPU. | False |
Returns:
Type | Description |
---|---|
Dict[str, Tensor] | Dictionary with the tensor names as keys and the tensors as values. |
Example:
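A usage sketch based on the signature above (file name and offload directory are placeholders):

```python
from pathlib import Path

from torch_to_nnef.tensor.offload import safe_load_file

tensors = safe_load_file(
    "model.safetensors",
    device="cpu",
    offload_dir=Path("/tmp/t2n_offload"),
    apply_offload=True,  # offload each tensor to disk as it is read
)
for name, tensor in tensors.items():
    print(name, tuple(tensor.shape))
```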
set_module_tensor_to_device
set_module_tensor_to_device(mod_updater: ModTensorUpdater, tensor_name: str, device: TDEVICE, value: T.Optional[torch.Tensor] = None, dtype: T.Optional[T.Union[str, torch.dtype]] = None, offload_dir: T.Optional[Path] = None)
A helper function to set a given tensor (parameter or buffer) of a module on a device.
(Note that doing param.to(device) creates a new tensor not linked to the parameter, which is why we need this function.)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mod_updater | `ModTensorUpdater` | The module updater instance that contains the module. | required |
tensor_name | `str` | The full name of the parameter/buffer. | required |
device | `int`, `str` or `torch.device` | The device on which to set the tensor. | required |
value | `torch.Tensor`, *optional* | The value of the tensor (useful when going from the meta device to any other device). | None |
dtype | `torch.dtype`, *optional* | If set, the value of the parameter will be cast to this dtype. | None |
offload_dir | Optional[Path] | The directory where tensors offloaded to disk will be stored. | None |
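Example (a hedged sketch: the ModTensorUpdater constructor is not documented in this section, so its construction below is hypothetical):

```python
import torch

from torch_to_nnef.tensor.offload import set_module_tensor_to_device

model = torch.nn.Linear(16, 16)
updater = ModTensorUpdater(model)  # hypothetical: wraps the module to update

# Re-assign the "weight" parameter on CPU with a float16 cast, keeping the
# module <-> parameter link intact (unlike a bare `param.to(device)`).
set_module_tensor_to_device(
    updater,
    "weight",
    device="cpu",
    dtype=torch.float16,
)
```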
t2n_load_checkpoint_and_dispatch
t2n_load_checkpoint_and_dispatch(model: nn.Module, checkpoint: Path, device_map: T.Optional[T.Union[str, T.Dict[str, T.Union[str, int, torch.device]]]], offload_dir: Path, strict: bool = False, offload_at_load_state_dict: bool = False)
Allows offloading as soon as possible.
This may be beneficial in the rare cases where partitioned safetensors files are too big for RAM; otherwise it is better to offload after the dtype cast in set_module_tensor_to_device.
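Example (a sketch following the signature above; the model, checkpoint path, and device map are placeholders, and "disk" as a device-map value is an assumption):

```python
from pathlib import Path

import torch

from torch_to_nnef.tensor.offload import t2n_load_checkpoint_and_dispatch

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.Linear(1024, 1024),
)

t2n_load_checkpoint_and_dispatch(
    model,
    checkpoint=Path("checkpoint_dir/model.safetensors"),
    device_map={"0": "cpu", "1": "disk"},
    offload_dir=Path("/tmp/t2n_offload"),
    strict=False,
    offload_at_load_state_dict=False,  # offload after the in-memory dtype cast
)
```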