offload

torch_to_nnef.tensor.offload

Offload Tensor.

Tensor subclass to work around memory limits on various devices by offloading to disk or to a different 'memory' than the final one.

It holds a permanent internal storage and creates a temporary instantiation on the targeted device at each operation accessing it.

Difference from HuggingFace 'accelerate'

This is different from HuggingFace 'accelerate', which spreads the layout of your network across the available devices once, but prevents moving data to other devices afterward.

Indeed, we use the torch `Tensor` subclass API instead of `torch.device("meta")`, which allows holding more information such as the final targeted device (among other things).

This avoids any need for the hooking system used in accelerate, and skips the need to align the data-flow graph with pre- and post-casting.

In short, it is transparent for end users, who can use these like read-only, device-movable tensors (mutation support could be envisioned if needed).
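
For illustration, a minimal sketch of this transparency (the tensor shape, name, and offload directory below are illustrative assumptions):

```python
from pathlib import Path

import torch

from torch_to_nnef.tensor.offload import OffloadedTensor

# Illustrative weight; in practice this would be a large model parameter.
weight = torch.randn(1024, 1024)

# Dump the data to disk; only a lightweight handle stays in memory.
off_weight = OffloadedTensor.from_original_tensor(
    weight,
    name="lm_head.weight",  # used to build the on-disk filename
    offload_dir=Path("/tmp/t2n_offload"),  # assumption: any writable dir
)

# Any operation transparently reloads a temporary copy on the target
# device, computes, then frees it.
out = off_weight @ torch.randn(1024, 4)
```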

OffloadedTensor

OffloadedTensor(elem, device, offload_dir: Path, name: str, offloaded_tensor_type: T.Type[torch.Tensor], force_gc_collect: bool = False)

Bases: OpaqueTensor

Tensor subclass that maintains data on disk.

It holds a permanent virtual internal storage and creates a temporary instantiation on the targeted device at each operation accessing it.

Warning

We recommend PyTorch > 1.12 for best compatibility.

is_meta property
is_meta: bool

Whether the tensor is on the meta device.

Always `False`, as the tensor is (off/re)loaded from disk.

from_original_tensor classmethod
from_original_tensor(tensor: torch.Tensor, name: str, offload_dir: T.Optional[Path] = None, suffix_log_msg: str = '')

Take a torch.Tensor or OpaqueTensor and offload it to disk.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `tensor` | `Tensor` | The `torch.Tensor` or `torch_to_nnef.tensor.OpaqueTensor` to dump on disk. | required |
| `name` | `str` | The name of the tensor, used to build the filename stored on disk. | required |
| `offload_dir` | `Optional[Path]` | The directory where the file will be stored (temporarily). | `None` |
| `suffix_log_msg` | `str` | Message suffix appended to logs for context. | `''` |
to
to(*args, **kwargs)

Change the target device used when the tensor is reloaded in memory.
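
For example (a sketch building on the hypothetical `off_weight` above; CUDA availability is an assumption):

```python
# Retarget reloads to CUDA; the permanent storage stays on disk.
off_weight = off_weight.to("cuda:0")

# The next operation materializes the temporary copy on cuda:0.
total = off_weight.sum()
```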

update_values
update_values(values: torch.Tensor, strict_shape: bool = True, strict_dtype: bool = True)

Replace the offloaded tensor with the new 'values' tensor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `values` | `Tensor` | The tensor that will replace the offloaded one on disk. Assertions are made to ensure the same shape and dtype as the prior tensor. | required |
| `strict_shape` | `bool` | If `True` (default), the shape of the new tensor must match the prior one. | `True` |
| `strict_dtype` | `bool` | If `True` (default), the dtype of the new tensor must match the prior one. | `True` |
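
A short sketch (reusing the hypothetical `off_weight` from above):

```python
import torch

# Rewrite the on-disk payload; shape and dtype checks stay enabled.
off_weight.update_values(torch.zeros(1024, 1024))

# Relax the shape check when intentionally resizing (assumption:
# downstream consumers can handle the new shape).
off_weight.update_values(torch.zeros(2048, 1024), strict_shape=False)
```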

ctx_maybe_load_from_disk_as_offloaded

ctx_maybe_load_from_disk_as_offloaded(offload_dir: T.Optional[T.Union[str, Path]] = None)

Context manager to force safetensors/torch_load to offload to disk.

Example:

```python
with ctx_maybe_load_from_disk_as_offloaded():
    if filename.endswith(".safetensors"):
        adapters_weights = safe_load_file(filename, device="cpu")
    else:
        adapters_weights = torch_load(
            filename,
            map_location=torch.device(device)
        )
```

This will offload every tensor to disk as soon as possible.

load_state_dict

load_state_dict(checkpoint_file, device_map=None, offload_dir: T.Optional[Path] = None, apply_offload: bool = False)

Load a checkpoint from a given file.

If the checkpoint is in the safetensors format and a device map is passed, the weights can be fast-loaded directly on the GPU.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `checkpoint_file` | `str` | The path to the checkpoint to load. | required |
| `device_map` | `Dict[str, Union[int, str, torch.device]]`, *optional* | A map that specifies where each submodule should go. It doesn't need to be refined down to each parameter/buffer name; once a given module name is in the map, every submodule of it will be sent to the same device. | `None` |
| `offload_dir` | `Optional[Path]` | Offload directory where tensors are stored. | `None` |
| `apply_offload` | `bool` | If activated, each loaded tensor is offloaded as soon as possible (disabled in most cases to let `set_module_tensor_to_device` cast dtypes in memory directly). | `False` |
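
A usage sketch (the checkpoint path and device map below are illustrative):

```python
from pathlib import Path

from torch_to_nnef.tensor.offload import load_state_dict

state_dict = load_state_dict(
    "model.safetensors",       # illustrative checkpoint path
    device_map={"": "cpu"},    # send every submodule to CPU
    offload_dir=Path("/tmp/t2n_offload"),
    apply_offload=True,        # offload each tensor immediately
)
```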

safe_load_file

safe_load_file(filename: T.Union[str, os.PathLike], device: TDEVICE = 'cpu', offload_dir: T.Optional[Path] = None, apply_offload: bool = False) -> T.Dict[str, torch.Tensor]

Loads a safetensors file into torch format.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `filename` | `str` or `os.PathLike` | The name of the file which contains the tensors. | required |
| `device` | `Union[str, int]`, *optional* | The device where the tensors need to be located after load. Available options are all regular torch device locations. | `'cpu'` |
| `offload_dir` | `Optional[Path]` | Location where tensors with device "disk" will be offloaded. | `None` |
| `apply_offload` | `bool` | Whether offload is applied or tensors are left on CPU. | `False` |

Returns:

| Type | Description |
| --- | --- |
| `Dict[str, torch.Tensor]` | Dictionary that contains tensor names as keys and `torch.Tensor` values. |

Example:

```python
from torch_to_nnef.tensor.offload import safe_load_file

file_path = "./my_folder/bert.safetensors"
loaded = safe_load_file(file_path)
```

set_module_tensor_to_device

set_module_tensor_to_device(mod_updater: ModTensorUpdater, tensor_name: str, device: TDEVICE, value: T.Optional[torch.Tensor] = None, dtype: T.Optional[T.Union[str, torch.dtype]] = None, offload_dir: T.Optional[Path] = None)

A helper function to set a given tensor (parameter or buffer) to a device.

(Note that doing `param.to(device)` creates a new tensor not linked to the parameter, which is why we need this function.)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `mod_updater` | `ModTensorUpdater` | The module updater instance that contains the module. | required |
| `tensor_name` | `str` | The full name of the parameter/buffer. | required |
| `device` | `int`, `str` or `torch.device` | The device on which to set the tensor. | required |
| `value` | `torch.Tensor`, *optional* | The value of the tensor (useful when going from the meta device to any other device). | `None` |
| `dtype` | `torch.dtype`, *optional* | If set, the value of the parameter will be cast to this dtype. Otherwise, `value` will be cast to the dtype of the existing parameter in the model. | `None` |
| `offload_dir` | `Optional[Path]` | The directory where tensors offloaded to disk will be stored. | `None` |
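
A sketch of a typical call (assumption: `ModTensorUpdater` is built from the target module; its exact constructor is not documented on this page, and the tensor name is illustrative):

```python
import torch

# Assumption: ModTensorUpdater wraps the module whose tensor we update.
mod_updater = ModTensorUpdater(model)

set_module_tensor_to_device(
    mod_updater,
    "encoder.layer.0.attention.weight",  # illustrative tensor name
    device="cpu",
    dtype=torch.float16,   # cast while setting
    offload_dir=None,      # keep in memory, no disk offload
)
```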

t2n_load_checkpoint_and_dispatch

t2n_load_checkpoint_and_dispatch(model: nn.Module, checkpoint: Path, device_map: T.Optional[T.Union[str, T.Dict[str, T.Union[str, int, torch.device]]]], offload_dir: Path, strict: bool = False, offload_at_load_state_dict: bool = False)

Allows offloading as soon as possible.

This may be beneficial in rare cases where partitioned safetensors files are too big for RAM; otherwise it is better to offload after the dtype cast in `set_module_tensor_to_device`.
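
An end-to-end sketch (the model, checkpoint path, and device map are illustrative):

```python
from pathlib import Path

import torch.nn as nn

from torch_to_nnef.tensor.offload import t2n_load_checkpoint_and_dispatch

model = nn.Linear(1024, 1024)  # stand-in for a real network

t2n_load_checkpoint_and_dispatch(
    model,
    checkpoint=Path("weights/model.safetensors"),  # illustrative path
    device_map={"": "cpu"},
    offload_dir=Path("/tmp/t2n_offload"),
    offload_at_load_state_dict=False,  # offload after the dtype cast
)
```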