nemo_tract
torch_to_nnef.nemo_tract
Support for exporting NVIDIA NeMo models to NNEF (with a TractNNEF focus).
Provides utilities to export NeMo models, particularly ASR models, to the NNEF format using TractNNEF. Includes functions to handle model subnets, dynamic axes, and custom extensions required for the export process.
DecoderWithoutTargetLength
Bases: Module
Wraps the decoder or joint+decoder for export.
This removes the 'target_length' parameter, which is not needed during inference.
Enabled classes
nemo.collections.asr.modules.rnnt.RNNTDecoderJoint
nemo.collections.asr.modules.rnnt.RNNTDecoder
Alters the forward pass by automatically adding the target_length parameter, derived from the batch size of the input tensors, as a tensor of shape (batch_size, 1) filled with ones, then removes it from the output (it is the 2nd return value). This is only applied to the enabled classes listed above.
At export time this should lead to complete removal of the unused target_length input.
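The behavior described above can be sketched as a minimal `torch.nn.Module` wrapper. The class and attribute names here are illustrative, not the library's actual implementation:

```python
import torch
from torch import nn


class DecoderWithoutTargetLengthSketch(nn.Module):
    """Illustrative wrapper: inject target_length, drop it from the outputs."""

    def __init__(self, wrapped: nn.Module):
        super().__init__()
        self.wrapped = wrapped

    def forward(self, targets: torch.Tensor, *args):
        batch_size = targets.shape[0]
        # build a (batch_size, 1) tensor full of ones, as described above
        target_length = torch.ones(batch_size, 1, dtype=torch.int64)
        outputs = self.wrapped(targets, target_length, *args)
        # drop the 2nd output, which echoes the unused target_length
        return outputs[:1] + outputs[2:]
```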
WrapPreprocessorCast
Bases: Module
Wraps the preprocessor to add a cast to float32 at the output.
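A minimal sketch of such a cast wrapper, assuming the preprocessor may return either a single tensor or a tuple of tensors (illustrative, not the library's actual code):

```python
import torch
from torch import nn


class WrapPreprocessorCastSketch(nn.Module):
    """Illustrative wrapper casting the preprocessor's outputs to float32."""

    def __init__(self, preprocessor: nn.Module):
        super().__init__()
        self.preprocessor = preprocessor

    def forward(self, *args, **kwargs):
        outputs = self.preprocessor(*args, **kwargs)
        if isinstance(outputs, tuple):
            # cast only floating-point outputs (e.g. keep length tensors as int)
            return tuple(
                o.float() if o.is_floating_point() else o for o in outputs
            )
        return outputs.float()
```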
build_custom_subnet_tract_properties
Build custom tract properties for a NeMo subnet.
build_dynamic_axes
Build the dynamic axes mapping and custom extensions for a NeMo subnet.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| subnet | | NeMo subnet module. | required |
| nemo_dynamic_axes | | Dynamic axes info from the NeMo export. | required |

Returns:

- dynamic_axes: dynamic axes mapping for torch_to_nnef
- custom_extensions: custom extensions for torch_to_nnef

Note
This code will not scale well and should be refactored when more NeMo models are supported.
decoder_fix_input_example_batch_size
decoder_fix_input_example_batch_size(input_example: T.Tuple[torch.Tensor, ...], batch_size: int) -> T.List[torch.Tensor]
Fix the batch size of the input example for decoder models.
The input example produced by NeMo's decoder batch-size option is incorrect; this function adjusts the input example to have the specified batch size.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_example | Tuple[Tensor, ...] | The original input example tuple (input_ids, ...). | required |
| batch_size | int | The desired batch size. | required |
Returns:

| Type | Description |
|---|---|
| List[Tensor] | The adjusted input example with the specified batch size. |
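A plausible sketch of such an adjustment, assuming the first dimension of each tensor is the batch dimension (this is not the library's actual implementation):

```python
import typing as T

import torch


def fix_batch_size_sketch(
    input_example: T.Tuple[torch.Tensor, ...], batch_size: int
) -> T.List[torch.Tensor]:
    """Repeat or truncate each tensor along dim 0 to match `batch_size`."""
    fixed = []
    for tensor in input_example:
        current = tensor.shape[0]
        if current == batch_size:
            fixed.append(tensor)
        elif current > batch_size:
            fixed.append(tensor[:batch_size])
        else:
            reps = -(-batch_size // current)  # ceiling division
            tiled = tensor.repeat(reps, *([1] * (tensor.dim() - 1)))
            fixed.append(tiled[:batch_size])
    return fixed
```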
export_nemo_asr_model
export_nemo_asr_model(asr_model, inference_target, export_dir: Path, compress_registry: str, compress_method: T.Optional[str] = None, skip_preprocessor: bool = False, split_joint_decoder: bool = False, extra_cfg: T.Optional[T.Dict[str, T.Any]] = None, float_dtype: T.Optional[torch.dtype] = None, remove_unused_inputs: bool = True, dump_checked_io: bool = False, *, omegaconf: InjectedOmegaConfModule = INJECTED, **kwargs)
Export a generic NeMo ASR model to NNEF format using TractNNEF.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| asr_model | | The NeMo ASR model to export. | required |
| inference_target | | The inference target configuration for the export. | required |
| export_dir | Path | Directory where the exported NNEF files will be saved. | required |
| skip_preprocessor | bool | If True, skip exporting the preprocessor subnet. | False |
| split_joint_decoder | bool | Whether to split the joint and decoder subnets during export. | False |
| compress_registry | str | Compression registry for the exported NNEF subnets. | required |
| compress_method | Optional[str] | Compression method for the exported NNEF subnets. If None, no compression is applied. | None |
| extra_cfg | Optional[Dict[str, Any]] | Additional configuration to save alongside the model. | None |
| float_dtype | Optional[dtype] | Optional float dtype to use for the export. | None |
| remove_unused_inputs | bool | Whether to remove unused inputs from the exported model. This happens for decoder subnets that do not use target_length. | True |
| dump_checked_io | bool | Whether to dump checked input/output examples. | False |
| omegaconf | InjectedOmegaConfModule | Injected OmegaConf module. | INJECTED |
| kwargs | | Additional keyword arguments passed to the export function. | {} |
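A hedged usage sketch of this function; the registry name and the provenance of `asr_model` and `inference_target` are illustrative assumptions, not verified values:

```python
from pathlib import Path

from torch_to_nnef.nemo_tract import export_nemo_asr_model

# `asr_model` and `inference_target` are assumed to be prepared already
# (see load_asr_model_from_nemo_slug and setup_inference_target_from_cli_args below)
export_nemo_asr_model(
    asr_model,
    inference_target,
    export_dir=Path("export"),
    compress_registry="default",  # assumed registry name
    skip_preprocessor=False,
    split_joint_decoder=False,
)
```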
exportable_nemo_net
exportable_nemo_net(output_name, model, input_example, use_dynamo=False, batch_size: int = 1, float_dtype: T.Optional[torch.dtype] = None, *, nemo: InjectedNemoModule = INJECTED, pytorch_lightning: InjectedLightningModule = INJECTED)
Context manager that follows the export path of NeMo models.
see: nemo.core.classes.Exportable._export
iter_export_params_for_generic_nemo_asr_model
iter_export_params_for_generic_nemo_asr_model(asr_model, inference_target, skip_preprocessor: bool = False, split_joint_decoder: bool = False, remove_unused_inputs: bool = True, float_dtype: T.Optional[torch.dtype] = None) -> T.Iterator[ExportParameters]
Iterator over export parameters for a generic NeMo ASR model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| asr_model | | The NeMo ASR model to export. | required |
| inference_target | | The target inference type. | required |
| skip_preprocessor | bool | Whether to skip exporting the preprocessor subnet. | False |
| split_joint_decoder | bool | Whether to split the joint and decoder subnets during export. | False |
| remove_unused_inputs | bool | Whether to remove unused inputs from the exported model. | True |
| float_dtype | Optional[dtype] | Optional float dtype to use for the export. | None |

Yields:

| Type | Description |
|---|---|
| ExportParameters | ExportParameters for each subnet of the ASR model, with the preprocessor included unless skip_preprocessor is set. |
iter_nemo_model_subnets
iter_nemo_model_subnets(model, input_example=None, float_dtype: T.Optional[torch.dtype] = None, split_joint_decoder: bool = False, remove_unused_inputs: bool = True, apply_sequential_examples: bool = False, batch_size: int = 3)
Iterator over the exportable subnets of a NeMo model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | | NeMo model to iterate over. | required |
| input_example | | Optional input example to use for export. | None |
| float_dtype | Optional[dtype] | Optional float dtype to use for the export. | None |
| split_joint_decoder | bool | Whether to split joint and decoder subnets (if encountered). | False |
| remove_unused_inputs | bool | Whether to remove unused inputs from subnet exports. | True |
| apply_sequential_examples | bool | If True, use sequential input examples for each subnet. | False |
| batch_size | int | Batch size to use for dummy input generation. | 3 |

Yields:

| Name | Type | Description |
|---|---|---|
| subnet_name | | Name of the subnet. |
| subnet | | The subnet module. |
| input_example | | Input example for the subnet. |
| dynamic_axes | | Dynamic axes info for the subnet. |
see: nemo.core.classes.Exportable.export
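A hedged usage sketch iterating over a model's subnets; it assumes the yields listed above come back as a 4-tuple in that order, which is not verified here:

```python
from torch_to_nnef.nemo_tract import iter_nemo_model_subnets

# `model` is assumed to be an already-loaded NeMo model
for subnet_name, subnet, input_example, dynamic_axes in iter_nemo_model_subnets(
    model,
    batch_size=1,
):
    print(subnet_name, type(subnet).__name__)
```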
load_asr_model_from_nemo_slug
load_asr_model_from_nemo_slug(model_slug: str, *, nemo_asr: InjectedNemoModule = INJECTED, huggingface_hub: InjectedHuggingFaceHubModule = INJECTED)
Load a NeMo ASR model from a given model slug.
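For example (the slug below is illustrative; `nemo_asr_hg_list` documented next can be used to discover available models):

```python
from torch_to_nnef.nemo_tract import load_asr_model_from_nemo_slug

# illustrative slug, not verified against the current model catalog
asr_model = load_asr_model_from_nemo_slug("nvidia/stt_en_conformer_ctc_small")
```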
nemo_asr_hg_list
Return the list of available NeMo ASR models from the Hugging Face Hub.
setup_inference_target_from_cli_args
Set up the TractNNEF inference target from CLI arguments.