loss

torch_to_nnef.op.aten.loss

ATen loss-family op emitters (mse_loss, nll_loss, cross_entropy_loss, ...).

Each loss is decomposed via a pointwise NNEF fragment (where pointwise makes sense -- mse, bce-with-logits, kl_div) plus a full-tensor mean_reduce / sum_reduce + squeeze chain governed by torch's reduction enum (0 = none, 1 = mean, 2 = sum).
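The reduction chain can be sketched in plain PyTorch. This is a minimal illustration of the enum convention, not the emitter's actual code (the real lowering emits NNEF `mean_reduce` / `sum_reduce` + `squeeze` ops):

```python
import torch

def apply_reduction(pointwise: torch.Tensor, reduction: int) -> torch.Tensor:
    """Mimic torch's reduction enum: 0 = none, 1 = mean, 2 = sum."""
    if reduction == 0:
        return pointwise
    if reduction == 1:
        return pointwise.mean()  # stands in for mean_reduce + squeeze
    if reduction == 2:
        return pointwise.sum()   # stands in for sum_reduce + squeeze
    raise ValueError(f"unknown reduction enum {reduction}")

x = torch.tensor([1.0, 2.0, 4.0])
y = torch.zeros(3)
pointwise = (x - y) ** 2  # e.g. the mse pointwise fragment
none_out = apply_reduction(pointwise, 0)
mean_out = apply_reduction(pointwise, 1)
sum_out = apply_reduction(pointwise, 2)
```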

binary_cross_entropy_with_logits

binary_cross_entropy_with_logits(node, op_helper, **kwargs)

Map aten::binary_cross_entropy_with_logits to NNEF.

Signature: (input, target, weight, pos_weight, reduction). Pointwise BCE via the numerically-stable softplus formulation lives in the binary_cross_entropy_with_logits fragment; weight / pos_weight modulators are not currently supported.
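One numerically stable rearrangement via softplus (an algebraically equivalent sketch, not necessarily the exact fragment body) is `(1 - t) * x + softplus(-x)`:

```python
import torch
import torch.nn.functional as F

def bce_with_logits_pointwise(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # -t*log(sigmoid(x)) - (1-t)*log(1-sigmoid(x))  ==  (1-t)*x + softplus(-x)
    # softplus avoids computing log(sigmoid(x)) on extreme logits directly
    return (1.0 - target) * logits + F.softplus(-logits)

x = torch.tensor([-3.0, 0.0, 2.5])
t = torch.tensor([0.0, 1.0, 1.0])
ours = bce_with_logits_pointwise(x, t).mean()
ref = F.binary_cross_entropy_with_logits(x, t)  # defaults: no weight/pos_weight, mean
```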

cross_entropy_loss

cross_entropy_loss(node, op_helper, **kwargs)

Map aten::cross_entropy_loss to NNEF.

Lowers to nll_loss(log_softmax(input, dim=1), target, ...). weight / ignore_index / label_smoothing are not currently supported (raise on non-default values).
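The lowering can be checked directly against PyTorch's own functional API:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 5)            # (batch, classes)
target = torch.tensor([0, 2, 4, 1])

# the decomposition described above: nll_loss over log_softmax on the class axis
decomposed = F.nll_loss(F.log_softmax(logits, dim=1), target)
reference = F.cross_entropy(logits, target)
```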

huber_loss

huber_loss(node, op_helper, **kwargs)

Map PyTorch aten::huber_loss(input, target, reduction, delta).

Pointwise piecewise: quadratic when |input - target| < delta, linear otherwise. The reduction is applied by the emitter.
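The piecewise formula can be spelled out with `torch.where` (an illustrative sketch of the pointwise part, checked against `F.huber_loss`):

```python
import torch
import torch.nn.functional as F

def huber_pointwise(inp: torch.Tensor, tgt: torch.Tensor, delta: float) -> torch.Tensor:
    d = inp - tgt
    quad = 0.5 * d ** 2                       # |d| < delta
    lin = delta * (d.abs() - 0.5 * delta)     # otherwise
    return torch.where(d.abs() < delta, quad, lin)

x = torch.tensor([0.2, 1.5, -3.0])
y = torch.zeros(3)
ours = huber_pointwise(x, y, delta=1.0).mean()
ref = F.huber_loss(x, y, delta=1.0)  # reduction defaults to 'mean'
```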

kl_div

kl_div(node, op_helper, **kwargs)

Map aten::kl_div(input, target, reduction, log_target) to NNEF.

Two pointwise fragments, picked by log_target:

- kl_div (default): target * (log(target) - input)
- kl_div_log_target: exp(target) * (target - input)

input is assumed to be log-probabilities (caller normally feeds log_softmax(...)). Torch's reduction='batchmean' is lowered to sum plus an external division upstream of the aten op, so the aten reduction enum here is only 0 / 1 / 2.
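Both pointwise formulas can be verified against `F.kl_div` (illustrative sketch; `reduction="none"` isolates the pointwise part):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logp = torch.log_softmax(torch.randn(6), dim=0)   # input: log-probabilities
probs = torch.softmax(torch.randn(6), dim=0)

# kl_div fragment (log_target=False): target * (log(target) - input)
frag = probs * (probs.log() - logp)
ref = F.kl_div(logp, probs, reduction="none", log_target=False)

# kl_div_log_target fragment: exp(target) * (target - input)
logt = probs.log()                                # target given as log-probabilities
frag_log = logt.exp() * (logt - logp)
ref_log = F.kl_div(logp, logt, reduction="none", log_target=True)
```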

mse_loss

mse_loss(node, op_helper, **kwargs)

Map PyTorch aten::mse_loss(input, target, reduction) to NNEF.

Pointwise (input - target) ** 2 is delegated to the mse_loss fragment, then reduced if reduction != none. Torch broadcasts input / target upstream of the aten op (we see a separate aten::broadcast_tensors in the trace), so the fragment can assume matching shapes.
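The broadcast-then-square behavior can be reproduced with `torch.broadcast_tensors` (a sketch of what the trace looks like upstream of the aten op; note `F.mse_loss` warns on mismatched shapes but still broadcasts):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(3, 1)
y = torch.randn(1, 4)

# torch broadcasts before the aten op, so the fragment sees matching shapes
xb, yb = torch.broadcast_tensors(x, y)
pointwise = (xb - yb) ** 2          # the mse_loss fragment body
reduced = pointwise.mean()          # reduction enum 1
```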

nll_loss

nll_loss(node, op_helper, **kwargs)

Map PyTorch's nll_loss family to NNEF.

Signature (all three variants): nll_loss(input, target, weight, reduction, ignore_index).

The per-sample loss is -input[n, target[n], ...] along the class axis (=1). Class-weighting and ignore-index masking are common training-side knobs; we raise T2NErrorNotImplemented for both until a real need shows up.
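The per-sample gather can be written with `Tensor.gather` along the class axis (illustrative sketch, checked against `F.nll_loss` with `reduction="none"`):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logp = torch.log_softmax(torch.randn(4, 5), dim=1)  # (batch, classes)
target = torch.tensor([3, 0, 1, 4])

# per-sample loss: -logp[n, target[n]] along the class axis (dim=1)
per_sample = -logp.gather(1, target.unsqueeze(1)).squeeze(1)
ref = F.nll_loss(logp, target, reduction="none")
```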

smooth_l1_loss

smooth_l1_loss(node, op_helper, **kwargs)

Map aten::smooth_l1_loss(input, target, reduction, beta).

Same piecewise shape as huber_loss with a different scaling: the quadratic branch is 0.5 * diff^2 / beta and the linear branch is |diff| - 0.5 * beta (vs huber's delta * (|diff| - 0.5 * delta)).
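The scaling difference can be checked numerically; when beta == delta the two losses differ only by the 1/delta factor, so at beta = delta = 1 they coincide (illustrative sketch):

```python
import torch
import torch.nn.functional as F

def smooth_l1_pointwise(inp: torch.Tensor, tgt: torch.Tensor, beta: float) -> torch.Tensor:
    d = (inp - tgt).abs()
    # quadratic branch: 0.5 * d^2 / beta; linear branch: d - 0.5 * beta
    return torch.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)

x = torch.tensor([0.3, 2.0, -5.0])
y = torch.zeros(3)
ours = smooth_l1_pointwise(x, y, beta=1.0).mean()
ref = F.smooth_l1_loss(x, y, beta=1.0)
huber_ref = F.huber_loss(x, y, delta=1.0)  # equal here since beta = delta = 1
```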