matmul
torch_to_nnef.op.aten.matmul
addbmm
aten::addbmm -> beta*self + alpha*sum_b(bmm(b1, b2)).
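The formula above can be checked with a small NumPy sketch. NumPy is used here only as a stand-in for the torch op; the helper name addbmm_reference and the default values beta=1, alpha=1 are assumptions for illustration, not part of torch_to_nnef.

```python
import numpy as np

def addbmm_reference(self_, b1, b2, beta=1.0, alpha=1.0):
    """beta*self + alpha*sum_b(bmm(b1, b2)), written with NumPy."""
    # b1: (B, M, K), b2: (B, K, N) -> batched product of shape (B, M, N).
    batched = np.matmul(b1, b2)
    # Reduce over the batch axis, then apply the two scalars.
    return beta * self_ + alpha * batched.sum(axis=0)

rng = np.random.default_rng(0)
self_ = rng.standard_normal((3, 5))
b1 = rng.standard_normal((4, 3, 2))
b2 = rng.standard_normal((4, 2, 5))
out = addbmm_reference(self_, b1, b2, beta=0.5, alpha=2.0)
```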
addmv
aten::addmv -> beta*self + alpha*(mat @ vec).
addr
aten::addr -> beta*self + alpha*(vec1 outer vec2).
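The addmv and addr formulas differ only in the product term. A minimal NumPy sketch of both follows; the helper names are hypothetical and NumPy stands in for the torch ops.

```python
import numpy as np

def addmv_reference(self_, mat, vec, beta=1.0, alpha=1.0):
    # mat: (M, K), vec: (K,) -> matrix-vector product of shape (M,).
    return beta * self_ + alpha * (mat @ vec)

def addr_reference(self_, vec1, vec2, beta=1.0, alpha=1.0):
    # vec1: (M,), vec2: (N,) -> outer product of shape (M, N).
    return beta * self_ + alpha * np.outer(vec1, vec2)

mat = np.array([[1.0, 2.0], [3.0, 4.0]])
mv_out = addmv_reference(np.zeros(2), mat, np.array([1.0, 1.0]))
r_out = addr_reference(
    np.ones((2, 2)), np.array([1.0, 2.0]), np.array([3.0, 4.0]), beta=2.0
)
```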
baddbmm
Map PyTorch: 'aten:baddbmm', 'aten:addmm', 'aten:bias_addmm'.
bias_addmm is a dispatcher-fused addmm variant (the self
operand is a broadcasted bias); semantics are identical so we
route it through the same addmm fragment.
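To illustrate why a broadcasted bias needs no special handling, the sketch below assumes the usual addmm semantics, beta*self + alpha*(mat1 @ mat2), and shows that a rank-1 bias gives the same result as the explicitly expanded one. The helper name is hypothetical and NumPy stands in for the torch op.

```python
import numpy as np

def addmm_reference(self_, mat1, mat2, beta=1.0, alpha=1.0):
    # self_ only needs to be broadcastable to the (M, N) product.
    return beta * self_ + alpha * (mat1 @ mat2)

bias = np.array([10.0, 20.0, 30.0])              # (N,)
mat1 = np.arange(6, dtype=float).reshape(2, 3)   # (M, K)
mat2 = np.eye(3)                                 # (K, N)

fused = addmm_reference(bias, mat1, mat2)
explicit = addmm_reference(np.broadcast_to(bias, (2, 3)), mat1, mat2)
```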
bilinear
Map aten::bilinear(input1, input2, weight, bias) to NNEF.
bilinear(x1, x2, W, b)[..., k] = sum_{i, j} x1[..., i] * W[k, i, j]
* x2[..., j] + b[k].
Lowered via tract_core_einsum with the three-operand expression
bi, kij, bj -> bk. tract's einsum doesn't accept ellipsis, so
only rank-2 x1 / x2 are supported here.
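The three-operand expression can be compared against the index formula directly. This is a NumPy sketch under the rank-2 restriction stated above; np.einsum stands in for tract_core_einsum and the function names are illustrative.

```python
import numpy as np

def bilinear_einsum(x1, x2, weight, bias):
    # x1: (B, I), weight: (K, I, J), x2: (B, J) -> (B, K)
    return np.einsum("bi,kij,bj->bk", x1, weight, x2) + bias

def bilinear_loops(x1, x2, weight, bias):
    # Literal transcription of sum_{i, j} x1[b, i] * W[k, i, j] * x2[b, j] + b[k].
    batch, out_features = x1.shape[0], weight.shape[0]
    out = np.zeros((batch, out_features))
    for b in range(batch):
        for k in range(out_features):
            out[b, k] = x1[b] @ weight[k] @ x2[b] + bias[k]
    return out

rng = np.random.default_rng(0)
x1 = rng.standard_normal((2, 3))
x2 = rng.standard_normal((2, 4))
weight = rng.standard_normal((5, 3, 4))
bias = rng.standard_normal(5)
fast = bilinear_einsum(x1, x2, weight, bias)
slow = bilinear_loops(x1, x2, weight, bias)
```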
block_diag
Map aten::block_diag(tensors) to NNEF (rank-2 blocks only).
Builds an (M, N) matrix where M = sum m_i, N = sum n_i and
each block i (shape (m_i, n_i)) sits at row offset
sum_{j<i} m_j and col offset sum_{j<i} n_j; off-diagonal
blocks are zero.
Each block is pad-extended to full width (m_i, N) then all
are concat-stacked along axis 0. Static shapes only.
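The pad-then-concat construction can be sketched in NumPy as follows. The function name is hypothetical, and np.pad / np.concatenate stand in for the NNEF pad and concat fragments.

```python
import numpy as np

def block_diag_pad_concat(blocks):
    total_cols = sum(block.shape[1] for block in blocks)  # N = sum n_i
    padded_rows = []
    col_offset = 0
    for block in blocks:
        n_i = block.shape[1]
        # Zero-pad on the left by the column offset and on the right up to N,
        # so every padded block has shape (m_i, N).
        right = total_cols - col_offset - n_i
        padded_rows.append(np.pad(block, ((0, 0), (col_offset, right))))
        col_offset += n_i
    # Stacking along axis 0 yields the (M, N) result.
    return np.concatenate(padded_rows, axis=0)

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6, 7]])
out = block_diag_pad_concat([a, b])
```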
cartesian_prod
Map aten::cartesian_prod(tensors) to NNEF.
Inputs are 1-D tensors of sizes n_0..n_{K-1}; the result is a
(prod_k n_k, K) matrix whose rows enumerate every tuple in
lexicographic order, with the first input varying slowest.
Each input column k is built by unsqueeze + tile over all
other dims, then reshape to (prod, 1). The K columns are
concatenated along axis 1. Static sizes only.
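A NumPy sketch of the column construction follows, checked against itertools.product. The function name is hypothetical; reshape and np.tile stand in for the unsqueeze and tile steps.

```python
import itertools
import numpy as np

def cartesian_prod_columns(vectors):
    sizes = [len(v) for v in vectors]
    total = int(np.prod(sizes))
    columns = []
    for k, vec in enumerate(vectors):
        # Unsqueeze: size n_k on axis k, singleton on every other axis.
        shape = [1] * len(sizes)
        shape[k] = sizes[k]
        expanded = vec.reshape(shape)
        # Tile over all other dims so the grid has shape (n_0, ..., n_{K-1}).
        reps = [1 if axis == k else n for axis, n in enumerate(sizes)]
        tiled = np.tile(expanded, reps)
        columns.append(tiled.reshape(total, 1))
    # K columns side by side -> (prod_k n_k, K).
    return np.concatenate(columns, axis=1)

inputs = [np.array([1, 2]), np.array([10, 20, 30])]
out = cartesian_prod_columns(inputs)
```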
chain_matmul
Map aten::chain_matmul(matrices) to a chain of matmul ops.
matrices is a FixedTensorList of >=2 2-D tensors. The chain
is reduced left-to-right. torch deprecates chain_matmul in favor of
linalg.multi_dot, which picks a parenthesization by cost; for
inference graphs the per-matrix shapes are fixed and
constant-folding handles the planning, so the naive left-fold is
enough.
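A left-fold over a list of matrices can be sketched with functools.reduce. The function name is hypothetical; np.linalg.multi_dot is used only to confirm that the parenthesization does not change the result.

```python
from functools import reduce
import numpy as np

def chain_matmul_left_fold(matrices):
    if len(matrices) < 2:
        raise ValueError("expected at least two matrices")
    # ((m0 @ m1) @ m2) @ ... : one matmul per additional operand.
    return reduce(np.matmul, matrices)

rng = np.random.default_rng(0)
mats = [
    rng.standard_normal((2, 3)),
    rng.standard_normal((3, 4)),
    rng.standard_normal((4, 5)),
]
folded = chain_matmul_left_fold(mats)
```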
conv_tbc
Map PyTorch: 'aten:conv_tbc' to NNEF.
conv_tbc(input, weight, bias, pad) is a 1-D convolution over a
(T, B, C) input (time-batch-channel layout) with weight
(kernel, C_in, C_out). It is semantically equivalent to
conv1d(input.permute(1, 2, 0), weight.permute(2, 1, 0), bias,
padding=pad).permute(2, 0, 1), which is exactly the
permute -> conv -> permute chain we emit.
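The permute chain can be exercised with a small NumPy model. Everything here is a sketch: conv1d is a minimal stride-1 stand-in for torch's conv1d, and conv_tbc_direct encodes an assumed index-level definition of conv_tbc that is used only as a cross-check.

```python
import numpy as np

def conv1d(x, weight, bias, padding):
    """Minimal stride-1 conv1d: x (B, C_in, T), weight (C_out, C_in, K)."""
    x = np.pad(x, ((0, 0), (0, 0), (padding, padding)))
    kernel = weight.shape[2]
    t_out = x.shape[2] - kernel + 1
    out = np.empty((x.shape[0], weight.shape[0], t_out))
    for t in range(t_out):
        window = x[:, :, t:t + kernel]  # (B, C_in, K)
        out[:, :, t] = np.einsum("bck,ock->bo", window, weight)
    return out + bias[None, :, None]

def conv_tbc_via_permutes(x, weight, bias, pad):
    # (T, B, C) -> (B, C, T) and (K, C_in, C_out) -> (C_out, C_in, K).
    y = conv1d(x.transpose(1, 2, 0), weight.transpose(2, 1, 0), bias, pad)
    # (B, C_out, T') -> (T', B, C_out).
    return y.transpose(2, 0, 1)

def conv_tbc_direct(x, weight, bias, pad):
    """Assumed direct definition in (T, B, C) layout, for comparison only."""
    x = np.pad(x, ((pad, pad), (0, 0), (0, 0)))
    kernel, _, c_out = weight.shape
    t_out = x.shape[0] - kernel + 1
    out = np.zeros((t_out, x.shape[1], c_out))
    for t in range(t_out):
        for k in range(kernel):
            out[t] += x[t + k] @ weight[k]  # (B, C_in) @ (C_in, C_out)
    return out + bias

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 2, 3))   # (T, B, C_in)
w = rng.standard_normal((3, 3, 4))   # (kernel, C_in, C_out)
b = rng.standard_normal(4)
via_permutes = conv_tbc_via_permutes(x, w, b, pad=1)
direct = conv_tbc_direct(x, w, b, pad=1)
```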
conv_transpose_nd
Map PyTorch: 'aten:conv_transpose{1,2,3}d' to NNEF.
Marked CompositeImplicitAutograd upstream, so PyTorch usually
decomposes these to aten::_convolution(transposed=True) before
the trace reaches t2n. Registering them anyway keeps the support
page accurate and gives a working path if PyTorch ever stops
decomposing for some platform.
Signature: (input, weight, bias?, stride, padding, output_padding,
groups, dilation), i.e. 8 positional args.
dot
Map PyTorch: 'aten:dot' to NNEF.
torch.dot(a, b) is the 1-D x 1-D inner product, returning a
scalar. NNEF's matmul requires rank >= 2, so we unsqueeze the
inputs to (1, K) and (K, 1), matmul, then squeeze the (1, 1) back
to a scalar.
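The unsqueeze, matmul, squeeze sequence looks like this in NumPy. The function name is illustrative, and reshape stands in for the NNEF unsqueeze and squeeze fragments.

```python
import numpy as np

def dot_via_matmul(a, b):
    row = a.reshape(1, -1)       # (K,) -> (1, K)
    col = b.reshape(-1, 1)       # (K,) -> (K, 1)
    product = row @ col          # (1, 1)
    return product.reshape(())   # drop both singleton axes -> 0-D

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
result = dot_via_matmul(a, b)
```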
einsum
Map PyTorch: 'aten:einsum' to NNEF.
inner
Map aten::inner(input, other) to NNEF.
inner(a, b)[..., i, ..., j, ...] = sum(a[..., i, :] * b[..., j, :]),
i.e. a matmul of a against b.transpose(-1, -2). Works for any
rank: the trailing axis is reduced and the leading dims of a /
b stack as independent index dims.
For 1-D inputs torch returns a 0-D scalar: aten::inner with 1-D
inputs is the dot-product overload. NNEF doesn't have a 0-D tensor
type, so for the 1-D case we let the standard matmul emit a (1, 1)
result and rely on torch's trace having already squeezed it
upstream.
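For the rank-2 case the transpose-then-matmul reading can be checked against np.inner, which reduces the trailing axis in the same way. This sketch only covers rank-2 operands and is not the exporter's code.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 4))   # (M, K)
b = rng.standard_normal((5, 4))   # (N, K)

# Swap the last two axes of b so the shared K axis is contracted by matmul.
via_matmul = a @ np.swapaxes(b, -1, -2)   # (M, K) @ (K, N) -> (M, N)
reference = np.inner(a, b)
```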
kron
Map aten::kron(a, b) to NNEF (rank-2 inputs only).
Kronecker product:
kron(a, b)[i*p+k, j*q+l] = a[i, j] * b[k, l] for a (m, n)
and b (p, q), producing (m*p, n*q).
Lowered via interleaved unsqueeze + broadcast-mul + reshape:
a_e = a.unsqueeze(1).unsqueeze(3) -> (m, 1, n, 1)
b_e = b.unsqueeze(0).unsqueeze(2) -> (1, p, 1, q)
prod = a_e * b_e -> (m, p, n, q)
out = prod.reshape(m*p, n*q)
Higher-rank inputs would need the same interleaved pattern at every
dim, plus shape resolution per axis, so we raise for now.
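The four lowering steps translate directly to NumPy indexing. The function name is hypothetical, and np.kron serves only as a reference for the check.

```python
import numpy as np

def kron_rank2(a, b):
    if a.ndim != 2 or b.ndim != 2:
        raise ValueError("only rank-2 inputs are handled in this sketch")
    m, n = a.shape
    p, q = b.shape
    a_e = a[:, None, :, None]   # (m, 1, n, 1)
    b_e = b[None, :, None, :]   # (1, p, 1, q)
    prod = a_e * b_e            # broadcast multiply -> (m, p, n, q)
    return prod.reshape(m * p, n * q)

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1, 2]])
out = kron_rank2(a, b)
```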
linear
Map PyTorch: 'aten:linear' to NNEF.
matmul
Map PyTorch: 'aten:matmul', 'aten:bmm', 'aten:mm' to NNEF.
NNEF matmul requires equal rank on both operands; PyTorch's
aten::matmul accepts rank-1 forms with these documented semantics:
(K,) @ (..., K, N) -> (..., N): promote A to (..., 1, K), matmul gives (..., 1, N), squeeze the row-1 axis.
(..., M, K) @ (K,) -> (..., M): promote B to (..., K, 1), matmul gives (..., M, 1), squeeze the col-1 axis.
(K,) @ (K,) -> () (scalar): promote both, matmul gives (1, 1), squeeze both axes.
Both inputs are promoted to a common target_rank = max(a_rank,
b_rank, 2): batch dims get prepended 1s; for rank-1 inputs the
vector lands as the row (A) or column (B) dim with a singleton on
the opposite side. The post-matmul squeeze drops those singletons
to match _infer_trace_result_matmul's rank prediction so
downstream unsqueeze(-1) / shape ops resolve their axes against
the same rank the IR is tracking.
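The promote, matmul, squeeze scheme can be sketched in NumPy as below. The function name is hypothetical and the code only models the rank handling described above; it is not the exporter's implementation.

```python
import numpy as np

def matmul_with_promotion(a, b):
    target_rank = max(a.ndim, b.ndim, 2)
    squeeze_axes = []
    if a.ndim == 1:
        a = a[None, :]            # vector lands as the row dim: (1, K)
        squeeze_axes.append(-2)
    if b.ndim == 1:
        b = b[:, None]            # vector lands as the column dim: (K, 1)
        squeeze_axes.append(-1)
    # Prepend singleton batch dims until both operands reach target_rank.
    a = a.reshape((1,) * (target_rank - a.ndim) + a.shape)
    b = b.reshape((1,) * (target_rank - b.ndim) + b.shape)
    out = np.matmul(a, b)
    # Drop the singletons introduced for rank-1 inputs.
    if squeeze_axes:
        out = np.squeeze(out, axis=tuple(squeeze_axes))
    return out

rng = np.random.default_rng(0)
vec = rng.standard_normal(3)
batch_kn = rng.standard_normal((2, 3, 4))   # (..., K, N)
batch_mk = rng.standard_normal((2, 5, 3))   # (..., M, K)

vec_left = matmul_with_promotion(vec, batch_kn)    # (2, 4)
vec_right = matmul_with_promotion(batch_mk, vec)   # (2, 5)
vec_both = matmul_with_promotion(vec, vec)         # ()
```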
mv
Map PyTorch: 'aten:mv' to NNEF.
torch.mv(mat, vec) is a matrix-vector product with mat rank-2,
shape (M, K), and vec rank-1, shape (K,), returning a rank-1 result
of shape (M,). NNEF matmul needs rank-2 on both sides, so we
unsqueeze vec to (K, 1), matmul to (M, 1), then squeeze back to (M,).
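The same three steps in NumPy, as a minimal sketch with an illustrative function name:

```python
import numpy as np

def mv_via_matmul(mat, vec):
    col = vec[:, None]    # (K,) -> (K, 1)
    out = mat @ col       # (M, K) @ (K, 1) -> (M, 1)
    return out[:, 0]      # drop the trailing singleton -> (M,)

mat = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
vec = np.array([1.0, -1.0])
result = mv_via_matmul(mat, vec)
```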