linmult.core.ffn

FFN residual block: two linear layers with GELU activation and residual connection.

Classes

FFNResidual

Two-layer FFN with GELU activation, dropout, and residual connection.

Module Contents

class linmult.core.ffn.FFNResidual(dim: int, dropout: float = 0.0)

Bases: torch.nn.Module

Two-layer FFN with GELU activation, dropout, and residual connection.

Computes x + fc2(dropout(gelu(fc1(x)))).

Parameters:
  • dim (int) – Input and output feature dimension.

  • dropout (float) – Dropout probability applied after the first linear layer. Defaults to 0.0.


forward(x: torch.Tensor) → torch.Tensor

Apply FFN + residual.

Parameters:

x (torch.Tensor) – Input tensor of arbitrary shape whose last dimension equals dim.

Returns:

Same shape as x.

Return type:

torch.Tensor
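The documented computation, x + fc2(dropout(gelu(fc1(x)))), can be sketched as a minimal nn.Module. Note this is an illustrative reimplementation, not the library source; in particular the hidden width (4 * dim below) is an assumption, since the documentation does not state the expansion factor.

```python
import torch
from torch import nn


class FFNResidual(nn.Module):
    """Sketch of the documented block: x + fc2(dropout(gelu(fc1(x)))).

    The 4 * dim hidden width is an assumption; the docs do not specify
    the expansion factor used by the actual implementation.
    """

    def __init__(self, dim: int, dropout: float = 0.0) -> None:
        super().__init__()
        hidden = 4 * dim  # assumed expansion factor
        self.fc1 = nn.Linear(dim, hidden)
        self.fc2 = nn.Linear(hidden, dim)
        self.gelu = nn.GELU()
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual add requires fc2 to map back to the input dimension,
        # so the output shape always matches the input shape.
        return x + self.fc2(self.dropout(self.gelu(self.fc1(x))))


# Usage: works on any shape whose last dimension equals `dim`.
block = FFNResidual(dim=64, dropout=0.1)
x = torch.randn(2, 10, 64)  # (batch, seq, dim)
y = block(x)
assert y.shape == x.shape  # residual connection preserves shape
```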