linmult.core.config
Typed configuration dataclasses for LinT and LinMulT.
Classes

- HeadConfig – Configuration for one output head.
- LinTConfig – Configuration for LinT (unimodal linear-complexity transformer).
- LinMulTConfig – Configuration for LinMulT (multimodal linear-complexity transformer).

Module Contents
- class linmult.core.config.HeadConfig[source]

  Configuration for one output head.

  - Parameters:
    - type (str) – Head type. One of "sequence_aggregation", "sequence", "vector", "simple", "upsample", "downsample".
    - output_dim (int) – Output feature dimensionality.
    - name (str) – Head name used as the key in the output dict. Defaults to "" (resolved to the head class name at construction time).
    - norm (str) – Normalisation type for heads that use it. One of "bn", "in". Defaults to "bn".
    - pooling (str | None) – Pooling strategy. One of "gap", "gmp", "attentionpool", or None (no pooling, e.g. for SimpleHead without temporal reduction). Defaults to None (preserve sequence).
    - hidden_dim (int) – Hidden projection size. Defaults to 256.
    - dropout (float) – Dropout probability used inside the head. Defaults to 0.1.
    - input_time_dim (int | None) – Source time dimension for UpsampleHead / DownsampleHead. Defaults to None.
    - output_time_dim (int | None) – Target time dimension for UpsampleHead / DownsampleHead. Defaults to None.
  - classmethod from_dict(d: dict[str, Any]) → HeadConfig[source]

    Construct from a plain dict, ignoring unknown keys.

    - Parameters:
      - d (dict) – Dictionary of head configuration values.
    - Returns:
      A new HeadConfig instance.
    - Return type:
      HeadConfig
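A minimal usage sketch (the key names below are the documented fields; all concrete values are illustrative, not prescribed by the library):

```python
from linmult.core.config import HeadConfig

# Plain dict, e.g. one entry parsed from a YAML "heads" list.
head = HeadConfig.from_dict({
    "type": "sequence",      # one of the documented head types
    "output_dim": 5,         # output feature dimensionality
    "name": "frame_logits",  # key used in the model's output dict
    "pooling": "gap",        # one of "gap", "gmp", "attentionpool", or None
    "made_up_key": 123,      # unknown keys are ignored, as documented
})
```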
- class linmult.core.config.LinTConfig[source]

  Configuration for LinT (unimodal linear-complexity transformer).

  Required

  - Parameters:
    - input_feature_dim (int) – Input feature dimensionality.

  Identity

  - Parameters:
    - name (str) – Model name shown in repr. Defaults to "".

  Core

  - Parameters:
    - d_model (int) – Internal embedding dimension. Defaults to 40.
    - num_heads (int) – Number of attention heads. Defaults to 8.
    - cmt_num_layers (int) – Self-attention encoder depth. Defaults to 6.
  Attention

  - Parameters:
    - attention_type (str) – Attention mechanism. One of "linear" (default), "performer", "flash", "softmax", "bigbird", "mha".
    - flash_query_key_dim (int | None) – Scoring dimension for "flash" (GAU). Defaults to None (computed as max(d_model // 2, 16); see the sketch after this list).
    - performer_num_random_features (int | None) – Random feature count for "performer". Defaults to None (computed as max(head_dim * 4, 32)).
    - bigbird_block_size (int) – Local block size for "bigbird". Defaults to 64.
    - bigbird_num_global_tokens (int) – Global tokens for "bigbird". Defaults to 16.
    - bigbird_num_random_tokens (int) – Random tokens for "bigbird". Defaults to 10.
  Dropout

  - Parameters:
    - dropout_input (float) – Dropout on input before projection. Defaults to 0.0.
    - dropout_output (float) – FFN-fusion output dropout. Defaults to 0.0.
    - dropout_pe (float) – Dropout after positional encoding. Defaults to 0.0.
    - dropout_ffn (float) – Dropout in transformer FFN. Defaults to 0.1.
    - dropout_attention (float) – Attention-weight dropout. Defaults to 0.0.

  TRM

  - Parameters:
    - time_dim_reducer (str | None) – Collapse (B, T, F) → (B, F) before heads. One of "attentionpool", "gap", "gmp", "last", or None (no reduction). Defaults to None.

  Optional modules

  - Parameters:
    - add_module_ffn_fusion (bool) – FFN + residual block after the encoder. Defaults to False.

  Heads

  - Parameters:
    - heads (list[HeadConfig | dict]) – Output head configurations. Plain dicts are automatically coerced to HeadConfig. Defaults to [].

  Special handling

  - Parameters:
    - special_handling (dict[str, Any]) – Modality-specific input handling (e.g. weighted-sum of transformer layers). Defaults to {}.
  - __post_init__() → None[source]

    Coerce head dicts to HeadConfig instances.
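A minimal sketch of the coercion, assuming LinTConfig can be constructed directly with keyword arguments as an ordinary dataclass (field values are illustrative):

```python
from linmult.core.config import HeadConfig, LinTConfig

cfg = LinTConfig(
    input_feature_dim=300,                        # illustrative value
    heads=[{"type": "vector", "output_dim": 2}],  # plain dict entry
)

# __post_init__ has coerced the dict into a HeadConfig instance.
assert isinstance(cfg.heads[0], HeadConfig)
```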
  - classmethod from_dict(d: dict[str, Any]) → LinTConfig[source]

    Construct from a plain dict (e.g. loaded from YAML), ignoring unknown keys.

    - Parameters:
      - d (dict) – Dictionary of configuration values.
    - Returns:
      A new LinTConfig instance.
    - Return type:
      LinTConfig
  - classmethod from_yaml(path: str | pathlib.Path) → LinTConfig[source]

    Load a LinTConfig from a YAML file.

    - Parameters:
      - path (str | Path) – Path to the YAML configuration file.
    - Returns:
      A new LinTConfig instance.
    - Return type:
      LinTConfig
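A usage sketch; the file name and its contents are illustrative (the keys mirror the documented fields, and unknown keys are ignored as documented for from_dict):

```python
from linmult.core.config import LinTConfig

# Hypothetical lint.yaml:
#   input_feature_dim: 300
#   d_model: 40
#   attention_type: linear
#   heads:
#     - {type: vector, output_dim: 2}
cfg = LinTConfig.from_yaml("lint.yaml")
```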
  - build_attention_config() → linmult.core.attention.AttentionConfig[source]

    Build an AttentionConfig from this config.

    - Returns:
      Attention configuration ready for use in model construction.
    - Return type:
      linmult.core.attention.AttentionConfig
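A minimal sketch; the fields of AttentionConfig are not documented on this page, so only the call is shown (values are illustrative):

```python
from linmult.core.config import LinTConfig

cfg = LinTConfig(input_feature_dim=300, attention_type="performer")
attn_cfg = cfg.build_attention_config()  # linmult.core.attention.AttentionConfig
```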
- class linmult.core.config.LinMulTConfig[source]

  Configuration for LinMulT (multimodal linear-complexity transformer).

  Required

  - Parameters:
    - input_feature_dim (list[int]) – Input feature dimensionality per modality. Must have at least 2 entries.

  Identity

  - Parameters:
    - name (str) – Model name shown in repr. Defaults to "".

  Core

  - Parameters:
    - d_model (int) – Internal embedding dimension. Defaults to 40.
    - num_heads (int) – Number of attention heads. Defaults to 8.
    - cmt_num_layers (int) – Cross-modal transformer (CMT) encoder depth. Defaults to 6.
    - branch_sat_num_layers (int) – Per-branch self-attention encoder depth. Defaults to 6.
  Attention

  - Parameters:
    - attention_type (str) – Attention mechanism. One of "linear" (default), "performer", "flash", "softmax", "bigbird", "mha".
    - flash_query_key_dim (int | None) – Scoring dimension for "flash" (GAU). Defaults to None (computed as max(d_model // 2, 16)).
    - performer_num_random_features (int | None) – Random feature count for "performer". Defaults to None (computed as max(head_dim * 4, 32)).
    - bigbird_block_size (int) – Local block size for "bigbird". Defaults to 64.
    - bigbird_num_global_tokens (int) – Global tokens for "bigbird". Defaults to 16.
    - bigbird_num_random_tokens (int) – Random tokens for "bigbird". Defaults to 10.

  Dropout

  - Parameters:
    - dropout_input (float) – Dropout on input before projection. Defaults to 0.0.
    - dropout_output (float) – FFN-fusion output dropout. Defaults to 0.0.
    - dropout_pe (float) – Dropout after positional encoding. Defaults to 0.0.
    - dropout_ffn (float) – Dropout in transformer FFN. Defaults to 0.1.
    - dropout_attention (float) – Attention-weight dropout. Defaults to 0.0.
    - dropout_tam (float) – Dropout inside the TAM projector. Defaults to 0.1.
  Unimodal self-attention (optional)

  - Parameters:
    - add_module_unimodal_sat (bool) – Per-modality self-attention transformer (SAT) before cross-modal layers. Defaults to False.
    - unimodal_sat_num_layers (int) – Unimodal SAT encoder depth. Defaults to 6.

  Multimodal signal via TAM (optional)

  - Parameters:
    - add_module_multimodal_signal (bool) – Prepend a TAM-fused cross-modal summary to each branch. Requires tam_time_dim. Defaults to False.
    - mms_num_layers (int) – Encoder depth inside the MMS TAM. Defaults to 6.
    - tam_aligner (str | None) – Temporal alignment strategy. One of "aap", "amp", "padding". Required when either TAM module is enabled. Defaults to None.
    - tam_time_dim (int | None) – Target time dimension after TAM alignment. Required when either TAM module is enabled. Defaults to None (see the sketch after this list).
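A sketch of a TAM-enabled configuration, as referenced above; the modality dimensions and the target time dimension are illustrative:

```python
from linmult.core.config import LinMulTConfig

# Enabling either TAM module requires both tam_aligner and tam_time_dim.
cfg = LinMulTConfig(
    input_feature_dim=[300, 128],       # illustrative per-modality dims
    add_module_multimodal_signal=True,  # TAM-fused summary per branch
    tam_aligner="aap",                  # one of "aap", "amp", "padding"
    tam_time_dim=64,                    # common time dim after alignment
)
```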
  TRM

  - Parameters:
    - time_dim_reducer (str | None) – Collapse (B, T, F) → (B, F) before heads. One of "attentionpool", "gap", "gmp", "last", or None (no reduction). Defaults to None.

  Fusion (optional)

  - Parameters:
    - add_module_tam_fusion (bool) – TAM-based fusion after cross-modal branches. Requires tam_time_dim. Defaults to False.
    - fusion_num_layers (int) – Encoder depth inside the TAM fusion module. Defaults to 6.
    - add_module_sat_fusion (bool) – Self-attention transformer on the fused representation. Defaults to False.
    - fusion_sat_num_layers (int) – Fusion SAT encoder depth. Defaults to 6.
    - add_module_ffn_fusion (bool) – FFN + residual block after fusion. Defaults to False.

  Heads

  - Parameters:
    - heads (list[HeadConfig | dict]) – Output head configurations. Plain dicts are automatically coerced to HeadConfig. Defaults to [].
    - auxiliary_heads (list[HeadConfig | dict]) – Per-branch auxiliary head configs. Plain dicts are automatically coerced to HeadConfig. Defaults to [].

  Special handling

  - Parameters:
    - special_handling (dict[str, Any]) – Modality-specific input handling (e.g. weighted-sum of transformer layers). Defaults to {}.
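Pulling the sections above together, a hedged sketch of a two-modality configuration with a main head and a per-branch auxiliary head (all concrete values are illustrative):

```python
from linmult.core.config import LinMulTConfig

cfg = LinMulTConfig.from_dict({
    "input_feature_dim": [300, 128],      # at least 2 modalities
    "d_model": 40,
    "attention_type": "linear",
    "time_dim_reducer": "attentionpool",  # (B, T, F) -> (B, F) before heads
    "heads": [
        {"type": "vector", "output_dim": 7, "name": "main"},
    ],
    "auxiliary_heads": [
        {"type": "vector", "output_dim": 7, "name": "aux"},
    ],
})
```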
  - classmethod from_dict(d: dict[str, Any]) → LinMulTConfig[source]

    Construct from a plain dict (e.g. loaded from YAML), ignoring unknown keys.

    - Parameters:
      - d (dict) – Dictionary of configuration values.
    - Returns:
      A new LinMulTConfig instance.
    - Return type:
      LinMulTConfig
  - classmethod from_yaml(path: str | pathlib.Path) → LinMulTConfig[source]

    Load a LinMulTConfig from a YAML file.

    - Parameters:
      - path (str | Path) – Path to the YAML configuration file.
    - Returns:
      A new LinMulTConfig instance.
    - Return type:
      LinMulTConfig
  - build_attention_config() → linmult.core.attention.AttentionConfig[source]

    Build an AttentionConfig from this config.

    - Returns:
      Attention configuration ready for use in model construction.
    - Return type:
      linmult.core.attention.AttentionConfig