vllm.config.mamba ¶
Classes:
- MambaBackendEnum – Enumeration of supported Mamba SSU (selective state update) backends.
- MambaConfig – Configuration for Mamba SSM backends.
MambaBackendEnum ¶
Bases: Enum
Enumeration of supported Mamba SSU (selective state update) backends.
Source code in vllm/config/mamba.py
MambaConfig ¶
Configuration for Mamba SSM backends.
Methods:
- validate_backend_before – Enable parsing of the backend enum type from string.
Attributes:
- backend (MambaBackendEnum) – Mamba SSU backend to use.
- enable_stochastic_rounding (bool) – Enable stochastic rounding when writing SSM state to fp16 cache.
- stochastic_rounding_philox_rounds (int) – Number of Philox PRNG rounds for stochastic rounding random number generation.
backend = MambaBackendEnum.TRITON class-attribute instance-attribute ¶
Mamba SSU backend to use.
enable_stochastic_rounding = False class-attribute instance-attribute ¶
Enable stochastic rounding when writing SSM state to fp16 cache. Uses random bits to unbias the rounding error, which can improve numerical stability for long sequences.
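To make the idea of unbiased rounding concrete, the following is a minimal NumPy sketch of stochastic rounding from fp32 to fp16. It is not vLLM's Triton kernel: the function name `stochastic_round_to_fp16` and the constant `FP16_DROPPED_BITS` are hypothetical, the input is assumed to lie in the normal fp16 range, and NumPy's `Philox` bit generator is used only as a convenient random source (its round count is not configurable here).

```python
import numpy as np

# fp32 stores 23 mantissa bits and fp16 keeps 10, so 13 low bits are discarded.
FP16_DROPPED_BITS = 13  # hypothetical constant name, for illustration only


def stochastic_round_to_fp16(x, rng):
    """Round fp32 values to fp16 with probability proportional to the
    discarded fraction. Assumes values in the normal fp16 range."""
    x32 = np.ascontiguousarray(x, dtype=np.float32)
    bits = x32.view(np.uint32)
    # Add uniform random bits below the fp16 precision; a carry into the
    # kept bits happens with probability equal to the discarded fraction.
    noise = rng.integers(0, 1 << FP16_DROPPED_BITS, size=bits.shape, dtype=np.uint32)
    rounded = (bits + noise) & np.uint32(0xFFFFE000)  # clear the 13 low bits
    # The remaining value is exactly representable in fp16.
    return rounded.view(np.float32).astype(np.float16)


rng = np.random.Generator(np.random.Philox(0))
# 1 + 2**-12 sits a quarter of the way between two adjacent fp16 values.
x = np.full(100_000, 1.0 + 2.0**-12, dtype=np.float32)

nearest_mean = x.astype(np.float16).astype(np.float64).mean()
stochastic_mean = stochastic_round_to_fp16(x, rng).astype(np.float64).mean()
print(nearest_mean, stochastic_mean)
```

Round-to-nearest maps every element to the same fp16 neighbour, so its error never averages out. The stochastic variant picks between the two neighbours at random, so its mean stays close to the original value.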
stochastic_rounding_philox_rounds = 0 class-attribute instance-attribute ¶
Number of Philox PRNG rounds for stochastic rounding random number generation. 0 uses the Triton default. Higher values improve randomness quality at the cost of compute.
validate_backend_before(value) classmethod ¶
Enable parsing of the backend enum type from string.
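As an illustration of what parsing a backend enum from a string can look like, here is a standard-library sketch. It does not reproduce the vLLM implementation: `BackendSketch`, its `OTHER` member, `parse_backend`, and `ConfigSketch` are hypothetical names, and a `__post_init__` hook stands in for whatever validation mechanism the real class uses.

```python
from dataclasses import dataclass
from enum import Enum


class BackendSketch(Enum):
    """Hypothetical stand-in for a backend enum."""
    TRITON = "triton"
    OTHER = "other"  # hypothetical member, not taken from vLLM


def parse_backend(value):
    """Convert a string such as "triton" to an enum member before use."""
    if isinstance(value, str):
        return BackendSketch[value.upper()]
    return value


@dataclass
class ConfigSketch:
    backend: BackendSketch = BackendSketch.TRITON

    def __post_init__(self):
        # Accept either an enum member or its name as a string.
        self.backend = parse_backend(self.backend)


print(ConfigSketch(backend="triton").backend)
```

Normalizing the value before it is stored lets command-line or JSON input use plain strings while the rest of the code works with enum members only.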
_MambaBackendEnumMeta ¶
Bases: EnumMeta
Metaclass for MambaBackendEnum to provide better error messages.
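One way an `EnumMeta` subclass can produce clearer errors is to override name lookup and list the valid members. The sketch below shows that pattern under stated assumptions: `FriendlyEnumMeta`, `BackendNameSketch`, the `OTHER` member, and the error wording are all hypothetical and are not taken from the vLLM source.

```python
from enum import Enum, EnumMeta


class FriendlyEnumMeta(EnumMeta):
    """Hypothetical metaclass that reports valid names on a failed lookup."""

    def __getitem__(cls, name):
        try:
            return super().__getitem__(name)
        except KeyError:
            valid = ", ".join(cls.__members__)
            raise ValueError(
                f"Unknown backend {name!r}. Valid options: {valid}"
            ) from None


class BackendNameSketch(Enum, metaclass=FriendlyEnumMeta):
    TRITON = "triton"
    OTHER = "other"  # hypothetical member, for illustration only


try:
    BackendNameSketch["cuda"]
except ValueError as err:
    print(err)
```

Without the override, a failed lookup raises a bare `KeyError` that names only the missing key, which gives the user no hint about the accepted values.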