vllm.v1.attention.backends.mla.prefill.selector
Selector for MLA prefill backends.
This module provides functions for selecting the appropriate MLA prefill backend based on device capabilities and configuration.
Classes:
- MLAPrefillSelectorConfig – Hashable configuration for MLA prefill backend selection.
Functions:
- get_mla_prefill_backend – Select the MLA prefill backend based on configuration and device.
- is_deepseek_r1_mla_compatible – Check if model has DeepSeek R1 compatible MLA dimensions.
MLAPrefillSelectorConfig
Bases: NamedTuple
Hashable configuration for MLA prefill backend selection.
Analogous to AttentionSelectorConfig, this class holds the model-specific configuration needed to select an MLA prefill backend. The values are extracted from VllmConfig into a hashable form so that the configuration can be used as a cache key.
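The sketch below illustrates why a NamedTuple suits this role: instances with equal field values compare and hash equal, so they work as keys for a memoized function. The class name `SelectorConfigSketch`, its fields, and `pick_backend` are hypothetical stand-ins; the actual fields of MLAPrefillSelectorConfig are not listed here.

```python
from functools import cache
from typing import NamedTuple


class SelectorConfigSketch(NamedTuple):
    # Hypothetical fields; the real MLAPrefillSelectorConfig may differ.
    dtype: str
    qk_nope_head_dim: int
    qk_rope_head_dim: int
    v_head_dim: int


calls = 0  # counts how often the function body actually runs


@cache
def pick_backend(config: SelectorConfigSketch) -> str:
    """Stand-in for a selection routine that should run once per config."""
    global calls
    calls += 1
    return "backend_a" if config.dtype == "bfloat16" else "backend_b"


cfg1 = SelectorConfigSketch("bfloat16", 128, 64, 128)
cfg2 = SelectorConfigSketch("bfloat16", 128, 64, 128)

# Equal field values hash equal, so the second call is served from the cache.
first = pick_backend(cfg1)
second = pick_backend(cfg2)
```

A plain dict or a mutable config object would not be accepted by `functools.cache`, which requires hashable arguments.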
Source code in vllm/v1/attention/backends/mla/prefill/selector.py
_auto_select_mla_prefill_backend(device_capability, selector_config) cached
Auto-select the best available MLA prefill backend.
Parameters:
- device_capability (DeviceCapability) – The device's compute capability.
- selector_config (MLAPrefillSelectorConfig) – Hashable configuration for backend selection.
Returns:
- type[MLAPrefillBackend] – The selected prefill backend class.
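One plausible shape for cached auto-selection is a walk over candidates in priority order that returns the first class supporting the device and configuration. The following is a minimal sketch under that assumption. `Capability`, `ConfigSketch`, `BackendBase`, the two backend classes, the `supports` method, and the thresholds are all hypothetical and are not vLLM APIs.

```python
from functools import cache
from typing import NamedTuple


class Capability(NamedTuple):
    # Stand-in for DeviceCapability.
    major: int
    minor: int


class ConfigSketch(NamedTuple):
    # Stand-in for MLAPrefillSelectorConfig; the field is hypothetical.
    dtype: str


class BackendBase:
    @classmethod
    def supports(cls, capability: Capability, config: ConfigSketch) -> bool:
        raise NotImplementedError


class NewGpuOnly(BackendBase):
    @classmethod
    def supports(cls, capability: Capability, config: ConfigSketch) -> bool:
        # Assumed requirements, chosen only for illustration.
        return capability.major >= 10 and config.dtype == "bfloat16"


class Portable(BackendBase):
    @classmethod
    def supports(cls, capability: Capability, config: ConfigSketch) -> bool:
        return True


# Highest priority first.
PRIORITY = (NewGpuOnly, Portable)


@cache
def auto_select(capability: Capability, config: ConfigSketch) -> type[BackendBase]:
    """Return the first backend class that supports the device and config."""
    for backend in PRIORITY:
        if backend.supports(capability, config):
            return backend
    raise ValueError(f"no backend supports {capability} with {config}")
```

Because both arguments are hashable tuples, repeated calls with the same device and configuration reuse the cached result.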
_get_mla_prefill_backend_priorities(device_capability)
Get MLA prefill backend priorities based on device capability.
Parameters:
- device_capability (DeviceCapability) – The device's compute capability.
Returns:
- list[MLAPrefillBackendEnum] – List of backends in priority order (highest priority first).
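A capability-dependent priority list could look like the sketch below. The enum members and the compute-capability threshold are invented for illustration; they do not reflect the actual members of MLAPrefillBackendEnum or the real ordering.

```python
from enum import Enum
from typing import NamedTuple


class Capability(NamedTuple):
    # Stand-in for DeviceCapability.
    major: int
    minor: int


class BackendEnumSketch(Enum):
    # Hypothetical members, not the real MLAPrefillBackendEnum values.
    VENDOR_TUNED = "vendor_tuned"
    GENERAL = "general"
    FALLBACK = "fallback"


def backend_priorities(capability: Capability) -> list[BackendEnumSketch]:
    """Return candidate backends, highest priority first."""
    if capability.major >= 10:  # assumed threshold for newer devices
        return [
            BackendEnumSketch.VENDOR_TUNED,
            BackendEnumSketch.GENERAL,
            BackendEnumSketch.FALLBACK,
        ]
    # Older devices skip the vendor-tuned option entirely.
    return [BackendEnumSketch.GENERAL, BackendEnumSketch.FALLBACK]
```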
get_mla_prefill_backend(vllm_config)
Select the MLA prefill backend based on configuration and device.
This function first checks for explicit user preferences via mla_prefill_backend in AttentionConfig, then falls back to automatic priority-based selection.
Parameters:
- vllm_config (VllmConfig) – The vLLM configuration.
Returns:
- type[MLAPrefillBackend] – The selected prefill backend class.
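The two-stage flow described above (explicit preference first, automatic selection otherwise) can be sketched as follows. `KNOWN_BACKENDS`, `select_backend`, and the backend names are hypothetical. How the real function treats an unsupported explicit choice is not stated here, so raising an error is an assumption of this sketch.

```python
from typing import Callable, Optional

# Hypothetical set of recognized backend names.
KNOWN_BACKENDS = {"backend_a", "backend_b"}


def select_backend(
    user_choice: Optional[str],
    auto_select: Callable[[], str],
) -> str:
    """Honor an explicit choice if given; otherwise defer to auto-selection."""
    if user_choice is not None:
        if user_choice not in KNOWN_BACKENDS:
            # Assumption: an unknown explicit choice is an error, not a fallback.
            raise ValueError(f"unknown backend: {user_choice!r}")
        return user_choice
    return auto_select()


explicit = select_backend("backend_b", lambda: "backend_a")
automatic = select_backend(None, lambda: "backend_a")
```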
is_deepseek_r1_mla_compatible(vllm_config)
Check if model has DeepSeek R1 compatible MLA dimensions.
DeepSeek R1 MLA dimensions are:
- qk_nope_head_dim = 128
- qk_rope_head_dim = 64
- v_head_dim = 128
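The check amounts to comparing three head dimensions against the values listed above. A minimal sketch follows, using a hypothetical `MLADims` container in place of the fields the real function reads from VllmConfig.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MLADims:
    # Hypothetical container; the real function reads these from VllmConfig.
    qk_nope_head_dim: int
    qk_rope_head_dim: int
    v_head_dim: int


# Reference dimensions as listed in the documentation above.
DEEPSEEK_R1_DIMS = MLADims(qk_nope_head_dim=128, qk_rope_head_dim=64, v_head_dim=128)


def has_deepseek_r1_dims(dims: MLADims) -> bool:
    """Return True only if all three dimensions match the reference values."""
    return dims == DEEPSEEK_R1_DIMS


matching = has_deepseek_r1_dims(MLADims(128, 64, 128))
different = has_deepseek_r1_dims(MLADims(192, 64, 128))
```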