vllm.v1.attention.backends.fa_utils
Functions:
- is_flash_attn_varlen_func_available – Check if flash_attn_varlen_func is available.
is_flash_attn_varlen_func_available()
Check if flash_attn_varlen_func is available.
This function determines whether the flash_attn_varlen_func imported at module level is a working implementation or a stub.
Platform-specific sources:
- CUDA: vllm.vllm_flash_attn.flash_attn_varlen_func
- XPU: xpu_ops.flash_attn_varlen_func
- ROCm: upstream flash_attn.flash_attn_varlen_func (if available)
Note: This check is separate from the AITER flash attention backend (rocm_aiter_fa.py), which uses rocm_aiter_ops.flash_attn_varlen_func. Whether AITER is used is decided separately via _aiter_ops.is_aiter_found_and_supported().
Returns:
- bool – True if a working flash_attn_varlen_func implementation is available.
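A caller would typically branch on the returned bool before taking a code path that needs flash_attn_varlen_func. The sketch below is a hedged usage example: the module path and function name are taken from this page, while `choose_attention_path` and its return labels are hypothetical. The import is guarded so the snippet also runs where vLLM is not installed.

```python
def choose_attention_path(flash_available: bool) -> str:
    """Hypothetical helper: pick a label for the attention path to use."""
    return "flash_attn_varlen" if flash_available else "fallback"


try:
    from vllm.v1.attention.backends.fa_utils import (
        is_flash_attn_varlen_func_available,
    )

    available = bool(is_flash_attn_varlen_func_available())
except Exception:
    # vLLM is missing or failed to import; treat the kernel as unavailable.
    available = False

print(choose_attention_path(available))
```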