vllm.v1.attention.ops.chunked_prefill_paged_decode
Functions:
- has_native_kv_cache_layout – Return whether KV cache blocks can use the native ROCm pairing.
has_native_kv_cache_layout(key_cache, value_cache)
Return whether KV cache blocks can use the native ROCm pairing.
The native reshape_and_cache writer assumes packed blocks. If the cache update needs reshape_and_cache_flash for a stride-padded hybrid layout, decode should use the matching Triton path as well, so that the cache is written and read with the same layout.
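To illustrate what "packed" versus "stride-padded" could mean here, the following is a minimal sketch and not the vLLM implementation. It assumes a block is packed when the stride of the block dimension (dim 0) equals the number of elements in one block. The helper names `block_is_packed` and `can_use_native_layout_sketch` are hypothetical, and shapes and strides are passed as plain tuples (in elements, as `torch.Tensor.shape` and `torch.Tensor.stride()` would report them) so the example runs without a GPU.

```python
from math import prod
from typing import Sequence


def block_is_packed(shape: Sequence[int], strides: Sequence[int]) -> bool:
    """Return True if consecutive blocks (dim 0) sit back to back in memory.

    Hypothetical helper: only the block stride is checked, which is an
    assumption about what "packed" means, not the library's actual test.
    """
    elements_per_block = prod(shape[1:])
    return strides[0] == elements_per_block


def can_use_native_layout_sketch(
    key_shape: Sequence[int],
    key_strides: Sequence[int],
    value_shape: Sequence[int],
    value_strides: Sequence[int],
) -> bool:
    """Both caches must be packed for the (assumed) native path to apply."""
    return block_is_packed(key_shape, key_strides) and block_is_packed(
        value_shape, value_strides
    )


# Illustrative shape: (num_blocks, block_size, num_kv_heads, head_size).
shape = (4, 16, 8, 64)
packed_strides = (16 * 8 * 64, 8 * 64, 64, 1)
# A padded layout: each block is followed by unused (or foreign) space,
# so the block stride is larger than the block itself.
padded_strides = (2 * 16 * 8 * 64, 8 * 64, 64, 1)

native_ok = can_use_native_layout_sketch(shape, packed_strides, shape, packed_strides)
padded_ok = can_use_native_layout_sketch(shape, padded_strides, shape, padded_strides)
```

With these illustrative values, the packed pair passes the check and the padded pair does not, which mirrors the rule above: a padded cache would go through the reshape_and_cache_flash writer and the matching Triton decode path.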