vllm.v1.kv_offload.cpu.swap_blocks_triton
Triton kernel + tuned constants for the swap_blocks_batch fast path.
Functions:
- swap_blocks_batch – Triton implementation of swap_blocks_batch for small CPU->GPU batches.
swap_blocks_batch(src_addrs, dst_addrs, sizes, is_src_access_order_any=False, *, bytes_per_chunk)
Triton implementation of swap_blocks_batch for small CPU->GPU batches.
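The reference above does not describe the arguments, so the following is only a rough CPU-only sketch of what a batched block copy with this shape of signature might do. It is not the Triton kernel and not vLLM code. It assumes that src_addrs, dst_addrs, and sizes are parallel sequences of raw byte addresses and byte counts, and that bytes_per_chunk bounds how many bytes are moved per step. The name swap_blocks_batch_reference is hypothetical, and is_src_access_order_any is left out because its meaning is not documented here.

```python
import ctypes


def swap_blocks_batch_reference(src_addrs, dst_addrs, sizes, *, bytes_per_chunk):
    """Hypothetical host-side stand-in for a batched block copy.

    Assumption (not taken from vLLM): entry i means "copy sizes[i] bytes
    from address src_addrs[i] to address dst_addrs[i]", and each copy is
    split into pieces of at most bytes_per_chunk bytes.
    """
    if not (len(src_addrs) == len(dst_addrs) == len(sizes)):
        raise ValueError("src_addrs, dst_addrs and sizes must have equal length")
    if bytes_per_chunk <= 0:
        raise ValueError("bytes_per_chunk must be positive")

    for src, dst, size in zip(src_addrs, dst_addrs, sizes):
        # Walk the block in fixed-size chunks; the last chunk may be shorter.
        for offset in range(0, size, bytes_per_chunk):
            n = min(bytes_per_chunk, size - offset)
            ctypes.memmove(dst + offset, src + offset, n)


# Two 8-byte "blocks" in a 16-byte host buffer, copied in swapped order.
src = (ctypes.c_ubyte * 16)(*range(16))
dst = (ctypes.c_ubyte * 16)()
src_base = ctypes.addressof(src)
dst_base = ctypes.addressof(dst)

swap_blocks_batch_reference(
    [src_base, src_base + 8],   # source block addresses
    [dst_base + 8, dst_base],   # destination block addresses
    [8, 8],                     # bytes per block
    bytes_per_chunk=4,
)
result = list(dst)
```

In this sketch the chunk loop runs sequentially on the host. A GPU kernel would more plausibly assign chunks to parallel program instances, but that mapping is an assumption and is not stated in this reference.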