vllm.v1.kv_offload.cpu.policies.base
Classes:
- BlockStatus – Offloading status for a single block of KV data.
- CachePolicy – Encapsulates both block organization (data structures) and replacement decisions (which block to evict).
BlockStatus
Bases: Structure
Offloading status for a single block of KV data. Holds the following information:
- ref_cnt: the current number of transfers using this block as a source. A value of -1 indicates the block is not yet ready to be read.
- block_id: index of the physical CPU buffer slot.
Attributes:
- is_ready – Returns whether the block is ready to be read.
Source code in vllm/v1/kv_offload/cpu/policies/base.py
is_ready property
Returns whether the block is ready to be read.
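The ref_cnt convention described above can be illustrated with a minimal sketch. It assumes that Structure refers to ctypes.Structure and that readiness means a non-negative ref_cnt. The class name BlockStatusSketch and the field widths (c_int32, c_int64) are hypothetical choices for illustration and may differ from the definition in base.py.

```python
import ctypes


class BlockStatusSketch(ctypes.Structure):
    """Illustrative stand-in for BlockStatus (field types are assumptions)."""

    _fields_ = [
        ("ref_cnt", ctypes.c_int32),   # -1 means "not yet ready to be read"
        ("block_id", ctypes.c_int64),  # index of the physical CPU buffer slot
    ]

    def __init__(self, block_id: int):
        super().__init__()
        self.ref_cnt = -1  # assumption: a new block starts out not ready
        self.block_id = block_id

    @property
    def is_ready(self) -> bool:
        # Assumption: any non-negative ref_cnt means the block can be read.
        return self.ref_cnt >= 0


block = BlockStatusSketch(block_id=7)
ready_before = block.is_ready  # block is still being written
block.ref_cnt = 0              # store finished: readable, no active transfers
block.ref_cnt += 1             # one transfer now uses the block as a source
ready_after = block.is_ready
```

Under these assumptions, a block with ref_cnt greater than zero is in use as a transfer source, which is why an eviction policy would need to consult the counter before reclaiming the slot.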
CachePolicy
Bases: ABC
Encapsulates both block organization (data structures) and replacement decisions (which block to evict). LRU and ARC differ in both dimensions: ARC's ghost lists and target_t1_size live at the intersection of storage and eviction, so the two concerns cannot be separated cleanly.
Methods:
- clear – Remove ALL blocks regardless of ref_cnt.
- evict – Evict exactly n blocks, skipping any in protected.
- get – Find block in data structures. Returns None if not present.
- insert – Add a newly allocated block. For ARC: also removes from ghost lists.
- remove – Remove a block (used to clean up after a failed store).
- touch – Mark blocks as recently used.
Source code in vllm/v1/kv_offload/cpu/policies/base.py
clear() abstractmethod
evict(n, protected) abstractmethod
Evict exactly n blocks, skipping any in protected.
Returns a list of (key, block) for the evicted blocks, or None if n evictions cannot be satisfied. The operation is atomic: if None is returned, no state changes are made.
For ARC: ghost list cleanup (trimming to cache_capacity) is performed at the end of a successful eviction.
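The all-or-nothing contract of evict can be sketched with a simple two-phase LRU policy: first select candidates without mutating anything, then commit only if enough were found. This is an illustrative sketch, not the library's implementation. The names LRUPolicySketch and Block are hypothetical, the insert and touch signatures are assumptions, and skipping blocks whose ref_cnt is non-zero is an assumption that goes beyond the documented rule of skipping keys in protected.

```python
from collections import OrderedDict


class Block:
    """Hypothetical block record; only ref_cnt matters for this sketch."""

    def __init__(self, block_id, ref_cnt=0):
        self.block_id = block_id
        self.ref_cnt = ref_cnt


class LRUPolicySketch:
    """Illustrative LRU policy with an atomic evict (not vLLM's code)."""

    def __init__(self):
        # Oldest entries sit at the front, most recently used at the back.
        self.blocks = OrderedDict()

    def insert(self, key, block):
        self.blocks[key] = block

    def touch(self, keys):
        for key in keys:
            if key in self.blocks:
                self.blocks.move_to_end(key)

    def evict(self, n, protected):
        # Phase 1: pick candidates in LRU order without changing any state.
        victims = []
        for key, block in self.blocks.items():
            if len(victims) == n:
                break
            # Assumption: blocks that are in use or not ready are skipped too.
            if key in protected or block.ref_cnt != 0:
                continue
            victims.append((key, block))
        if len(victims) < n:
            return None  # cannot satisfy the request; nothing was modified
        # Phase 2: commit the eviction.
        for key, _ in victims:
            del self.blocks[key]
        return victims


policy = LRUPolicySketch()
for i, key in enumerate(["a", "b", "c"]):
    policy.insert(key, Block(block_id=i))
policy.touch(["a"])  # LRU order is now b, c, a

failed = policy.evict(3, protected={"b"})  # only two blocks are evictable
keys_after_failure = list(policy.blocks)   # unchanged by the failed call
evicted = policy.evict(2, protected={"b"})
evicted_keys = [key for key, _ in evicted]
```

Separating selection from commit keeps the failure path free of side effects, so a caller that receives None can retry with a smaller n or a different protected set without first repairing the cache state.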