vllm.v1.kv_offload.tiering.example.manager
ExampleSecondaryTierManager: A simple in-memory secondary tier.
This implementation provides a minimal secondary tier that stores blocks in memory (using a dictionary) with immediate completion. It serves as a reference for writing new tiers and is useful for testing the TieringOffloadingManager without requiring actual storage or network backends.
Classes:
- ExampleSecondaryTierManager – A simple in-memory secondary tier.
ExampleSecondaryTierManager
Bases: SecondaryTierManager
A simple in-memory secondary tier.
This implementation:
- Stores blocks in a dictionary (key -> True)
- Completes transfers immediately (synchronous)
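The two properties above can be illustrated with a standalone sketch. It is not the vLLM class: `ToyPresenceTier`, its `store` method, and the string keys are hypothetical names chosen for illustration. Only the dictionary-of-`True` layout and the synchronous behavior come from the description.

```python
class ToyPresenceTier:
    """Hypothetical stand-in for a dictionary-backed tier (not the vLLM class)."""

    def __init__(self) -> None:
        # Maps a block key to True; only presence is tracked, not block data.
        self._blocks: dict[str, bool] = {}

    def store(self, keys: list[str]) -> None:
        # Synchronous: the keys are visible as soon as this call returns.
        for key in keys:
            self._blocks[key] = True

    def lookup(self, key: str) -> bool:
        return key in self._blocks

    def get_num_blocks(self) -> int:
        return len(self._blocks)


tier = ToyPresenceTier()
tier.store(["block-a", "block-b"])
print(tier.lookup("block-a"), tier.lookup("block-z"), tier.get_num_blocks())
```

Because only a presence flag is recorded, a tier of this shape suits exercising control flow in tests rather than holding real KV data.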
Methods:
- __init__ – Initialize the example secondary tier.
- get_finished_jobs – Poll for finished jobs.
- get_num_blocks – Get the number of blocks currently stored in this tier.
- lookup – Check whether a block exists in this secondary tier.
- submit_load – Submit a job to load blocks from this tier to primary tier.
- submit_store – Submit a job to store blocks from primary tier to this tier.
Source code in vllm/v1/kv_offload/tiering/example/manager.py
__init__(offloading_spec, primary_kv_view, tier_type, custom_param=0)
Initialize the example secondary tier.
Parameters:
get_finished_jobs()
Poll for finished jobs.
get_num_blocks()
Get the number of blocks currently stored in this tier.
lookup(key, req_context)
Check whether a block exists in this secondary tier.
Parameters:
Returns:
- bool | None – True if the block is present, False if not found.
submit_load(job_metadata)
Submit a job to load blocks from this tier to primary tier.
Parameters:
- job_metadata (JobMetadata) – Job metadata including job_id, keys, and spec for writing blocks into the primary tier.
submit_store(job_metadata)
Submit a job to store blocks from primary tier to this tier.
Parameters:
- job_metadata (JobMetadata) – Job metadata including job_id, keys, and spec for reading blocks from the primary tier.
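When transfers complete immediately, the submit-then-poll flow shared by submit_store, submit_load, and get_finished_jobs might look like the sketch below. It is written under assumptions and is not the vLLM implementation: `ToyJob`, `ToyImmediateTier`, and the `(job_id, success)` result shape are hypothetical. The sketch also assumes that loading a key that was never stored is reported as a failure.

```python
from dataclasses import dataclass


@dataclass
class ToyJob:
    """Hypothetical job record; the real JobMetadata also carries a spec."""
    job_id: int
    keys: list[str]


class ToyImmediateTier:
    """Hypothetical tier whose jobs finish inside the submit call."""

    def __init__(self) -> None:
        self._blocks: dict[str, bool] = {}
        self._finished: list[tuple[int, bool]] = []

    def submit_store(self, job: ToyJob) -> None:
        for key in job.keys:
            self._blocks[key] = True
        # Nothing is in flight, so the job is recorded as finished right away.
        self._finished.append((job.job_id, True))

    def submit_load(self, job: ToyJob) -> None:
        # Assumption: a load succeeds only if every requested key is present.
        ok = all(key in self._blocks for key in job.keys)
        self._finished.append((job.job_id, ok))

    def get_finished_jobs(self) -> list[tuple[int, bool]]:
        # Hand back completed jobs once, then clear the buffer.
        finished, self._finished = self._finished, []
        return finished


tier = ToyImmediateTier()
tier.submit_store(ToyJob(job_id=1, keys=["k1", "k2"]))
tier.submit_load(ToyJob(job_id=2, keys=["k1"]))
tier.submit_load(ToyJob(job_id=3, keys=["missing"]))
results = tier.get_finished_jobs()
print(results)
```

Draining the buffer on each poll means every job is reported once, so a caller that polls in a loop does not see the same completion twice.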