vllm.distributed.kv_transfer.kv_connector.v1.lmcache_integration.utils ¶
Functions:
- apply_mm_hashes_to_token_ids – Overwrite token_ids in-place for multimodal placeholders using efficient slice assignments.
- create_lmcache_metadata – Create LMCacheEngineMetadata from vLLM configuration.
- extract_mm_features – Normalize multimodal information from a Request into parallel lists.
- hex_hash_to_int16 – Convert a hex hash string to a 16-bit integer.
- is_false – Check if the given string value is equivalent to 'false'.
- lmcache_get_or_create_config – Get the LMCache configuration from the environment variable LMCACHE_CONFIG_FILE.
apply_mm_hashes_to_token_ids(token_ids, mm_hashes, mm_positions) ¶
Overwrite token_ids in-place for multimodal placeholders using efficient slice assignments.
Source code in vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/utils.py
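The slice-assignment idea can be illustrated with a minimal, self-contained sketch. It is not the vLLM implementation: `PlaceholderSpan`, `overwrite_placeholders`, the use of a plain Python list for `token_ids`, and the choice of fill value (low 16 bits of the hex hash) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlaceholderSpan:
    """Hypothetical stand-in for a placeholder range: start offset and length."""
    offset: int
    length: int


def overwrite_placeholders(token_ids, mm_hashes, mm_positions):
    """Overwrite placeholder spans of token_ids in place and return it."""
    n = len(token_ids)
    for hex_hash, span in zip(mm_hashes, mm_positions):
        if span.offset >= n:
            continue  # span lies entirely past the end of the sequence
        end = min(span.offset + span.length, n)
        # Assumption: the fill value is the low 16 bits of the hex hash.
        fill = int(hex_hash, 16) & 0xFFFF
        # One slice assignment per span instead of a per-token loop.
        token_ids[span.offset:end] = [fill] * (end - span.offset)
    return token_ids


tokens = [101, 7, 7, 7, 102, 9, 9]
overwrite_placeholders(
    tokens,
    ["ab12", "1ffff"],
    [PlaceholderSpan(1, 3), PlaceholderSpan(5, 4)],
)
print(tokens)  # [101, 43794, 43794, 43794, 102, 65535, 65535]
```

The second span is clipped to the end of the sequence, so only two positions are overwritten even though its length is 4.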
create_lmcache_metadata(vllm_config=None, model_config=None, parallel_config=None, cache_config=None) ¶
Create LMCacheEngineMetadata from vLLM configuration.
This function extracts common metadata creation logic that was duplicated across multiple files.
Parameters:
- vllm_config (VllmConfig, default: None) – vLLM configuration object containing the model, parallel, and cache configs (alternative to the individual config parameters).
- model_config (ModelConfig, default: None) – Model configuration (alternative to vllm_config).
- parallel_config (ParallelConfig, default: None) – Parallel configuration (alternative to vllm_config).
- cache_config (CacheConfig, default: None) – Cache configuration (alternative to vllm_config).
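The two calling styles (one bundled config, or three individual configs) can be sketched as follows. This is an illustration only: `FakeVllmConfig` and `resolve_configs` are hypothetical names, and both the precedence given to `vllm_config` and the error raised for missing arguments are assumptions, not documented behavior.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeVllmConfig:
    """Hypothetical stand-in bundling the three sub-configs."""
    model_config: Any
    parallel_config: Any
    cache_config: Any


def resolve_configs(vllm_config=None, model_config=None,
                    parallel_config=None, cache_config=None):
    """Return (model, parallel, cache) configs from either calling style."""
    if vllm_config is not None:
        # Assumption: the bundled config wins when it is supplied.
        return (vllm_config.model_config,
                vllm_config.parallel_config,
                vllm_config.cache_config)
    if model_config is None or parallel_config is None or cache_config is None:
        # Assumption: an incomplete set of individual configs is an error.
        raise ValueError("pass vllm_config or all three individual configs")
    return model_config, parallel_config, cache_config


bundled = resolve_configs(FakeVllmConfig("model", "parallel", "cache"))
separate = resolve_configs(model_config="model",
                           parallel_config="parallel",
                           cache_config="cache")
print(bundled == separate)  # True
```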
extract_mm_features(request, modify=False) ¶
Normalize multimodal information from a Request into parallel lists.
This helper reads either:
1) request.mm_features (objects each exposing .identifier and .mm_position), or
2) legacy fields request.mm_hashes and request.mm_positions.
It returns two equally sized lists: the multimodal hash identifiers and their corresponding positions. If the request contains no multimodal info, it returns ([], []).
Parameters:
- request (Request) – The source object.
- modify (bool, default: False) – Controls copy semantics for the legacy-path return values.
  - If True and legacy fields are used, shallow copies are returned so the caller can mutate the lists without affecting request.
  - If False, the original legacy sequences are returned as-is (zero-copy); treat them as read-only.
Returns:
- tuple[list[str], list[PlaceholderRange]] – (mm_hashes, mm_positions). May be ([], []) when no multimodal data is present.
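The behavior described above, including the copy semantics of `modify`, can be sketched with stand-in objects. `split_mm_features` is a hypothetical name, the request is a `SimpleNamespace` rather than a real Request, and the positions are plain tuples instead of PlaceholderRange objects; treat this as a reading of the description, not as the library's code.

```python
from types import SimpleNamespace


def split_mm_features(request, modify=False):
    """Return (mm_hashes, mm_positions) as two parallel lists."""
    features = getattr(request, "mm_features", None)
    if features:
        # New-style path: build fresh lists from the feature objects.
        return ([f.identifier for f in features],
                [f.mm_position for f in features])
    hashes = getattr(request, "mm_hashes", None)
    positions = getattr(request, "mm_positions", None)
    if not hashes or not positions:
        return [], []  # no multimodal data present
    if modify:
        # Shallow copies: the caller may mutate them freely.
        return list(hashes), list(positions)
    # Zero-copy: hand back the request's own sequences (read-only by convention).
    return hashes, positions


legacy = SimpleNamespace(mm_hashes=["aa", "bb"], mm_positions=[(0, 2), (5, 3)])
same, _ = split_mm_features(legacy)
copy, _ = split_mm_features(legacy, modify=True)
print(same is legacy.mm_hashes, copy is legacy.mm_hashes)  # True False
```

With `modify=False` the caller receives the request's own list, so appending to it would alter the request; with `modify=True` the returned list is independent.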
hex_hash_to_int16(s) ¶
Convert a hex hash string to a 16-bit integer.
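One plausible way to map a hex string of any length into a 16-bit range is to parse it and mask off everything but the low 16 bits. The sketch below assumes that reduction; `hex_to_16bit` is a hypothetical name and the page does not state which bits the real function keeps.

```python
def hex_to_16bit(hex_hash: str) -> int:
    # Assumption: keep only the low 16 bits of the parsed integer.
    return int(hex_hash, 16) & 0xFFFF


print(hex_to_16bit("ff"))     # 255
print(hex_to_16bit("12345"))  # 9029 (0x2345)
```

Whatever the input length, the result stays in the range 0 to 65535.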
is_false(value) ¶
Check if the given string value is equivalent to 'false'.
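A check of this kind is usually a case-insensitive membership test. The sketch below is hypothetical: the name `looks_false` and the exact set of accepted spellings are assumptions, since this page does not list which strings count as equivalent to 'false'.

```python
# Assumption: these spellings are treated as false; the real set may differ.
FALSE_SPELLINGS = {"false", "0", "no", "n", "off"}


def looks_false(value: str) -> bool:
    """Return True when value is a recognised spelling of 'false'."""
    return value.strip().lower() in FALSE_SPELLINGS


print(looks_false("False"), looks_false(" OFF "), looks_false("true"))
# True True False
```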
lmcache_get_or_create_config() ¶
Get the LMCache configuration from the environment variable LMCACHE_CONFIG_FILE. If the environment variable is not set, this function will return the default configuration.
This function is thread-safe and implements the singleton pattern, ensuring the configuration is loaded only once.
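A thread-safe, load-once accessor of this shape is commonly written with a module-level lock and a double check. The sketch below shows that general pattern only; `load_config_from_file`, `default_config`, and the dictionary they return are hypothetical placeholders, not LMCache APIs.

```python
import os
import threading

_config = None
_config_lock = threading.Lock()


def load_config_from_file(path):
    """Hypothetical loader: stands in for parsing the config file."""
    return {"source": path}


def default_config():
    """Hypothetical default used when the variable is unset."""
    return {"source": None}


def get_or_create_config():
    """Return the shared config, creating it on first use."""
    global _config
    if _config is None:            # fast path without taking the lock
        with _config_lock:
            if _config is None:    # re-check: another thread may have won
                path = os.getenv("LMCACHE_CONFIG_FILE")
                _config = (load_config_from_file(path)
                           if path else default_config())
    return _config


first = get_or_create_config()
second = get_or_create_config()
print(first is second)  # True
```

The second check inside the lock is what prevents two threads that both saw `None` from each building their own configuration.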