vllm.lora.utils
Functions:
- get_adapter_absolute_path – Resolves the given lora_path to an absolute local path.
- get_captured_lora_counts – Returns num_active_loras values for cudagraph capture.
- get_supported_lora_modules – In vLLM, all linear layers support LoRA.
- is_in_target_modules – Check if a module passes the deployment-time target_modules filter.
- is_moe_model – Checks if the model contains MoERunner layers and warns the user.
- is_supported_lora_module – Check if a module is in the model's supported LoRA modules.
- parse_fine_tuned_lora_name – Parse the name of LoRA weights.
- replace_submodule – Replace a submodule in a model with a new module.
get_adapter_absolute_path(lora_path)
Resolves the given lora_path to an absolute local path.
If the lora_path is identified as a Hugging Face model identifier, it will download the model and return the local snapshot path. Otherwise, it treats the lora_path as a local file path and converts it to an absolute path.
Parameters:
- lora_path (str) – The path to the LoRA model, which can be an absolute path, a relative path, or a Hugging Face model identifier.
Returns:
- str – The resolved absolute local path to the LoRA model.
Source code in vllm/lora/utils.py
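The local-path half of this resolution can be illustrated with a minimal sketch. The helper name resolve_local_lora_path is hypothetical and is not part of vllm.lora.utils; it covers only paths that already exist on disk and returns None where the real function would fall back to treating lora_path as a Hugging Face identifier. The download step is deliberately left out.

```python
import os
from typing import Optional


def resolve_local_lora_path(lora_path: str) -> Optional[str]:
    """Hypothetical helper: handle only the local-path cases.

    Returns an absolute path when lora_path points at something on disk,
    or None when a caller would have to treat it as a Hub identifier.
    """
    # Absolute paths are returned unchanged.
    if os.path.isabs(lora_path):
        return lora_path
    # Expand a leading "~" to the user's home directory.
    expanded = os.path.expanduser(lora_path)
    if os.path.exists(expanded):
        return os.path.abspath(expanded)
    # Nothing on disk: not a local path (download step omitted in this sketch).
    return None
```

Returning None rather than raising keeps the "is this local?" decision separate from whatever download logic a caller chooses to apply afterwards.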
get_captured_lora_counts(max_loras, specialize)
Returns num_active_loras values for cudagraph capture.
When specialize=True: powers of 2 up to max_loras, plus max_loras + 1. When specialize=False: just [max_loras + 1].
This is the single source of truth for LoRA capture cases, used by both CudagraphDispatcher and PunicaWrapperGPU.
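The rule described above can be written out as a short sketch. This is a reading of the docstring, not the library's source; the function name captured_lora_counts_sketch is an assumption made for illustration.

```python
def captured_lora_counts_sketch(max_loras: int, specialize: bool) -> list[int]:
    """Illustrative re-statement of the documented rule (not vLLM's code)."""
    if not specialize:
        # Without specialization, only the max_loras + 1 case is captured.
        return [max_loras + 1]
    counts = []
    power = 1
    # Collect powers of 2 that do not exceed max_loras.
    while power <= max_loras:
        counts.append(power)
        power *= 2
    # Always include max_loras + 1 as the final capture case.
    counts.append(max_loras + 1)
    return counts


print(captured_lora_counts_sketch(4, True))   # [1, 2, 4, 5]
print(captured_lora_counts_sketch(4, False))  # [5]
```

Under this reading, max_loras=3 yields [1, 2, 4], because 4 is max_loras + 1 rather than a power of 2 within the limit.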
get_supported_lora_modules(model)
In vLLM, all linear layers support LoRA.
is_in_target_modules(module_name, target_modules, packed_modules_mapping=None)
Check if a module passes the deployment-time target_modules filter.
When target_modules is None (no restriction), all modules pass. Otherwise, the module's suffix must be in the target_modules list.
Parameters:
- module_name (str) – Full dot-separated module name.
- target_modules (list[str] | None) – Optional deployment-time restriction list from LoRAConfig.target_modules.
- packed_modules_mapping (dict[str, list[str]] | None, default: None) – Optional model-defined mapping from packed runtime module names to their adapter-visible submodule names (e.g. {"gate_up_proj": ["gate_proj", "up_proj"]}).
Returns:
- bool – True if the module passes the filter, False otherwise.
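A minimal sketch of such a filter is shown below. The function name in_target_modules_sketch is hypothetical, and the handling of packed modules is an assumption: here a packed module passes when any of its adapter-visible submodules is listed in target_modules. The library's actual rule may differ.

```python
from typing import Optional


def in_target_modules_sketch(
    module_name: str,
    target_modules: Optional[list[str]],
    packed_modules_mapping: Optional[dict[str, list[str]]] = None,
) -> bool:
    """Illustrative filter following the documented behavior (not vLLM's code)."""
    # No restriction configured: every module passes.
    if target_modules is None:
        return True
    # The suffix is the last dot-separated component of the module name.
    suffix = module_name.rsplit(".", 1)[-1]
    if suffix in target_modules:
        return True
    # Assumption: a packed module passes if any of its sub-modules is targeted.
    if packed_modules_mapping and suffix in packed_modules_mapping:
        return any(sub in target_modules for sub in packed_modules_mapping[suffix])
    return False


mapping = {"gate_up_proj": ["gate_proj", "up_proj"]}
print(in_target_modules_sketch("model.layers.0.mlp.gate_up_proj", ["gate_proj"], mapping))  # True
print(in_target_modules_sketch("model.layers.0.self_attn.o_proj", ["q_proj"]))              # False
```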
is_moe_model(model)
Checks if the model contains MoERunner layers and warns the user.
is_supported_lora_module(module_name, supported_lora_modules)
Check if a module is in the model's supported LoRA modules.
Uses regex suffix matching against the model-defined supported modules list (e.g., matching "model.layers.0.self_attn.o_proj" against "o_proj").
Parameters:
- module_name (str) – Full dot-separated module name.
- supported_lora_modules (list[str]) – List of module suffixes supported by the model.
Returns:
- bool – True if the module is supported, False otherwise.
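Regex suffix matching of this kind might look like the sketch below. The function name is_supported_sketch and the exact pattern are assumptions for illustration; the point is that a suffix should match only at a dot boundary, so "o_proj" does not accidentally match a module named "no_proj".

```python
import re


def is_supported_sketch(module_name: str, supported_lora_modules: list[str]) -> bool:
    """Illustrative suffix match at dot boundaries (not vLLM's code)."""
    for suffix in supported_lora_modules:
        # Accept the bare suffix, or any dotted prefix followed by the suffix.
        pattern = rf"(?:.*\.)?{re.escape(suffix)}"
        if re.fullmatch(pattern, module_name):
            return True
    return False


print(is_supported_sketch("model.layers.0.self_attn.o_proj", ["o_proj"]))   # True
print(is_supported_sketch("model.layers.0.self_attn.no_proj", ["o_proj"]))  # False
```

Using re.escape keeps any regex metacharacters in a suffix from being interpreted as pattern syntax.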
parse_fine_tuned_lora_name(name, weights_mapper=None)
Parse the name of LoRA weights.
Parameters:
- name (str) – The name of the fine-tuned LoRA weight, e.g. base_model.model.dense1.weight.
- weights_mapper (WeightsMapper | None, default: None) – Maps weight names, e.g. "model." -> "language_model.model.".
Returns:
- tuple(module_name, is_lora_a) – module_name is the name of the module, e.g. model.dense1; is_lora_a indicates whether the tensor is lora_a (True) or lora_b (False).
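As a rough illustration of this kind of parsing, the sketch below handles PEFT-style names only. Everything in it is an assumption made for the example: the function name parse_lora_name_sketch, the requirement that names end in "lora_A.weight" or "lora_B.weight", and the stripping of an optional "base_model.model." prefix. It omits weights_mapper entirely.

```python
def parse_lora_name_sketch(name: str) -> tuple[str, bool]:
    """Illustrative parser for PEFT-style LoRA weight names (not vLLM's code)."""
    parts = name.split(".")
    # Assumption: names end in "<module>.lora_A.weight" or "<module>.lora_B.weight".
    if len(parts) < 3 or parts[-1] != "weight" or parts[-2] not in ("lora_A", "lora_B"):
        raise ValueError(f"unsupported LoRA weight name: {name}")
    # Assumption: an optional "base_model.model." prefix is dropped.
    start = 2 if parts[:2] == ["base_model", "model"] else 0
    module_name = ".".join(parts[start:-2])
    return module_name, parts[-2] == "lora_A"


print(parse_lora_name_sketch(
    "base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight"
))  # ('model.layers.0.self_attn.q_proj', True)
```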
replace_submodule(model, module_name, new_module)
Replace a submodule in a model with a new module.
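The general technique, walking a dotted name to the parent object and rebinding the final attribute, can be sketched without any framework. The helper replace_attr_by_path is hypothetical, and plain SimpleNamespace objects stand in for the model here; the real function presumably operates on torch.nn.Module instances.

```python
from types import SimpleNamespace


def replace_attr_by_path(root, dotted_name: str, new_obj):
    """Illustrative dotted-path replacement on plain objects (not vLLM's code)."""
    # Split "a.b.c" into the parent path ["a", "b"] and the leaf name "c".
    *parent_path, leaf = dotted_name.split(".")
    parent = root
    for attr in parent_path:
        parent = getattr(parent, attr)
    # Rebind the leaf attribute on its parent to the new object.
    setattr(parent, leaf, new_obj)
    return new_obj


# Stand-in "model" built from plain namespaces.
model = SimpleNamespace(layers=SimpleNamespace(attn=SimpleNamespace(q_proj="old")))
replace_attr_by_path(model, "layers.attn.q_proj", "new")
print(model.layers.attn.q_proj)  # new
```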