vllm.model_executor.layers.quantization.compressed_tensors.utils
Functions:
- check_equal_or_regex_match – Checks whether a layer_name is exactly equal to, or a regex match for, any target in a list.
- find_matched_target – Helper function to look up which "target" in the compressed-tensors config a layer corresponds to.
_find_first_match(value, targets, check_contains=False)
Returns the first element of targets that matches value, either exactly or, for a target that starts with 're:', as a regex. If check_contains is set to True, additionally checks whether the target string is contained within value.
Parameters:
- value (str) – string to compare the list of targets against
- targets (Iterable[str]) – list of targets to match the layer against
- check_contains (bool, default: False) – whether or not to do a substring match
Source code in vllm/model_executor/layers/quantization/compressed_tensors/utils.py
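The first-match lookup described above can be sketched in a few lines. This is a standalone illustration rather than vLLM's source: `matches` and `find_first_match` are hypothetical names, and applying `re:` patterns with `re.match` is an assumption not stated in the text.

```python
import re
from collections.abc import Iterable
from typing import Optional


def matches(value: str, target: str, check_contains: bool = False) -> bool:
    """Hypothetical stand-in for the per-target check described above."""
    if target.startswith("re:"):
        # Assumption: the text after "re:" is a pattern applied with re.match.
        return re.match(target[3:], value) is not None
    if check_contains and target in value:
        return True
    return target == value


def find_first_match(
    value: str, targets: Iterable[str], check_contains: bool = False
) -> Optional[str]:
    """Return the first target that matches value, or None if none does."""
    for target in targets:
        if matches(value, target, check_contains):
            return target
    return None


targets = ["lm_head", r"re:.*self_attn\..*_proj", "Linear"]
first = find_first_match("model.layers.0.self_attn.q_proj", targets)
missing = find_first_match("model.embed_tokens", targets)
contained = find_first_match("ColumnParallelLinear", targets, check_contains=True)
```

Because the targets are scanned in order, the position of a target in the list decides which one is returned when several would match.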
_is_equal_or_regex_match(value, target, check_contains=False)
Checks whether a value is exactly equal to target or, if target starts with 're:', a regex match for it. If check_contains is set to True, additionally checks whether the target string is contained within value.
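The three modes named above (exact, `re:` regex, substring) can give different answers for the same input. The sketch below is an illustration under assumptions, not the library's code: `is_match` is a hypothetical name, and it assumes `re:` patterns are applied with `re.match`, which anchors at the start of the value but not at the end.

```python
import re


def is_match(value: str, target: str, check_contains: bool = False) -> bool:
    """Hypothetical re-implementation of the check described above."""
    if target.startswith("re:"):
        # Assumption: strip the "re:" prefix and apply the rest with re.match.
        return re.match(target[3:], value) is not None
    if check_contains and target in value:
        return True
    return target == value


name = "model.layers.0.mlp.down_proj"
results = {
    "exact": is_match(name, "model.layers.0.mlp.down_proj"),
    "exact_miss": is_match(name, "down_proj"),
    "regex": is_match(name, "re:.*down_proj"),
    # re.match starts at position 0, so a pattern without a leading ".*" misses.
    "regex_unanchored_miss": is_match(name, "re:down_proj"),
    "substring": is_match(name, "down_proj", check_contains=True),
}
```

Under the `re.match` assumption, a regex target written for full layer paths usually needs a leading `.*` to match names nested inside a model.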
_match_fused_layer(layer_name, target_layers, fused_mapping)
Match a fused layer name to its corresponding individual layers in target_layers. On success, returns the target matched by the first component listed in fused_mapping.
Implements an "all" matching strategy, where a fused layer matches if and only if all of its components match.
Parameters:
- layer_name (str) – layer name
- target_layers (Iterable[str]) – list of targets to match the layer against
- fused_mapping (Mapping[str, list[str]]) – map from fused layer names to its components
Examples:
    layer_name = "model.layers.0.self_attn.qkv_proj"
    target_layers = ["model.layers.0.self_attn.q_proj", "model.layers.0.self_attn.k_proj", "model.layers.0.self_attn.v_proj"]
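A minimal sketch of the "all" strategy, using the example values above. The function names are hypothetical, and two details are assumptions rather than documented behavior: the keys of `fused_mapping` are short names such as `"qkv_proj"`, and the fused name is located as a suffix of `layer_name`.

```python
import re
from collections.abc import Iterable, Mapping
from typing import Optional


def is_match(value: str, target: str) -> bool:
    """Exact match, or regex match when target starts with 're:' (assumed re.match)."""
    if target.startswith("re:"):
        return re.match(target[3:], value) is not None
    return target == value


def match_fused_layer(
    layer_name: str,
    target_layers: Iterable[str],
    fused_mapping: Mapping[str, list],
) -> Optional[str]:
    targets = list(target_layers)
    # Assumption: the fused name appears as a suffix of the full layer name.
    fused = next((key for key in fused_mapping if layer_name.endswith(key)), None)
    if fused is None:
        return None
    prefix = layer_name[: -len(fused)]
    matched = []
    for component in fused_mapping[fused]:
        component_name = prefix + component
        hit = next((t for t in targets if is_match(component_name, t)), None)
        if hit is None:
            return None  # "all" strategy: one unmatched component fails the match
        matched.append(hit)
    # A successful match returns the target of the first component.
    return matched[0] if matched else None


fused_mapping = {"qkv_proj": ["q_proj", "k_proj", "v_proj"]}  # assumed shape
layer_name = "model.layers.0.self_attn.qkv_proj"
target_layers = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
]
full = match_fused_layer(layer_name, target_layers, fused_mapping)
partial = match_fused_layer(layer_name, target_layers[:2], fused_mapping)
```

With all three components present the sketch returns the `q_proj` target; dropping `v_proj` from the targets makes the whole fused layer fail to match.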
check_equal_or_regex_match(layer_name, targets)
Checks whether a layer_name is exactly equal to any target in the list, or a regex match for any target that starts with 're:'.
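As a rough illustration, the check reduces to an `any` over the targets. The helper name `check_any` and the use of `re.match` for `re:` targets are assumptions made for this sketch, not details taken from the library.

```python
import re
from collections.abc import Iterable


def check_any(layer_name: str, targets: Iterable[str]) -> bool:
    """Hypothetical sketch: True if any target equals or regex-matches layer_name."""

    def is_match(target: str) -> bool:
        if target.startswith("re:"):
            # Assumption: the pattern after "re:" is applied with re.match.
            return re.match(target[3:], layer_name) is not None
        return target == layer_name

    return any(is_match(target) for target in targets)


targets = ["re:.*gate$", "lm_head"]
exact_hit = check_any("lm_head", targets)
regex_hit = check_any("model.layers.3.mlp.gate", targets)
no_hit = check_any("model.layers.3.mlp.gate_proj", targets)
```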
find_matched_target(layer_name, module, targets, fused_mapping=MappingProxyType({}))
Helper function to look up which "target" in the compressed-tensors config that a layer corresponds to.
Recall that a compressed-tensors config has a concept of config_groups, where each layer can be quantized with a different scheme.
targets in each config_group will be a list of either layer names (or regexes corresponding to layer names) or names of torch Modules.
- First, we try to match the layer_name with a target.
- Second, we try to match the module's name with a target.
- Third, we try to map the layer_name to a list of fused module names. All component module names must match in order for a match to be successful. A successful match returns the first component target.
Parameters:
- layer_name (str | None) – layer name
- module (Module) – torch.nn.Module
- targets (Iterable[str]) – list of targets to match the layer against
- fused_mapping (Mapping[str, list[str]], default: MappingProxyType({})) – map from fused layer names to its components
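The three-step lookup can be sketched as a chain of fallbacks. Everything here is illustrative: the function names are hypothetical, `Linear` is a plain stand-in class so the sketch runs without torch, "the module's name" is assumed to mean the class name, a missing `layer_name` is treated as an empty string, and returning `None` when nothing matches is a choice made for this sketch (the text above does not say what happens in that case).

```python
import re
from collections.abc import Iterable, Mapping
from types import MappingProxyType
from typing import Optional


def is_match(value: str, target: str) -> bool:
    if target.startswith("re:"):
        return re.match(target[3:], value) is not None  # assumed re.match
    return target == value


def first_match(value: str, targets: list) -> Optional[str]:
    return next((t for t in targets if is_match(value, t)), None)


def fused_match(name: str, targets: list, fused_mapping: Mapping) -> Optional[str]:
    # Assumption: fused names are suffixes of the layer name.
    fused = next((k for k in fused_mapping if name.endswith(k)), None)
    if fused is None:
        return None
    prefix = name[: -len(fused)]
    hits = [first_match(prefix + part, targets) for part in fused_mapping[fused]]
    # All components must match; the first component's target is returned.
    return hits[0] if hits and all(h is not None for h in hits) else None


def find_target(
    layer_name: Optional[str],
    module: object,
    targets: Iterable[str],
    fused_mapping: Mapping = MappingProxyType({}),
) -> Optional[str]:
    targets = list(targets)
    name = layer_name or ""
    found = first_match(name, targets)  # step 1: layer name
    if found is None:
        found = first_match(type(module).__name__, targets)  # step 2: class name
    if found is None:
        found = fused_match(name, targets, fused_mapping)  # step 3: fused layers
    return found


class Linear:
    """Stand-in for a torch.nn.Module subclass."""


by_name = find_target("model.layers.0.mlp.up_proj", Linear(), ["re:.*up_proj", "Linear"])
by_class = find_target("model.layers.0.mlp.up_proj", Linear(), ["Linear"])
by_fusion = find_target(
    "model.layers.0.self_attn.qkv_proj",
    Linear(),
    [f"model.layers.0.self_attn.{p}_proj" for p in "qkv"],
    {"qkv_proj": ["q_proj", "k_proj", "v_proj"]},
)
```

The ordering matters: a target that matches the layer name wins over a target that only matches the module class, and fused matching is consulted only when both earlier steps fail.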