vllm.model_executor.model_loader.reload.utils
Functions:
- get_info_size – Calculate the number of bytes used by loaded weights for a given layer.
- get_layer_params_buffers – Get all parameters and buffers of a module as a tuple of dicts.
- get_layer_size – Calculate total number of elements across loadable tensors in a layer.
- get_layer_tensors – Get all parameters and buffers from a module as a dict.
- has_device_tensors – Return True if the loaded weights exist on an accelerator device.
get_info_size(info)
Calculate the number of bytes used by loaded weights for a given layer.
Parameters:
- info (LayerReloadingInfo) – Layerwise info to get the size of.
Returns:
- int – Number of bytes used by loaded weights.
Source code in vllm/model_executor/model_loader/reload/utils.py
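The byte count described above can be illustrated with a minimal sketch. The fields of LayerReloadingInfo are not documented on this page, so the example does not touch it: `TensorStub` and `loaded_bytes` are hypothetical stand-ins that only show the arithmetic (element count times bytes per element, summed over tensors), not the library's implementation.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class TensorStub:
    """Hypothetical stand-in for a tensor: a shape plus bytes per element."""

    shape: tuple[int, ...]
    itemsize: int  # e.g. 2 for float16, 4 for float32

    def nbytes(self) -> int:
        # Element count times bytes per element.
        return prod(self.shape) * self.itemsize


def loaded_bytes(tensors: dict[str, TensorStub]) -> int:
    """Sum the byte footprint of every tensor in the mapping."""
    return sum(t.nbytes() for t in tensors.values())


weights = {
    "weight": TensorStub(shape=(4, 8), itemsize=2),  # 32 elements * 2 bytes
    "bias": TensorStub(shape=(8,), itemsize=4),      # 8 elements * 4 bytes
}
total = loaded_bytes(weights)
print(total)
```

With real tensors, the per-tensor footprint would typically come from the tensor's own element count and element size rather than from a stub.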
get_layer_params_buffers(layer)
Get all parameters and buffers of a module as a tuple of dicts.
Source code in vllm/model_executor/model_loader/reload/utils.py
get_layer_size(layer)
Calculate total number of elements across loadable tensors in a layer.
Excludes SKIP_TENSORS (e.g. _expert_map), which are never moved to the meta device and never loaded via weight_loader during layerwise reload.
Source code in vllm/model_executor/model_loader/reload/utils.py
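As a rough illustration of the exclusion rule, the sketch below counts elements from a name-to-shape mapping while skipping excluded names. `SKIP_NAMES` and `count_loadable_elements` are hypothetical: the page names only `_expert_map` as an example member of SKIP_TENSORS, so the real set may contain more entries, and the real function operates on a module rather than on plain shapes.

```python
from math import prod

# Assumed for illustration: only the example entry named in the text.
# The actual SKIP_TENSORS contents may differ.
SKIP_NAMES = {"_expert_map"}


def count_loadable_elements(shapes: dict[str, tuple[int, ...]]) -> int:
    """Count elements across all tensors whose names are not skipped."""
    return sum(
        prod(shape) for name, shape in shapes.items() if name not in SKIP_NAMES
    )


layer_shapes = {
    "weight": (4, 8),       # 32 elements, counted
    "bias": (8,),           # 8 elements, counted
    "_expert_map": (16,),   # skipped: never loaded during layerwise reload
}
n_elements = count_loadable_elements(layer_shapes)
print(n_elements)
```

Note that this is an element count, not a byte count, which is the distinction between get_layer_size and get_info_size as documented here.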
get_layer_tensors(layer)
Get all parameters and buffers from a module as a dict.
has_device_tensors(bound_args)
Return True if the loaded weights exist on an accelerator device.
Parameters:
- bound_args (BoundArguments) – Args to load weights.
Returns:
- bool – True if weights are on an accelerator device.
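To show what a BoundArguments value looks like, here is a minimal sketch built on the standard library's `inspect.signature(...).bind(...)`. Everything else is an assumption made for illustration: `weight_loader`'s signature, the `TensorStub` and `DeviceStub` stand-ins, and the rule that any device type other than "cpu" or "meta" counts as an accelerator. The real function may inspect specific arguments or apply a different device test.

```python
import inspect
from dataclasses import dataclass


@dataclass
class DeviceStub:
    """Hypothetical stand-in for a device object with a `type` string."""

    type: str


@dataclass
class TensorStub:
    """Hypothetical stand-in for a tensor that knows its device."""

    device: DeviceStub


def weight_loader(param, loaded_weight, shard_id=None):
    """Hypothetical loader signature, used only to build BoundArguments."""


def any_on_accelerator(bound_args: inspect.BoundArguments) -> bool:
    """Return True if any bound argument sits on a non-CPU, non-meta device."""
    for value in bound_args.arguments.values():
        device = getattr(value, "device", None)
        # Assumed rule: anything other than "cpu" or "meta" is an accelerator.
        if device is not None and device.type not in ("cpu", "meta"):
            return True
    return False


sig = inspect.signature(weight_loader)
gpu_call = sig.bind(TensorStub(DeviceStub("meta")), TensorStub(DeviceStub("cuda")))
cpu_call = sig.bind(TensorStub(DeviceStub("meta")), TensorStub(DeviceStub("cpu")))

print(any_on_accelerator(gpu_call))
print(any_on_accelerator(cpu_call))
```

Binding the call first means the check can run on captured arguments without invoking the loader itself.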