
vllm.model_executor.model_loader.reload.utils

Functions:

  • get_info_size

    Calculate the number of bytes used by loaded weights for a given layer

  • get_layer_params_buffers

    Get all parameters and buffers of a module as a tuple of dicts.

  • get_layer_size

    Calculate total number of elements across loadable tensors in a layer.

  • get_layer_tensors

    Get all parameters and buffers from a module as a dict.

  • has_device_tensors

    Return True if the loaded weights exist on an accelerator device

get_info_size(info)

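Calculate the number of bytes used by the loaded weights of a given layer. Only tensors that reside on an accelerator device are counted; tensors on the meta or cpu device contribute nothing.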

Parameters:

  • info

    (LayerReloadingInfo) –

    layerwise info to get size of

Returns:

  • int

    number of bytes used by loaded weights

Source code in vllm/model_executor/model_loader/reload/utils.py
def get_info_size(info: LayerReloadingInfo) -> int:
    """
    Calculate the number of bytes used by loaded weights for a given layer

    Args:
        info: layerwise info to get size of

    Returns:
        number of bytes used by loaded weights
    """
    return sum(
        value.nbytes
        for _, args in info.loaded_weights
        for value in args.arguments.values()
        if isinstance(value, torch.Tensor) and value.device.type not in ("meta", "cpu")
    )
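A minimal sketch of how the filter behaves, under some assumptions: the function body is copied from the listing above so the example runs without vLLM, `SimpleNamespace` stands in for `LayerReloadingInfo` (only its `loaded_weights` attribute is used), and `weight_loader` is a hypothetical signature used solely to build `inspect.BoundArguments`. The first element of each `loaded_weights` pair is ignored by the function, so a placeholder string is used here.

```python
import inspect
from types import SimpleNamespace

import torch


def get_info_size(info) -> int:
    # Copied from the listing above so the sketch is standalone.
    return sum(
        value.nbytes
        for _, args in info.loaded_weights
        for value in args.arguments.values()
        if isinstance(value, torch.Tensor) and value.device.type not in ("meta", "cpu")
    )


def weight_loader(param, loaded_weight, shard_id=None):
    """Hypothetical loader signature, used only to create BoundArguments."""


def make_info(device: str):
    # SimpleNamespace is a stand-in for LayerReloadingInfo (assumption).
    param = torch.empty(4, 2, device=device)                         # float32
    weight = torch.empty(4, 2, dtype=torch.float16, device=device)   # float16
    bound = inspect.signature(weight_loader).bind(param, weight, shard_id=0)
    return SimpleNamespace(loaded_weights=[("placeholder", bound)])


cpu_size = get_info_size(make_info("cpu"))    # cpu tensors are filtered out
meta_size = get_info_size(make_info("meta"))  # meta tensors are filtered out
print(cpu_size, meta_size)

if torch.cuda.is_available():
    # 8 float32 elements (32 bytes) + 8 float16 elements (16 bytes) = 48
    print(get_info_size(make_info("cuda")))
```

The non-tensor argument `shard_id=0` is skipped by the `isinstance` check, so only tensor arguments ever contribute to the total.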

get_layer_params_buffers(layer)

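Get the parameters and buffers registered directly on a module as a (parameters, buffers) tuple of dicts. Entries registered as None are omitted, and child modules are not traversed.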

Source code in vllm/model_executor/model_loader/reload/utils.py
def get_layer_params_buffers(layer: torch.nn.Module) -> LayerTensors:
    """Get all parameters and buffers of a module as a tuple of dicts."""
    return (
        {name: param for name, param in layer._parameters.items() if param is not None},
        {name: buffer for name, buffer in layer._buffers.items() if buffer is not None},
    )
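The following sketch illustrates two consequences of reading `_parameters` and `_buffers` directly: entries registered as `None` are dropped, and tensors owned by child modules are not included. The function is copied from the listing above, and the `LayerTensors` alias is an assumption about its shape based on the return value.

```python
import torch
from torch import nn

# Assumed shape of the LayerTensors alias: (parameters, buffers).
LayerTensors = tuple[dict[str, torch.Tensor], dict[str, torch.Tensor]]


def get_layer_params_buffers(layer: nn.Module) -> LayerTensors:
    # Copied from the listing above so the sketch is standalone.
    return (
        {name: param for name, param in layer._parameters.items() if param is not None},
        {name: buffer for name, buffer in layer._buffers.items() if buffer is not None},
    )


# bias=False registers "bias" as None, so it is filtered out.
linear_params, linear_buffers = get_layer_params_buffers(nn.Linear(4, 2, bias=False))
linear_param_names = list(linear_params)

# BatchNorm1d owns both parameters and buffers.
bn_params, bn_buffers = get_layer_params_buffers(nn.BatchNorm1d(3))

# A container owns no tensors itself; its children are not visited.
container_params, container_buffers = get_layer_params_buffers(
    nn.Sequential(nn.Linear(4, 2))
)

print(linear_param_names, list(bn_params), list(bn_buffers))
print(container_params, container_buffers)
```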

get_layer_size(layer)

Calculate total number of elements across loadable tensors in a layer.

Excludes SKIP_TENSORS (e.g. _expert_map), which are never moved to the meta device and never loaded via weight_loader during layerwise reload. The result is an element count, not a byte count.

Source code in vllm/model_executor/model_loader/reload/utils.py
def get_layer_size(layer: torch.nn.Module) -> int:
    """Calculate total number of elements across loadable tensors in a layer.

    Excludes SKIP_TENSORS (e.g. _expert_map) which are never moved to meta
    device and never loaded via weight_loader during layerwise reload.
    """
    from .meta import SKIP_TENSORS

    return sum(
        tensor.numel()
        for name, tensor in get_layer_tensors(layer).items()
        if name not in SKIP_TENSORS
    )
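As a rough illustration of the exclusion rule, the sketch below re-implements the function against a local `SKIP_TENSORS`. The set's contents here are an assumption based only on the `_expert_map` example in the docstring; the real value lives in the `.meta` module. `ToyExperts` is a hypothetical layer invented for the example.

```python
import torch
from torch import nn

# Assumption: a collection of tensor names. Only "_expert_map" is
# mentioned in the docstring; the real definition is in `.meta`.
SKIP_TENSORS = {"_expert_map"}


class ToyExperts(nn.Module):
    """Hypothetical layer with one parameter and two buffers."""

    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.zeros(4, 2))         # 8 elements
        self.register_buffer("scale", torch.ones(4))          # 4 elements
        self.register_buffer("_expert_map", torch.arange(4))  # skipped


def get_layer_tensors(layer: nn.Module) -> dict[str, torch.Tensor]:
    params = {n: p for n, p in layer._parameters.items() if p is not None}
    buffers = {n: b for n, b in layer._buffers.items() if b is not None}
    return params | buffers


def get_layer_size(layer: nn.Module) -> int:
    # Same logic as the listing above, using the local SKIP_TENSORS.
    return sum(
        tensor.numel()
        for name, tensor in get_layer_tensors(layer).items()
        if name not in SKIP_TENSORS
    )


size = get_layer_size(ToyExperts())
print(size)  # 8 (weight) + 4 (scale) = 12; _expert_map is not counted
```

Because the sum uses `numel()`, the dtype of each tensor does not affect the result.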

get_layer_tensors(layer)

Get all parameters and buffers from a module as a dict.

Source code in vllm/model_executor/model_loader/reload/utils.py
def get_layer_tensors(layer: torch.nn.Module) -> dict[str, torch.Tensor]:
    """Get all parameters and buffers from a module as a dict."""
    params, buffers = get_layer_params_buffers(layer)
    return params | buffers
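A short sketch of the merged view, with both helpers copied from the listings above. It suggests that parameters come first in the resulting dict, followed by buffers, and that the values are the module's own tensor objects rather than copies. Note that the `|` dict union requires Python 3.9 or later.

```python
import torch
from torch import nn


def get_layer_params_buffers(layer: nn.Module):
    return (
        {name: param for name, param in layer._parameters.items() if param is not None},
        {name: buffer for name, buffer in layer._buffers.items() if buffer is not None},
    )


def get_layer_tensors(layer: nn.Module) -> dict[str, torch.Tensor]:
    params, buffers = get_layer_params_buffers(layer)
    return params | buffers  # dict union: parameters first, then buffers


bn = nn.BatchNorm1d(3)
tensors = get_layer_tensors(bn)

names = list(tensors)
same_object = tensors["weight"] is bn.weight  # no copy is made

print(names)
print(same_object)
```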

has_device_tensors(bound_args)

Return True if any tensor among the bound weight-loading arguments resides on an accelerator device, i.e. a device other than meta or cpu.

Parameters:

  • bound_args

    (BoundArguments) –

    args to load weights

Returns:

  • bool

    True if weights are on accelerator device

Source code in vllm/model_executor/model_loader/reload/utils.py
def has_device_tensors(bound_args: BoundArguments) -> bool:
    """
    Return True if the loaded weights exist on an accelerator device

    Args:
        bound_args: args to load weights

    Returns:
        True if weights are on accelerator device
    """
    return any(
        isinstance(value, torch.Tensor) and value.device.type not in ("meta", "cpu")
        for value in bound_args.arguments.values()
    )
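A minimal sketch of calling the check, assuming `BoundArguments` is `inspect.BoundArguments` as produced by `inspect.signature(...).bind(...)`. The function body is copied from the listing above, and `weight_loader` is a hypothetical signature used only to bind arguments.

```python
import inspect

import torch


def has_device_tensors(bound_args: inspect.BoundArguments) -> bool:
    # Copied from the listing above so the sketch is standalone.
    return any(
        isinstance(value, torch.Tensor) and value.device.type not in ("meta", "cpu")
        for value in bound_args.arguments.values()
    )


def weight_loader(param, loaded_weight, shard_id=None):
    """Hypothetical loader signature, used only to create BoundArguments."""


sig = inspect.signature(weight_loader)

cpu_args = sig.bind(torch.empty(2), torch.empty(2))
meta_args = sig.bind(
    torch.empty(2, device="meta"), torch.empty(2, device="meta"), shard_id="q"
)

on_cpu = has_device_tensors(cpu_args)    # all tensors on cpu
on_meta = has_device_tensors(meta_args)  # all tensors on meta; the str is ignored
print(on_cpu, on_meta)

if torch.cuda.is_available():
    # One accelerator tensor is enough for the check to pass.
    mixed_args = sig.bind(torch.empty(2, device="meta"), torch.empty(2, device="cuda"))
    print(has_device_tensors(mixed_args))
```

Since the check uses `any`, a single accelerator-resident tensor among otherwise meta or cpu arguments makes the function return True.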