vllm.model_executor.model_loader.reload.meta
Functions:
- get_numel_loaded – Determine how many elements would be loaded by a weight loader call.
- materialize_layer – Materialize all meta tensors in a layer to actual tensors.
- materialize_meta_tensor – Materialize a meta tensor into an actual tensor on the current device.
- restore_layer_on_meta – Restore a layer to model format with tensors on the meta device.
- to_meta_tensor – Convert a tensor to a meta tensor while preserving class and attributes.
CopyCounter
Bases: TorchDispatchMode
Tracks the total number of elements modified with copy_.
Useful for tracking weight loading when the underlying weights may be arbitrarily transformed (for example, with narrow) before copy_ is called.
Note: Assumes that copy_ is called without keyword arguments.
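A minimal sketch of how a counter like this can work, assuming it intercepts `aten.copy_` through `__torch_dispatch__`. The class name `CopyCounterSketch` and the `copied_numel` attribute are illustrative assumptions, not the library's code.

```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode


class CopyCounterSketch(TorchDispatchMode):
    """Hypothetical stand-in: counts elements written by in-place copy_ calls."""

    def __init__(self):
        super().__init__()
        self.copied_numel = 0

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func is torch.ops.aten.copy_.default:
            # args[0] is the destination tensor when copy_ is called positionally
            self.copied_numel += args[0].numel()
        return func(*args, **kwargs)


param = torch.empty(4, 8, device="meta")  # destination weight on the meta device
shard = torch.ones(2, 8)                  # data a loader would copy in

with CopyCounterSketch() as counter:
    # The loader narrows the parameter first, so only the view's elements count.
    param.narrow(0, 0, 2).copy_(shard)

print(counter.copied_numel)
```

Because the count is taken from the destination of each `copy_`, the narrowed view contributes 16 elements here rather than the 32 elements of the full parameter.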
Source code in vllm/model_executor/model_loader/reload/meta.py
get_numel_loaded(weight_loader, args)
Determine how many elements would be loaded by a weight loader call.
Parameters:
- weight_loader (Callable) – used to load weights
- args (BoundArguments) – bound arguments to weight loader
Returns:
- int – number of elements loaded by the weight loader
- object – the return value of the weight loader
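The `args` parameter is an `inspect.BoundArguments` object. As a sketch of what that means, the snippet below binds a call to a loader's signature and replays it later. `toy_weight_loader` and its parameters are hypothetical and stand in for a real weight loader. Only the standard-library `inspect` calls are real API.

```python
import inspect


def toy_weight_loader(param, loaded_weight, shard_id=None):
    """Hypothetical loader; a real one would copy loaded_weight into param."""
    return f"loaded {len(loaded_weight)} values into {param} (shard_id={shard_id})"


# Bind a concrete call to the loader's signature without invoking it yet.
signature = inspect.signature(toy_weight_loader)
bound = signature.bind("qkv_proj.weight", [0.1, 0.2, 0.3], shard_id="q")

# BoundArguments keeps the call replayable: arguments can be inspected by
# name, and the positional and keyword parts can be unpacked later.
result = toy_weight_loader(*bound.args, **bound.kwargs)
print(bound.arguments["shard_id"], result)
```

Binding first and calling later lets a caller inspect individual arguments by name before deciding whether, or under what context, to run the loader.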
materialize_layer(layer, info)
Materialize all meta tensors in a layer to actual tensors.
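As a rough illustration of the idea, the sketch below replaces every meta parameter and buffer directly owned by a module with uninitialized real storage. `materialize_layer_sketch` is a hypothetical helper. It ignores the `info` argument, whose type is not described here, and it takes an explicit `device` so that it runs on CPU.

```python
import torch


def materialize_layer_sketch(layer: torch.nn.Module, device: str = "cpu") -> None:
    """Hypothetical helper: swap meta parameters/buffers for real, uninitialized ones."""
    for name, param in list(layer.named_parameters(recurse=False)):
        if param.is_meta:
            real = torch.empty_like(param, device=device)
            setattr(layer, name, torch.nn.Parameter(real, requires_grad=param.requires_grad))
    for name, buf in list(layer.named_buffers(recurse=False)):
        if buf.is_meta:
            setattr(layer, name, torch.empty_like(buf, device=device))


layer = torch.nn.Linear(4, 2, device="meta")  # starts with no real storage
materialize_layer_sketch(layer)
print([p.device.type for p in layer.parameters()])
```

The materialized tensors hold arbitrary values until weights are loaded into them.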
materialize_meta_tensor(meta_tensor)
Materialize a meta tensor into an actual tensor on the current device. Should be called within the torch device context for the given rank.
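The sketch below shows one way the device-context requirement can play out, assuming the allocation passes no explicit device and therefore follows the active `torch.device` context. `materialize_sketch` is a hypothetical helper, not the library function, and it does not attempt to preserve tensor subclasses or attributes.

```python
import torch


def materialize_sketch(meta_tensor: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: allocate real storage matching a meta tensor's layout."""
    assert meta_tensor.is_meta
    # No device argument: the allocation follows the active torch.device context.
    return torch.empty_strided(
        tuple(meta_tensor.size()),
        tuple(meta_tensor.stride()),
        dtype=meta_tensor.dtype,
    )


meta = torch.empty(4, 8, dtype=torch.float16, device="meta")

# "cpu" keeps the sketch runnable anywhere; a worker would use its own device.
with torch.device("cpu"):
    real = materialize_sketch(meta)

print(real.device, real.shape, real.dtype)
```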
restore_layer_on_meta(layer, info)
Restore a layer to model format with tensors on the meta device.
to_meta_tensor(tensor)
Convert a tensor to a meta tensor while preserving class and attributes.
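To illustrate what preserving class and attributes can look like, here is a minimal sketch restricted to `torch.nn.Parameter`. `to_meta_sketch` and the `output_dim` attribute are hypothetical. The real function may handle other tensor subclasses and may do so differently.

```python
import torch


def to_meta_sketch(param: torch.nn.Parameter) -> torch.nn.Parameter:
    """Hypothetical helper: handles only nn.Parameter, not arbitrary subclasses."""
    meta = torch.nn.Parameter(param.data.to("meta"), requires_grad=param.requires_grad)
    # Carry over Python-level attributes attached to the original parameter.
    meta.__dict__.update(param.__dict__)
    return meta


weight = torch.nn.Parameter(torch.ones(2, 3), requires_grad=False)
weight.output_dim = 0  # example of a custom attribute set on a parameter

meta_weight = to_meta_sketch(weight)
print(type(meta_weight).__name__, meta_weight.device, meta_weight.output_dim)
```

The meta copy keeps shape, dtype, class, and attached attributes but holds no data, so it costs no device memory.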