vllm.model_executor.layers.quantization.quark.schemes.quark_scheme
Classes:
- QuarkScheme – Abstract class used to describe the weight creation and forward pass of different quantization schemes supported by Quark.
QuarkScheme
Bases: ABC
Abstract class used to describe the weight creation and forward pass of different quantization schemes supported by Quark.
Methods:
- apply_weights – Run the forward pass for the particular scheme. This is where scheme-specific dequant/quant steps/kernels should be applied.
- create_weights – Weight creation for the particular scheme.
- get_min_capability – Get minimum device capability.
- process_weights_after_loading – Called after weight loading is complete for any cleanup that needs to occur.
Source code in vllm/model_executor/layers/quantization/quark/schemes/quark_scheme.py
apply_weights(layer, x, bias) abstractmethod
Run the forward pass for the particular scheme. This is where scheme-specific dequant/quant steps/kernels should be applied.
Parameters:
- layer (Module) – torch.nn.Module with the registered weights and other parameters relevant to the particular scheme.
- x (Tensor) – Input to the layer.
- bias (Tensor | None) – Bias parameter.
create_weights(*args, **kwargs) abstractmethod
Weight creation for the particular scheme. Inputs to this function
get_min_capability() abstractmethod classmethod
Get minimum device capability.
process_weights_after_loading(layer) abstractmethod
Called after weight loading is complete for any cleanup that needs to occur.
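The four abstract methods suggest a lifecycle: create placeholder weights, load them, run any post-load cleanup, then run the forward pass. The sketch below illustrates that order with a stand-in base class, not vLLM's actual QuarkScheme. SchemeSketch, ToyInt8Scheme, run_demo, the create_weights arguments, the use of plain Python lists instead of tensors, and the integer capability encoding are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace


class SchemeSketch(ABC):
    """Stand-in that mirrors the four documented methods (not vLLM's class)."""

    @classmethod
    @abstractmethod
    def get_min_capability(cls) -> int: ...

    @abstractmethod
    def create_weights(self, *args, **kwargs): ...

    @abstractmethod
    def apply_weights(self, layer, x, bias): ...

    @abstractmethod
    def process_weights_after_loading(self, layer): ...


class ToyInt8Scheme(SchemeSketch):
    """Hypothetical scheme: integer weights with one scale per output row."""

    @classmethod
    def get_min_capability(cls) -> int:
        # Assumed encoding: major * 10 + minor (75 would mean 7.5).
        return 75

    def create_weights(self, layer, in_features, out_features):
        # Register placeholders that a weight loader would later overwrite.
        layer.weight = [[0] * in_features for _ in range(out_features)]
        layer.weight_scale = [1.0] * out_features

    def process_weights_after_loading(self, layer):
        # Example cleanup: normalise the loaded scales to an immutable tuple.
        layer.weight_scale = tuple(float(s) for s in layer.weight_scale)

    def apply_weights(self, layer, x, bias):
        # Dequantize on the fly: integer dot product, then multiply by scale.
        out = [
            sum(q * xi for q, xi in zip(row, x)) * scale
            for row, scale in zip(layer.weight, layer.weight_scale)
        ]
        if bias is not None:
            out = [o + b for o, b in zip(out, bias)]
        return out


def run_demo():
    scheme = ToyInt8Scheme()
    layer = SimpleNamespace()  # stands in for a torch.nn.Module
    scheme.create_weights(layer, in_features=2, out_features=2)
    # Simulate checkpoint loading by overwriting the placeholders.
    layer.weight = [[2, 4], [-6, 8]]
    layer.weight_scale = [0.5, 0.25]
    scheme.process_weights_after_loading(layer)
    return scheme.apply_weights(layer, [1.0, 2.0], [0.0, 1.0])
```

Because every method on the stand-in base is abstract, instantiating SchemeSketch directly raises TypeError; only a subclass that overrides all four methods can be constructed.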