vllm.kernels.aiter_ops
Attributes:
- AITER_SUPPORTED – Most kernels in this file are supported if AITER is installed.
- aiter_lib – This library holds torch custom ops for wrapped AITER ops.
- direct_register_aiter_op – Syntactic sugar for registering AITER custom ops.
- rms_add_no_var_16bit_only – AITER fused_add_rms_norm only supports 16-bit activations and no variance_size override.
- rms_no_var_16bit_only – AITER rms_norm only supports float16 and bfloat16 activations, no variance_size override, and requires weight dtype to match x dtype.
AITER_SUPPORTED = is_aiter_found() module-attribute
Most kernels in this file are supported if AITER is installed.
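As an illustration of this kind of availability flag, the sketch below gates a code path on whether a package can be imported. The body of is_aiter_found is not shown on this page, so the find_spec check is an assumption, and is_package_found, HAS_AITER, and pick_backend are hypothetical names used only for this example.

```python
import importlib.util


def is_package_found(name: str) -> bool:
    """Hypothetical helper: True if a top-level package is importable."""
    # Assumption: a find_spec lookup is enough to detect the package.
    # The real is_aiter_found() may perform additional checks.
    return importlib.util.find_spec(name) is not None


# Evaluated once at import time, like a module-level attribute.
HAS_AITER = is_package_found("aiter")


def pick_backend() -> str:
    """Choose an implementation name based on the availability flag."""
    return "aiter" if HAS_AITER else "fallback"


print(pick_backend())
```

Computing the flag once at module import keeps later dispatch decisions cheap, since no repeated import probing is needed.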
aiter_lib = Library('vllm_aiter', 'FRAGMENT') module-attribute
This library holds torch custom ops for wrapped AITER ops. Many AITER ops need to remain opaque to torch.compile even after lowering, so they are wrapped into torch custom ops inside the IR op implementations.
direct_register_aiter_op = functools.partial(direct_register_custom_op, target_lib=aiter_lib) module-attribute
Syntactic sugar for registering AITER custom ops.
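The effect of binding target_lib with functools.partial can be shown with stand-ins that need no torch install. In this sketch a dict plays the role of a library and register_op plays the role of direct_register_custom_op; both are hypothetical simplifications, and the parameter names op_name and op_func are assumptions.

```python
import functools

# Hypothetical stand-ins: each dict plays the role of a torch Library.
demo_lib = {}
other_lib = {}


def register_op(op_name, op_func, target_lib):
    """Stand-in for a registration helper: store op_func under op_name."""
    target_lib[op_name] = op_func


# Bind target_lib once so call sites do not have to repeat it.
register_demo_op = functools.partial(register_op, target_lib=demo_lib)


def double(x):
    return 2 * x


# Short form: the library argument is already bound.
register_demo_op("double", double)

# Long form for comparison, targeting a different library explicitly.
register_op("double", double, target_lib=other_lib)

print(demo_lib["double"](21))
```

Because target_lib is bound as a keyword argument, a caller can still override it by passing target_lib explicitly to the partial.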
rms_add_no_var_16bit_only = lambda x, x_residual, weight, epsilon, variance_size=None: variance_size is None and x.dtype in (torch.float16, torch.bfloat16) and (weight is None or weight.dtype == x.dtype) module-attribute
AITER fused_add_rms_norm only supports 16-bit activations (float16 or bfloat16) with no variance_size override. If a weight is given, its dtype must match the dtype of x.
rms_no_var_16bit_only = lambda x, weight, epsilon, variance_size=None: variance_size is None and x.dtype in (torch.float16, torch.bfloat16) and (weight is None or weight.dtype == x.dtype) module-attribute
AITER rms_norm only supports float16 and bfloat16 activations with no variance_size override. If a weight is given, its dtype must match the dtype of x.
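To make the two support predicates concrete, the sketch below copies their lambda bodies from the definitions above and evaluates them on small CPU tensors. The tensor shapes, the epsilon value, and the variable names are arbitrary choices for illustration. How vLLM consumes these predicates is not shown on this page.

```python
import torch

# Lambda bodies copied from the attribute definitions above.
rms_no_var_16bit_only = lambda x, weight, epsilon, variance_size=None: (
    variance_size is None
    and x.dtype in (torch.float16, torch.bfloat16)
    and (weight is None or weight.dtype == x.dtype)
)
rms_add_no_var_16bit_only = lambda x, x_residual, weight, epsilon, variance_size=None: (
    variance_size is None
    and x.dtype in (torch.float16, torch.bfloat16)
    and (weight is None or weight.dtype == x.dtype)
)

x_bf16 = torch.zeros(2, 8, dtype=torch.bfloat16)
x_fp32 = torch.zeros(2, 8, dtype=torch.float32)
w_bf16 = torch.ones(8, dtype=torch.bfloat16)
w_fp32 = torch.ones(8, dtype=torch.float32)

# 16-bit activations with a matching weight dtype are supported.
matching = rms_no_var_16bit_only(x_bf16, w_bf16, 1e-6)
# A missing weight is also accepted.
no_weight = rms_no_var_16bit_only(x_bf16, None, 1e-6)
# A weight dtype that differs from x is rejected.
mismatch = rms_no_var_16bit_only(x_bf16, w_fp32, 1e-6)
# 32-bit activations are rejected.
wide_act = rms_no_var_16bit_only(x_fp32, w_fp32, 1e-6)
# Any variance_size override is rejected.
override = rms_no_var_16bit_only(x_bf16, w_bf16, 1e-6, variance_size=4)
# The fused-add variant applies the same checks and ignores x_residual.
fused = rms_add_no_var_16bit_only(x_bf16, x_bf16.clone(), w_bf16, 1e-6)

print(matching, no_weight, mismatch, wide_act, override, fused)
```

Note that neither predicate inspects epsilon, and the fused-add variant does not check the dtype of x_residual.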