vllm.model_executor.kernels.mhc.triton
Functions:
- rmsnorm_nw – Weight-free RMSNorm over the last dimension.
_hc_head_triton(hs_flat, fn, hc_scale, hc_base, out, hidden_size, rms_eps, hc_eps, hc_mult)
Fill the pre-allocated tensor out, of shape (T, H), in place with the hc_head result.
Source code in vllm/model_executor/kernels/mhc/triton.py
_rmsnorm_nw_kernel(x_ptr, out_ptr, stride_row, D, eps, RBLOCK)
Weight-free RMSNorm Triton kernel: out = x * rsqrt(mean(x², -1) + eps).
Source code in vllm/model_executor/kernels/mhc/triton.py
rmsnorm_nw(x, eps)
Weight-free RMSNorm over the last dimension.
Treats x as [num_rows, D] where num_rows = product(shape[:-1]). Returns a contiguous tensor with the same shape and dtype as x.
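The formula out = x * rsqrt(mean(x², -1) + eps) can be illustrated with a dependency-free sketch. This is not the Triton kernel or the vLLM implementation: the function name rmsnorm_nw_reference is hypothetical, and it assumes the input has already been flattened to a list of rows, each of length D.

```python
import math


def rmsnorm_nw_reference(rows, eps):
    """Hypothetical pure-Python reference for weight-free RMSNorm.

    `rows` stands in for x viewed as [num_rows, D]; no learned weight is applied.
    """
    out = []
    for row in rows:
        # Mean of squares over the last dimension (length D).
        mean_sq = sum(v * v for v in row) / len(row)
        # rsqrt(mean(x^2) + eps), computed once per row.
        inv_rms = 1.0 / math.sqrt(mean_sq + eps)
        out.append([v * inv_rms for v in row])
    return out


# Two rows with D = 2; eps is set to 0.0 only to keep the arithmetic easy to follow.
normalized = rmsnorm_nw_reference([[3.0, 4.0], [1.0, 1.0]], eps=0.0)
```

With eps close to zero, each output row has a mean square of approximately 1, which makes a convenient sanity check when comparing against a kernel's output.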