vllm.model_executor.kernels.linear.mxfp8
Modules:
- Mxfp8LinearKernel
- emulation
- flashinfer
- marlin
- xpu
Classes:
- Mxfp8LinearLayerConfig – Configuration for an MXFP8 linear layer.
Mxfp8LinearLayerConfig dataclass
Configuration for an MXFP8 linear layer.
All MXFP8 layers share the same structure: FP8-E4M3 weights with per-block scales stored as uint8 (E8M0), one scale per block of 32 elements.
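To make the layout concrete, the following is a minimal pure-Python sketch of how one row of such weights could be dequantized. It is not vLLM's implementation: the function names (`decode_e4m3`, `decode_e8m0`, `dequantize_row`) are hypothetical, and the sketch assumes the usual OCP conventions (E4M3 with exponent bias 7 and no infinities, E8M0 as a power-of-two exponent with bias 127) and that each scale covers 32 contiguous elements of the row.

```python
import math

BLOCK_SIZE = 32  # elements covered by one scale, as stated in the docstring


def decode_e4m3(byte: int) -> float:
    """Decode one FP8-E4M3 byte (assumed: 1 sign, 4 exponent, 3 mantissa bits, bias 7)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan  # the only NaN encoding; this variant has no infinities
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6  # subnormal values
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def decode_e8m0(byte: int) -> float:
    """Decode one E8M0 scale byte (assumed: unsigned exponent, bias 127)."""
    if byte == 0xFF:
        return math.nan  # reserved NaN encoding
    return 2.0 ** (byte - 127)


def dequantize_row(weights: bytes, scales: bytes) -> list[float]:
    """Multiply each E4M3 weight by the scale of the 32-element block it belongs to."""
    if len(weights) != len(scales) * BLOCK_SIZE:
        raise ValueError("expected exactly one scale per block of 32 weights")
    return [
        decode_e4m3(w) * decode_e8m0(scales[i // BLOCK_SIZE])
        for i, w in enumerate(weights)
    ]


# Two blocks of 32 weights, every weight encoding 1.0 (0x38);
# the first block has scale 2**0, the second 2**1.
row = dequantize_row(bytes([0x38] * 64), bytes([127, 128]))
```

Here `row[:32]` holds the values of the first block and `row[32:]` those of the second, so the effect of the per-block scale is visible directly. Real kernels operate on tensors and typically fuse this step into the matrix multiply; the loop above only illustrates the storage format.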