vllm.model_executor.layers.fused_moe.experts
Modules:
- aiter_mxfp4_w4a8_moe
- batched_deep_gemm_moe
- cpu_int4_moe – CPU INT4 W4A8 dynamic quantized fused MoE experts.
- cpu_moe – CPU FP8 W8A16 and MXFP4 W4A16 fused MoE experts.
- cutlass_moe – CUTLASS based Fused MoE kernels.
- deep_gemm_moe
- fallback
- flashinfer_b12x_moe
- flashinfer_cutedsl_batched_moe
- flashinfer_cutedsl_moe
- flashinfer_cutlass_moe
- fused_batched_moe – Fused batched MoE kernel.
- fused_humming_moe – Fused MoE utilities for Humming.
- gpt_oss_triton_kernels_moe
- lora_context
- lora_experts_mixin
- marlin_moe – Fused MoE utilities for GPTQ.
- nvfp4_emulation_moe – NVFP4 quantization emulation for MoE.
- ocp_mx_emulation_moe – OCP MX quantization emulation for MoE.
- rocm_aiter_moe
- triton_cutlass_moe
- triton_deep_gemm_moe
- triton_moe – Triton-based MoE expert implementations.
- trtllm_bf16_moe
- trtllm_fp8_moe
- trtllm_mxfp4_moe
- trtllm_mxint4_moe
- trtllm_nvfp4_moe
- xpu_moe