vllm.models.deepseek_v4.nvidia

Modules:

  • flashinfer_sparse

    DeepSeek V4 FlashInfer TRTLLM-gen sparse MLA backend.

  • flashmla
  • model
  • mtp

    MTP (multi-token prediction) draft model for DeepSeek V4 (internal codename: DeepseekV4).

  • ops

    NVIDIA-only (cutedsl/cutlass) kernels for DeepSeek V4.
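
Because the kernels under ops are described as NVIDIA-only, it can be useful to check which of the listed submodules are locatable in a given environment before relying on them. The following is a minimal sketch, not part of vLLM: it assumes the package path shown at the top of this page, and `probe_submodules` is a hypothetical helper name introduced for illustration. It uses only the standard library and does not import the submodules themselves.

```python
import importlib.util

# Package path and submodule names are copied from the listing above.
PACKAGE = "vllm.models.deepseek_v4.nvidia"
SUBMODULES = ("flashinfer_sparse", "flashmla", "model", "mtp", "ops")


def probe_submodules(package=PACKAGE, names=SUBMODULES):
    """Hypothetical helper: map each submodule name to True if it can be located."""
    found = {}
    for name in names:
        try:
            # find_spec imports the parent package but not the submodule itself.
            spec = importlib.util.find_spec(f"{package}.{name}")
        except Exception:
            # The parent package is missing or failed to import
            # (for example, vLLM is not installed in this environment).
            spec = None
        found[name] = spec is not None
    return found


if __name__ == "__main__":
    for name, ok in probe_submodules().items():
        print(f"{name}: {'available' if ok else 'not found'}")
```

A submodule reported as available has only been located on disk; importing it may still fail if its GPU dependencies are absent.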