vllm.models.deepseek_v4.nvidia

Modules:

- flashinfer_sparse – DeepSeek V4 FlashInfer TRTLLM-gen sparse MLA backend.
- flashmla
- model
- mtp – MTP draft model for DeepSeek V4 (internal codename: DeepseekV4).
- ops – NVIDIA-only (cutedsl/cutlass) kernels for DeepSeek V4.