vllm.v1.spec_decode
Modules:
- custom_class_proposer
- dflash
- extract_hidden_states
- gemma4 – Gemma4 MTP (Multi-Token Prediction) proposer for speculative decoding.
- llm_base_proposer
- medusa
- metrics
- ngram_proposer
- ngram_proposer_gpu – GPU-accelerated N-gram proposer using fully async PyTorch tensor operations.
- step3p5
- suffix_decoding
- utils
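
For readers unfamiliar with the n-gram approach named in the module list, the following is a minimal pure-Python sketch of the general idea: find an earlier occurrence of the last `n` tokens in the context and propose the tokens that followed it as drafts. The function `propose_ngram` and its parameters `n` and `k` are hypothetical names chosen for illustration. This is not the code in `ngram_proposer` or `ngram_proposer_gpu`, whose actual interfaces are not shown on this page.

```python
from typing import List, Sequence


def propose_ngram(tokens: Sequence[int], n: int = 2, k: int = 3) -> List[int]:
    """Hypothetical helper: propose up to k draft tokens by n-gram lookup.

    Illustrative sketch only; it does not mirror vLLM's implementation.
    """
    if n <= 0 or len(tokens) <= n:
        return []
    suffix = list(tokens[-n:])
    # Scan backwards so the most recent earlier match wins.
    for start in range(len(tokens) - n - 1, -1, -1):
        if list(tokens[start:start + n]) == suffix:
            follow = start + n
            # Propose the tokens that followed the earlier match.
            return list(tokens[follow:follow + k])
    # No earlier occurrence of the suffix: propose nothing.
    return []


drafts = propose_ngram([1, 2, 3, 4, 1, 2], n=2, k=3)
```

In speculative decoding, draft tokens like these would then be checked by the target model, which keeps only the prefix it agrees with.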