`vllm.benchmarks`

Modules:

datasets
latency – Benchmark the latency of processing a single batch of requests.
lib – Benchmark library utilities.
mm_processor – Benchmark multimodal processor latency.
plot – Generate plots for benchmark results.
serve – Benchmark online serving throughput.
startup – Benchmark the cold and warm startup time of vLLM models.
sweep
throughput – Benchmark offline inference throughput.
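
The set of submodules can differ between vLLM releases, so it may help to check which of the names listed above exist in a given installation. The following is a minimal sketch using only the Python standard library. The helper `available_modules` and the constant `BENCHMARK_MODULES` are hypothetical names introduced for illustration and are not part of vLLM. Note that locating a submodule imports its parent package, so the check can be slow when vLLM is installed.

```python
import importlib.util

# Submodule names copied from the module list above (an assumption:
# the installed vLLM version may expose a different set).
BENCHMARK_MODULES = [
    "datasets", "latency", "lib", "mm_processor", "plot",
    "serve", "startup", "sweep", "throughput",
]


def available_modules(package, names):
    """Return {name: bool} telling whether `package.name` can be located."""
    found = {}
    for name in names:
        try:
            spec = importlib.util.find_spec(f"{package}.{name}")
        except ImportError:
            # Raised when the parent package is missing or fails to import.
            spec = None
        found[name] = spec is not None
    return found


if __name__ == "__main__":
    status = available_modules("vllm.benchmarks", BENCHMARK_MODULES)
    for name, ok in status.items():
        print(f"{name:<14}{'found' if ok else 'missing'}")
```

If vLLM is not installed, every entry is reported as missing rather than raising an error.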
