vllm.benchmarks
Modules:
- datasets
- latency – Benchmark the latency of processing a single batch of requests.
- lib – Benchmark library utilities.
- mm_processor – Benchmark multimodal processor latency.
- plot – Generate plots for benchmark results.
- serve – Benchmark online serving throughput.
- startup – Benchmark the cold and warm startup time of vLLM models.
- sweep
- throughput – Benchmark offline inference throughput.