vllm.benchmarks

Modules:

  • datasets
  • latency

    Benchmark the latency of processing a single batch of requests.

  • lib

    Benchmark library utilities.

  • mm_processor

    Benchmark multimodal processor latency.

  • plot

    Generate plots for benchmark results.

  • serve

    Benchmark online serving throughput.

  • startup

    Benchmark the cold and warm startup time of vLLM models.

  • sweep
  • throughput

    Benchmark offline inference throughput.
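
The latency and throughput modules above measure different things: time to process a single batch versus requests completed per unit time. The sketch below illustrates that distinction with the standard library only. It does not use or reproduce the vLLM benchmark code; `measure_latency`, `measure_throughput`, and `fake_run_batch` are hypothetical names introduced for illustration, and `fake_run_batch` is a stand-in for a real model call.

```python
import time
from typing import Callable, Iterable, Sequence


def measure_latency(
    run_batch: Callable[[Sequence[str]], object],
    batch: Sequence[str],
    warmup_iters: int = 2,
    iters: int = 5,
) -> float:
    """Return the mean wall-clock seconds taken to process one batch."""
    for _ in range(warmup_iters):
        run_batch(batch)  # warm-up runs are executed but not timed
    timings = []
    for _ in range(iters):
        start = time.perf_counter()
        run_batch(batch)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


def measure_throughput(
    run_batch: Callable[[Sequence[str]], object],
    batches: Iterable[Sequence[str]],
) -> float:
    """Return requests processed per second across all batches."""
    total_requests = 0
    start = time.perf_counter()
    for batch in batches:
        run_batch(batch)
        total_requests += len(batch)
    elapsed = time.perf_counter() - start
    return total_requests / elapsed if elapsed > 0 else float("inf")


def fake_run_batch(batch: Sequence[str]) -> list:
    # Hypothetical stand-in for a model call: sleeps 1 ms per request.
    time.sleep(0.001 * len(batch))
    return [prompt.upper() for prompt in batch]


if __name__ == "__main__":
    prompts = ["hello", "world", "benchmark", "example"]
    latency = measure_latency(fake_run_batch, prompts)
    throughput = measure_throughput(fake_run_batch, [prompts] * 10)
    print(f"mean batch latency: {latency:.4f} s")
    print(f"throughput: {throughput:.1f} requests/s")
```

Warm-up iterations are excluded from the latency figure so that one-time setup costs do not skew the mean, which is also why startup time is typically measured separately.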