vllm.entrypoints.serve.dev.cache.api_router ¶
Functions:
-
reset_encoder_cache–Reset the encoder cache. Note that we currently do not check if the
-
reset_mm_cache–Reset the multi-modal cache. Note that we currently do not check if the
-
reset_prefix_cache–Reset the local prefix cache.
reset_encoder_cache(raw_request) async ¶
Reset the encoder cache. Note that we currently do not check if the encoder cache is successfully reset in the API server.
Source code in vllm/entrypoints/serve/dev/cache/api_router.py
reset_mm_cache(raw_request) async ¶
Reset the multi-modal cache. Note that we currently do not check if the multi-modal cache is successfully reset in the API server.
Source code in vllm/entrypoints/serve/dev/cache/api_router.py
reset_prefix_cache(raw_request, reset_running_requests=Query(default=False), reset_external=Query(default=False)) async ¶
Reset the local prefix cache.
Optionally, if the query parameter reset_external=true also resets the external (connector-managed) prefix cache.
Note that we currently do not check if the prefix cache is successfully reset in the API server.
Example
POST /reset_prefix_cache?reset_external=true