vllm.v1.worker.gpu.async_utils
Functions:
- stream: Lightweight version of torch.cuda.stream() context manager which avoids current_stream and device lookups.
stream(to_stream, from_stream)
Lightweight version of torch.cuda.stream() context manager which avoids current_stream and device lookups.
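
The description above suggests a pattern along the following lines. This is a minimal sketch, not the vLLM source. It assumes that `to_stream` is made current on entry and `from_stream` is restored on exit, which is how the lookups could be avoided, since the caller supplies both streams. The name `stream_sketch` and the injectable `set_stream` parameter are hypothetical and exist only so the example can run without a GPU. `torch.cuda.set_stream` is the real PyTorch call used by default.

```python
import contextlib


@contextlib.contextmanager
def stream_sketch(to_stream, from_stream, set_stream=None):
    """Hypothetical sketch: switch to `to_stream`, then restore `from_stream`.

    No query for the current stream or device is made; the caller passes
    both streams explicitly.
    """
    if set_stream is None:
        # Lazy import so the sketch can be exercised without a CUDA build.
        import torch

        set_stream = torch.cuda.set_stream
    set_stream(to_stream)
    try:
        yield
    finally:
        # Restore the caller-supplied stream even if the body raises.
        set_stream(from_stream)


# Usage with a recording stand-in instead of real CUDA streams.
calls = []
with stream_sketch("copy_stream", "main_stream", set_stream=calls.append):
    calls.append("work")
```

With real streams, a call might look like `with stream_sketch(copy_stream, main_stream): ...`, where both arguments are `torch.cuda.Stream` objects the caller already holds.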