vllm.tokenizers.hf
Classes:
- ThreadSafeHFTokenizerMixin: Mixin class for thread-safe HF fast tokenizers.
Functions:
- get_cached_tokenizer: Caches tokenizer properties that transformers would otherwise recompute each time they are accessed.
- maybe_make_thread_pool: If tokenizer is a PreTrainedTokenizerFast, modifies the tokenizer in-place to make its public interface thread-safe.
ThreadSafeHFTokenizerMixin
get_cached_tokenizer(tokenizer)
By default, transformers recomputes several tokenizer properties each time they are accessed, which causes a significant slowdown. This function returns a proxy that caches these properties for faster access.
Source code in vllm/tokenizers/hf.py
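The caching idea described above can be illustrated with a small, self-contained sketch. It does not use vLLM or transformers: SlowTokenizer, make_cached, and recompute_count are hypothetical names invented for this illustration, and the approach shown (a dynamic subclass that returns precomputed values) is one possible way to build such a proxy, not necessarily how get_cached_tokenizer is implemented.

```python
import copy


class SlowTokenizer:
    """Hypothetical stand-in for a tokenizer that recomputes values on every access."""

    def __init__(self, vocab):
        self._vocab = dict(vocab)
        self.recompute_count = 0  # counts how often the expensive paths run

    @property
    def all_special_tokens(self):
        self.recompute_count += 1
        return sorted(t for t in self._vocab if t.startswith("<"))

    def get_vocab(self):
        self.recompute_count += 1
        return dict(self._vocab)

    def encode(self, text):
        return [self._vocab[t] for t in text.split()]


def make_cached(tokenizer):
    """Return a shallow copy whose expensive lookups are computed only once."""
    # Compute the expensive values a single time, up front.
    cached_special_tokens = tokenizer.all_special_tokens
    cached_vocab = tokenizer.get_vocab()

    class CachedTokenizer(type(tokenizer)):
        @property
        def all_special_tokens(self):
            return cached_special_tokens

        def get_vocab(self):
            return cached_vocab

    # Swap the class on a copy so the original object is left unchanged.
    cached = copy.copy(tokenizer)
    cached.__class__ = CachedTokenizer
    return cached


slow = SlowTokenizer({"<s>": 0, "</s>": 1, "hello": 2, "world": 3})
fast = make_cached(slow)
before = fast.recompute_count
for _ in range(1000):
    fast.all_special_tokens
    fast.get_vocab()
after = fast.recompute_count  # unchanged: the cached values were reused
```

Because the proxy is a subclass of the original tokenizer type, isinstance checks and all other methods (such as encode here) keep working. The trade-off is that cached values go stale if the underlying vocabulary is mutated afterwards.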
maybe_make_thread_pool(tokenizer, copies=1)
If tokenizer is a PreTrainedTokenizerFast, modify the tokenizer in-place to make the public interface thread-safe by routing calls through a deep-copied tokenizer pool.
Note that:
- Only TokenizerLike's public interface is thread-safe. This does not include the _tokenizer property or any mutation methods such as add_special_tokens or add_tokens.
- Adjacent method calls could happen on different deep copies.
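The pool-of-deep-copies pattern can be sketched with the standard library alone. This is an illustration under stated assumptions, not vLLM's implementation: ToyTokenizer and PooledTokenizer are hypothetical names, and the sketch wraps the tokenizer in a new object, whereas maybe_make_thread_pool is documented as modifying the tokenizer in-place.

```python
import copy
import queue
from concurrent.futures import ThreadPoolExecutor


class ToyTokenizer:
    """Hypothetical tokenizer with shared scratch state, so it is not thread-safe."""

    def __init__(self):
        self._scratch = []

    def encode(self, text):
        # Reusing one buffer across calls is what makes concurrent use unsafe.
        self._scratch.clear()
        for word in text.split():
            self._scratch.append(len(word))
        return list(self._scratch)


class PooledTokenizer:
    """Routes each public call through one exclusively held deep copy."""

    def __init__(self, tokenizer, copies=1):
        self._pool = queue.Queue()
        for _ in range(copies):
            self._pool.put(copy.deepcopy(tokenizer))

    def encode(self, text):
        tok = self._pool.get()  # blocks until some copy is free
        try:
            return tok.encode(text)
        finally:
            self._pool.put(tok)  # always hand the copy back to the pool


texts = ["w" + "x" * i + " yy" for i in range(50)]
expected = [[i + 1, 2] for i in range(50)]

pooled = PooledTokenizer(ToyTokenizer(), copies=4)
with ThreadPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(pooled.encode, texts))
```

Each call checks out whichever copy is free, so two consecutive calls from the same thread may run on different copies, which matches the second note above. It also shows why mutation methods fall outside the thread-safe interface: a mutation applied through one copy would not reach the others.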