vllm.model_executor.models.interfaces_base ¶
Classes:
- VllmModel – The interface required for all models in vLLM.
- VllmModelForPooling – The interface required for all pooling models in vLLM.
- VllmModelForTextGeneration – The interface required for all generative models in vLLM.
Functions:
- attn_type – Decorator to set VllmModelForPooling.attn_type.
- default_pooling_type – Decorator to set VllmModelForPooling.default_*_pooling_type.
VllmModel ¶
Bases: Protocol[T_co]
The interface required for all models in vLLM.
Methods:
- embed_input_ids – Apply token embeddings to input_ids.
Source code in vllm/model_executor/models/interfaces_base.py
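Because VllmModel is a Protocol, a class can satisfy it structurally without inheriting from it. The following is a minimal sketch of that idea and does not use vLLM's own code: `ModelLike` and `ToyModel` are hypothetical names, only the `embed_input_ids` method listed above is modeled, and plain Python lists stand in for whatever tensor types the real interface uses.

```python
from typing import Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class ModelLike(Protocol[T_co]):
    """Hypothetical stand-in for a structural model interface."""

    def embed_input_ids(self, input_ids: list[int]) -> T_co:
        """Apply token embeddings to input_ids."""
        ...


class ToyModel:
    """Satisfies ModelLike without inheriting from it."""

    def __init__(self, table: dict[int, list[float]]) -> None:
        # Assumption: a dict lookup stands in for an embedding layer.
        self.table = table

    def embed_input_ids(self, input_ids: list[int]) -> list[list[float]]:
        return [self.table[i] for i in input_ids]


model = ToyModel({0: [0.0, 1.0], 1: [1.0, 0.0]})

# The runtime check only verifies that the method exists, not its signature.
is_model_like = isinstance(model, ModelLike)
embedded = model.embed_input_ids([1, 0])
```

Note that a `runtime_checkable` protocol check is shallow: it confirms the presence of `embed_input_ids` but not its argument or return types, which is left to a static type checker.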
VllmModelForPooling ¶
Bases: VllmModel[T_co], Protocol[T_co]
The interface required for all pooling models in vLLM.
Attributes:
- attn_type (AttnTypeStr) – Indicates the vllm.config.model.ModelConfig.attn_type to use by default.
- default_seq_pooling_type (SequencePoolingType) – Indicates the vllm.config.pooler.PoolerConfig.seq_pooling_type to use by default.
- default_tok_pooling_type (TokenPoolingType) – Indicates the vllm.config.pooler.PoolerConfig.tok_pooling_type to use by default.
- is_pooling_model (Literal[True]) – A flag that indicates this model supports pooling.
- pooler (Pooler) – The pooler is only called on TP rank 0.
- score_type (ScoreType) – Indicates the vllm.config.model.ModelConfig.score_type to use by default.
attn_type = 'decoder' class-attribute ¶
Indicates the vllm.config.model.ModelConfig.attn_type to use by default.
You can use the vllm.model_executor.models.interfaces_base.attn_type decorator to conveniently set this field.
default_seq_pooling_type = 'LAST' class-attribute ¶
Indicates the vllm.config.pooler.PoolerConfig.seq_pooling_type to use by default.
You can use the vllm.model_executor.models.interfaces_base.default_pooling_type decorator to conveniently set this field.
default_tok_pooling_type = 'ALL' class-attribute ¶
Indicates the vllm.config.pooler.PoolerConfig.tok_pooling_type to use by default.
You can use the vllm.model_executor.models.interfaces_base.default_pooling_type decorator to conveniently set this field.
is_pooling_model = True class-attribute ¶
A flag that indicates this model supports pooling.
Note
There is no need to redefine this flag if this class is in the MRO of your model class.
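The note above relies on ordinary Python attribute lookup through the MRO. As a rough illustration, using a hypothetical `PoolingBase` class rather than vLLM's VllmModelForPooling:

```python
class PoolingBase:
    """Hypothetical stand-in; not vLLM's VllmModelForPooling."""

    is_pooling_model = True


class MyEmbeddingModel(PoolingBase):
    """Does not redefine is_pooling_model."""


# The base class is in the MRO, so the flag is found by inheritance.
in_mro = PoolingBase in MyEmbeddingModel.__mro__
flag = MyEmbeddingModel.is_pooling_model

# The subclass itself carries no copy of the attribute.
defined_locally = "is_pooling_model" in vars(MyEmbeddingModel)
```

A class that only matches the interface structurally, without the base in its MRO, would not inherit the flag this way and would have to define it itself.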
pooler instance-attribute ¶
The pooler is only called on TP rank 0.
score_type = 'bi-encoder' class-attribute ¶
Indicates the vllm.config.model.ModelConfig.score_type to use by default.
The Scoring API handles score/rerank requests for:
- "classify" task (score_type: cross-encoder models)
- "embed" task (score_type: bi-encoder models)
- "token_embed" task (score_type: late-interaction models)
score_type defaults to bi-encoder, in which case the Score API uses the "embed" task.
If you set score_type to cross-encoder via vllm.model_executor.models.interfaces.SupportsCrossEncoding, then the Score API uses the "score" task.
If you set score_type to late-interaction via vllm.model_executor.models.interfaces.SupportsLateInteraction, then the Score API uses the "token_embed" task.
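The relationship between score_type and the task can be pictured as a simple lookup. This is only an illustrative sketch: `SCORE_TYPE_TO_TASK` and `task_for_score_type` are hypothetical names, not part of vLLM, and the task name for cross-encoder models is taken from the bulleted list ("classify"), whereas the prose above calls it "score", so treat that entry as an assumption.

```python
# Hypothetical lookup table built from the bulleted list above.
SCORE_TYPE_TO_TASK = {
    "bi-encoder": "embed",
    "cross-encoder": "classify",  # assumption: name taken from the list
    "late-interaction": "token_embed",
}


def task_for_score_type(score_type: str = "bi-encoder") -> str:
    """Return the task associated with a score_type in this sketch."""
    try:
        return SCORE_TYPE_TO_TASK[score_type]
    except KeyError:
        raise ValueError(f"unknown score_type: {score_type!r}") from None


default_task = task_for_score_type()
late_task = task_for_score_type("late-interaction")
```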
VllmModelForTextGeneration ¶
Bases: VllmModel[T], Protocol[T]
The interface required for all generative models in vLLM.
Methods:
- compute_logits – Return None if TP rank > 0.
attn_type(attn_type) ¶
Decorator to set VllmModelForPooling.attn_type.
default_pooling_type(*, seq_pooling_type='LAST', tok_pooling_type='ALL') ¶
Decorator to set VllmModelForPooling.default_*_pooling_type.
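To show how class decorators of this shape can set default fields, here is a self-contained sketch. It deliberately does not import vLLM: `set_attn_type` and `set_default_pooling_type` are hypothetical stand-ins that mirror the signatures documented above, and the values "encoder_only" and "CLS" are assumed examples rather than values confirmed by this page.

```python
from typing import Callable, TypeVar

C = TypeVar("C", bound=type)


def set_attn_type(attn_type: str) -> Callable[[C], C]:
    """Hypothetical stand-in: class decorator that sets cls.attn_type."""

    def wrap(cls: C) -> C:
        cls.attn_type = attn_type
        return cls

    return wrap


def set_default_pooling_type(
    *, seq_pooling_type: str = "LAST", tok_pooling_type: str = "ALL"
) -> Callable[[C], C]:
    """Hypothetical stand-in: sets both default_*_pooling_type fields."""

    def wrap(cls: C) -> C:
        cls.default_seq_pooling_type = seq_pooling_type
        cls.default_tok_pooling_type = tok_pooling_type
        return cls

    return wrap


@set_attn_type("encoder_only")  # assumed example value
@set_default_pooling_type(seq_pooling_type="CLS")  # assumed example value
class ToyPoolingModel:
    # Defaults as documented above; the decorators overwrite them.
    attn_type = "decoder"
    default_seq_pooling_type = "LAST"
    default_tok_pooling_type = "ALL"
```

In this sketch the keyword-only arguments mean that omitting `tok_pooling_type` leaves it at its default of 'ALL', while `seq_pooling_type` and `attn_type` are overridden on the class.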