vllm.config.model_arch ¶
Classes:
- ModelArchitectureConfig – Configuration for the model architecture that is required by the vLLM runtime.
ModelArchitectureConfig ¶
Configuration for the model architecture that is required by the vLLM runtime.
Attributes:
- architectures (list[str]) – List of model architecture class names (e.g., ['LlamaForCausalLM']).
- derived_max_model_len_and_key (tuple[float, str | None]) – Maximum model length derived from the HF config, together with the config key it was derived from.
- head_size (int) – Head dimension of the model.
- hidden_size (int) – Hidden size of the model.
- is_deepseek_mla (bool) – Whether the model is a DeepSeek MLA model.
- is_mm_prefix_lm (bool) – Whether the model uses bidirectional attention over image tokens.
- model_type (str) – Model type identifier (e.g., 'llama', 'gpt_oss').
- num_experts (int) – Number of experts in the model.
- quantization_config (dict[str, Any] | None) – Quantization configuration dictionary containing quantization parameters.
- text_model_type (str | None) – Text model type identifier (e.g., 'llama4_text').
- total_num_attention_heads (int) – Number of attention heads in the model.
- total_num_hidden_layers (int) – Number of hidden layers in the model.
- total_num_kv_heads (int) – Number of key-value heads in the model.
- vocab_size (int) – Vocabulary size of the model.
Source code in vllm/config/model_arch.py
architectures instance-attribute ¶
List of model architecture class names (e.g., ['LlamaForCausalLM']). It can be None after calling vllm_config.with_hf_config(config.text_config).
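Since this field may be None in the case described above, code that reads it may want a guard. A minimal sketch follows; `primary_architecture` is a hypothetical helper written for illustration and is not part of vLLM.

```python
from typing import Optional


def primary_architecture(architectures: Optional[list[str]]) -> Optional[str]:
    """Return the first architecture class name, or None if the list is missing or empty.

    Hypothetical helper for illustration only; not a vLLM function.
    """
    if not architectures:
        return None
    return architectures[0]


print(primary_architecture(["LlamaForCausalLM"]))
print(primary_architecture(None))
```

Returning None instead of raising keeps the caller in control of how to handle a config that carries no architecture names.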
derived_max_model_len_and_key instance-attribute ¶
Maximum model length derived from the HF config, together with the config key it was derived from.
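The documented type is tuple[float, str | None], so the length can be a non-integer value such as infinity and the key can be absent. The sketch below shows one way to unpack such a tuple. `describe_max_len` is a hypothetical helper, the example values are invented, and treating an infinite length or a missing key as "no limit found" is an assumption rather than documented vLLM behavior.

```python
import math
from typing import Optional


def describe_max_len(derived: tuple[float, Optional[str]]) -> str:
    """Render a (max_len, key) tuple as a short human-readable string.

    Hypothetical helper for illustration only; not a vLLM function.
    """
    max_len, key = derived
    # Assumption: an infinite length or a missing key means no limit was found.
    if key is None or math.isinf(max_len):
        return "no length limit found in the config"
    return f"{int(max_len)} tokens (from '{key}')"


# Invented example values.
print(describe_max_len((4096.0, "max_position_embeddings")))
print(describe_max_len((math.inf, None)))
```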
head_size instance-attribute ¶
Head dimension of the model.
hidden_size instance-attribute ¶
Hidden size of the model.
is_deepseek_mla instance-attribute ¶
Whether the model is a DeepSeek MLA model.
is_mm_prefix_lm instance-attribute ¶
Whether the model uses bidirectional attention over image tokens.
model_type instance-attribute ¶
Model type identifier (e.g., 'llama', 'gpt_oss').
num_experts instance-attribute ¶
Number of experts in the model.
quantization_config instance-attribute ¶
Quantization configuration dictionary containing quantization parameters.
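Because the field is typed dict[str, Any] | None, readers of it need to handle both the unquantized case and missing keys. A minimal sketch, assuming the dictionary follows the common Hugging Face layout with a "quant_method" entry; that key name and the `quant_method` helper are assumptions for illustration and are not confirmed by this page.

```python
from typing import Any, Optional


def quant_method(quantization_config: Optional[dict[str, Any]]) -> Optional[str]:
    """Return the lower-cased quantization method name, or None if unavailable.

    Hypothetical helper; assumes a "quant_method" key as in many HF configs.
    """
    if quantization_config is None:
        return None  # model is not quantized
    method = quantization_config.get("quant_method")
    return str(method).lower() if method is not None else None


# Invented example values.
print(quant_method({"quant_method": "GPTQ", "bits": 4}))
print(quant_method(None))
```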
text_model_type instance-attribute ¶
Text model type identifier (e.g., 'llama4_text').
total_num_attention_heads instance-attribute ¶
Number of attention heads in the model.
total_num_hidden_layers instance-attribute ¶
Number of hidden layers in the model.
total_num_kv_heads instance-attribute ¶
Number of key-value heads in the model.
vocab_size instance-attribute ¶
Vocabulary size of the model.
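To show how several of these fields relate, the sketch below estimates the grouped-query ratio and the per-token KV cache size. `ArchSketch` is a hypothetical stand-in holding a subset of the documented attributes, not the real ModelArchitectureConfig, and the numbers are invented. The formula assumes standard multi-head or grouped-query attention where every layer caches one key and one value vector per KV head; it would not hold for MLA-style models or for layers that keep no KV cache.

```python
from dataclasses import dataclass


@dataclass
class ArchSketch:
    """Hypothetical stand-in with a subset of the documented fields."""

    head_size: int
    total_num_attention_heads: int
    total_num_hidden_layers: int
    total_num_kv_heads: int


def query_heads_per_kv_head(arch: ArchSketch) -> int:
    """Number of query heads that share each key-value head."""
    return arch.total_num_attention_heads // arch.total_num_kv_heads


def kv_cache_bytes_per_token(arch: ArchSketch, dtype_bytes: int = 2) -> int:
    """Estimated KV cache bytes per token (dtype_bytes=2 corresponds to fp16/bf16)."""
    # The leading 2 accounts for one key vector plus one value vector.
    return (
        2
        * arch.total_num_hidden_layers
        * arch.total_num_kv_heads
        * arch.head_size
        * dtype_bytes
    )


# Invented example values.
arch = ArchSketch(
    head_size=128,
    total_num_attention_heads=32,
    total_num_hidden_layers=32,
    total_num_kv_heads=8,
)
print(query_heads_per_kv_head(arch))   # 4
print(kv_cache_bytes_per_token(arch))  # 131072
```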