vllm.model_executor.layers.attention.cross_attention
Classes:
- CrossAttention – Cross-attention for encoder-decoder models.
CrossAttention
Bases: Attention
Cross-attention for encoder-decoder models. Handles attention between decoder queries and encoder keys/values.
Source code in vllm/model_executor/layers/attention/cross_attention.py
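To make "attention between decoder queries and encoder keys/values" concrete, here is a minimal, framework-free sketch of single-head scaled dot-product cross-attention. It is not the vLLM implementation: the function name `cross_attention_sketch` and the list-of-lists layout are assumptions made for illustration, and it leaves out batching, multiple heads, and the KV cache.

```python
import math


def cross_attention_sketch(queries, keys, values):
    """Hypothetical single-head cross-attention on plain Python lists.

    queries: decoder vectors, shape [num_decoder_tokens][d]
    keys:    encoder vectors, shape [num_encoder_tokens][d]
    values:  encoder vectors, shape [num_encoder_tokens][d_v]
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    value_dim = len(values[0])
    outputs = []
    for q in queries:
        # Score this decoder query against every encoder key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        # Numerically stable softmax over the encoder positions.
        max_score = max(scores)
        exps = [math.exp(s - max_score) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the encoder values.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(value_dim)]
        )
    return outputs


# One decoder token attending over two encoder tokens.
out = cross_attention_sketch(
    queries=[[0.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[2.0, 0.0], [0.0, 4.0]],
)
```

The queries come from the decoder while the keys and values both come from the encoder, so the output has one row per decoder token regardless of the encoder length.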
_get_cross_slot_mapping(encoder_seq_lens, block_table_tensor, kv_cache_spec, device)
Get cross-attention slot mappings.
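As a rough illustration of what a slot mapping is in a paged KV cache, the sketch below maps each encoder token to a flat cache slot using `block_number * block_size + offset`. This is an assumption about the general technique, not a transcription of `_get_cross_slot_mapping`: the helper `cross_slot_mapping_sketch` is hypothetical, it works on plain Python lists instead of tensors, and it takes a `block_size` integer where the real function receives `kv_cache_spec` and `device`.

```python
def cross_slot_mapping_sketch(encoder_seq_lens, block_table, block_size):
    """Hypothetical slot mapping for encoder tokens in a paged KV cache.

    encoder_seq_lens: number of encoder tokens per request
    block_table:      per-request list of physical block numbers
    block_size:       tokens per block (assumed stand-in for kv_cache_spec)
    """
    slots = []
    for request_index, seq_len in enumerate(encoder_seq_lens):
        for position in range(seq_len):
            # Which logical block holds this token, and where inside it.
            logical_block = position // block_size
            offset = position % block_size
            physical_block = block_table[request_index][logical_block]
            slots.append(physical_block * block_size + offset)
    return slots


# Two requests with 3 and 2 encoder tokens, 2 tokens per block.
slots = cross_slot_mapping_sketch(
    encoder_seq_lens=[3, 2],
    block_table=[[5, 7], [2, 0]],
    block_size=2,
)
# slots == [10, 11, 14, 4, 5]
```

The result is one flat slot index per encoder token, concatenated across requests in order.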