`vllm.v1.sample.logits_processor.interface` ¶

Classes:

BatchUpdate – Persistent batch state change info for logits processors.

LogitsProcessor – Abstract base class for logits processors.

`BatchUpdate` `dataclass` ¶

Describes changes to the persistent batch (requests removed, added, and moved) so that logits processors can update their per-request state.

Source code in vllm/v1/sample/logits_processor/interface.py

@dataclass(frozen=True)
class BatchUpdate:
    """Persistent batch state change info for logitsprocs"""

    batch_size: int  # Current num reqs in batch

    # Metadata for requests added to, removed from, and moved
    # within the persistent batch.
    #
    # Key assumption: the `output_tok_ids` list (which is an element of each
    # tuple in `added`) is a reference to the request's running output tokens
    # list; via this reference, the logits processors always see the latest
    # list of generated output tokens.
    #
    # NOTE:
    # * Added or moved requests may replace existing requests with the same
    #   index.
    # * Operations should be processed in the following order:
    #   - removed, added, moved
    removed: Sequence[RemovedRequest]
    added: Sequence[AddedRequest]
    moved: Sequence[MovedRequest]
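
The ordering note in the comments above (removed, then added, then moved) can be sketched as follows. This page does not show the definitions of `RemovedRequest`, `AddedRequest`, or `MovedRequest`, so the tuple shapes below are assumptions made for illustration, and `BatchUpdateSketch` and `apply_batch_update` are hypothetical names rather than vLLM APIs.

```python
from dataclasses import dataclass
from typing import Sequence

# Stand-in shapes (assumptions, not the vLLM type aliases):
#   removed entry: batch index
#   added entry:   (batch index, per-request params, output_tok_ids)
#   moved entry:   (from index, to index, is_swap)
Removed = int
Added = tuple[int, dict, list[int]]
Moved = tuple[int, int, bool]


@dataclass(frozen=True)
class BatchUpdateSketch:
    batch_size: int
    removed: Sequence[Removed]
    added: Sequence[Added]
    moved: Sequence[Moved]


def apply_batch_update(state: dict[int, dict], update: BatchUpdateSketch) -> None:
    """Mutate per-index state in the documented order: removed, added, moved."""
    for index in update.removed:
        state.pop(index, None)
    for index, params, output_tok_ids in update.added:
        # An added request may replace whatever was stored at this index.
        # The list is kept by reference, so tokens appended later stay visible.
        state[index] = {"params": params, "output_tok_ids": output_tok_ids}
    for src, dst, is_swap in update.moved:
        moved_entry = state.pop(src, None)
        displaced = state.pop(dst, None)
        if moved_entry is not None:
            state[dst] = moved_entry
        if is_swap and displaced is not None:
            state[src] = displaced


state = {
    0: {"params": {"bias": 1.0}, "output_tok_ids": [5]},
    1: {"params": {"bias": 2.0}, "output_tok_ids": [6]},
}
tokens: list[int] = []
update = BatchUpdateSketch(
    batch_size=2,
    removed=[0],
    added=[(2, {"bias": 3.0}, tokens)],
    moved=[(2, 0, False)],
)
apply_batch_update(state, update)
tokens.append(42)  # generated after the update; visible through the shared list
```

Processing removals first frees index 0, so the later move into index 0 does not clobber live state. The final `append` illustrates the "key assumption" in the comments: the processor holds a reference to the running output token list, not a copy.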

`LogitsProcessor` ¶

Bases: ABC

Methods:

apply – Apply LogitsProcessor to batch logits tensor.

is_argmax_invariant – True if logits processor has no impact on the argmax computation in greedy sampling.

update_state – Called when there are new output tokens, prior to each forward pass.

validate_params – Validate sampling params for this logits processor.

Source code in vllm/v1/sample/logits_processor/interface.py

class LogitsProcessor(ABC):
    @classmethod
    def validate_params(cls, sampling_params: SamplingParams):
        """Validate sampling params for this logits processor.

        Raise ValueError for invalid ones.
        """
        return None

    @abstractmethod
    def __init__(
        self, vllm_config: "VllmConfig", device: torch.device, is_pin_memory: bool
    ) -> None:
        raise NotImplementedError

    @abstractmethod
    def apply(self, logits: torch.Tensor) -> torch.Tensor:
        """Apply LogitsProcessor to batch logits tensor.

        The updated tensor must be returned but may be
        modified in-place.
        """
        raise NotImplementedError

    @abstractmethod
    def is_argmax_invariant(self) -> bool:
        """True if logits processor has no impact on the
        argmax computation in greedy sampling.
        NOTE: may or may not have the same value for all
        instances of a given LogitsProcessor subclass,
        depending on subclass implementation.
        """
        raise NotImplementedError

    @abstractmethod
    def update_state(
        self,
        batch_update: "BatchUpdate | None",
    ) -> None:
        """Called when there are new output tokens, prior
        to each forward pass.

        Args:
            batch_update: Non-None iff there have been changes
                to the batch makeup.
        """
        raise NotImplementedError
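
A minimal sketch of how a subclass might fill in these methods. To stay runnable without vLLM or PyTorch, it uses a hypothetical stand-in base class (`LogitsProcessorSketch`) with the same method names, plain nested lists in place of `torch.Tensor`, and a simplified dict in place of `BatchUpdate`. The `BanTokenSketch` class and its `banned` mapping are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Optional


class LogitsProcessorSketch(ABC):
    """Stand-in mirroring the method names above (not the vLLM class)."""

    @abstractmethod
    def apply(self, logits: list[list[float]]) -> list[list[float]]: ...

    @abstractmethod
    def is_argmax_invariant(self) -> bool: ...

    @abstractmethod
    def update_state(self, batch_update: Optional[dict]) -> None: ...


class BanTokenSketch(LogitsProcessorSketch):
    """Masks one banned token id per request row (hypothetical example)."""

    def __init__(self) -> None:
        self.banned: dict[int, int] = {}  # batch row -> banned token id

    def update_state(self, batch_update: Optional[dict]) -> None:
        if batch_update is None:
            return  # batch makeup unchanged, keep existing state
        for row in batch_update.get("removed", []):
            self.banned.pop(row, None)
        for row, token_id in batch_update.get("added", []):
            self.banned[row] = token_id

    def is_argmax_invariant(self) -> bool:
        return False  # masking can remove the highest-scoring token

    def apply(self, logits: list[list[float]]) -> list[list[float]]:
        for row, token_id in self.banned.items():
            logits[row][token_id] = float("-inf")  # in-place edit
        return logits  # the updated logits must still be returned


proc = BanTokenSketch()
proc.update_state({"removed": [], "added": [(0, 2)]})
proc.update_state(None)  # no batch change before this step
logits = [[0.1, 0.2, 0.9], [0.3, 0.1, 0.5]]
out = proc.apply(logits)
```

In this sketch `update_state` runs before every `apply`, and only does work when the batch makeup has changed, which matches the `batch_update` contract in the docstring.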

`apply(logits)` `abstractmethod` ¶

Apply LogitsProcessor to batch logits tensor.

The updated tensor must be returned. The input tensor may be modified in place, in which case the same tensor is returned.

Source code in vllm/v1/sample/logits_processor/interface.py

@abstractmethod
def apply(self, logits: torch.Tensor) -> torch.Tensor:
    """Apply LogitsProcessor to batch logits tensor.

    The updated tensor must be returned but may be
    modified in-place.
    """
    raise NotImplementedError

`is_argmax_invariant()` `abstractmethod` ¶

True if the logits processor has no impact on the argmax computation in greedy sampling.

NOTE: The value may or may not be the same for all instances of a given LogitsProcessor subclass, depending on the subclass implementation.

Source code in vllm/v1/sample/logits_processor/interface.py

@abstractmethod
def is_argmax_invariant(self) -> bool:
    """True if logits processor has no impact on the
    argmax computation in greedy sampling.
    NOTE: may or may not have the same value for all
    instances of a given LogitsProcessor subclass,
    depending on subclass implementation.
    """
    raise NotImplementedError
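
To illustrate what "no impact on the argmax" means, the sketch below compares two hypothetical transforms on a single row of logits: scaling by a positive factor, which preserves the ordering of tokens, and a per-token additive bias, which can change which token scores highest. The helper names are illustrative and not part of vLLM.

```python
def argmax(row: list[float]) -> int:
    """Index of the largest value in the row."""
    return max(range(len(row)), key=row.__getitem__)


def scale(row: list[float], factor: float) -> list[float]:
    # Multiplying by a positive factor keeps the relative order of logits.
    return [x * factor for x in row]


def add_bias(row: list[float], bias: dict[int, float]) -> list[float]:
    # A per-token bias can lift a lower-ranked token above the current best.
    return [x + bias.get(i, 0.0) for i, x in enumerate(row)]


row = [1.0, 3.0, 2.0]
scaled = scale(row, 0.5)
biased = add_bias(row, {2: 5.0})

print(argmax(row), argmax(scaled), argmax(biased))
```

A processor that behaves like `scale` could return `True` from `is_argmax_invariant()`, while one that behaves like `add_bias` should return `False`.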

`update_state(batch_update)` `abstractmethod` ¶

Called when there are new output tokens, prior to each forward pass.

Parameters:

batch_update (BatchUpdate | None) – Non-None if and only if there have been changes to the batch makeup.

Source code in vllm/v1/sample/logits_processor/interface.py

@abstractmethod
def update_state(
    self,
    batch_update: "BatchUpdate | None",
) -> None:
    """Called when there are new output tokens, prior
    to each forward pass.

    Args:
        batch_update: Non-None iff there have been changes
            to the batch makeup.
    """
    raise NotImplementedError

`validate_params(sampling_params)` `classmethod` ¶

Validate sampling params for this logits processor.

Raises ValueError if the sampling params are invalid.

Source code in vllm/v1/sample/logits_processor/interface.py

@classmethod
def validate_params(cls, sampling_params: SamplingParams):
    """Validate sampling params for this logits processor.

    Raise ValueError for invalid ones.
    """
    return None
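
A sketch of how a subclass might override `validate_params`. The base implementation above accepts everything, so a subclass only needs to raise `ValueError` for the parameters it cares about. The stand-in `SamplingParamsSketch` class, its `extra_args` field as used here, and the `banned_token` key are assumptions made for illustration, and the sketch does not import vLLM.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SamplingParamsSketch:
    """Stand-in for SamplingParams; the `extra_args` field is assumed."""

    extra_args: Optional[dict[str, Any]] = None


class BanTokenValidatorSketch:
    """Hypothetical processor that reads a `banned_token` argument."""

    @classmethod
    def validate_params(cls, sampling_params: SamplingParamsSketch) -> None:
        args = sampling_params.extra_args or {}
        token_id = args.get("banned_token")
        if token_id is None:
            return None  # argument not supplied: nothing to validate
        if not isinstance(token_id, int) or token_id < 0:
            raise ValueError(
                f"banned_token must be a non-negative int, got {token_id!r}"
            )
        return None


# Valid params pass silently.
BanTokenValidatorSketch.validate_params(
    SamplingParamsSketch(extra_args={"banned_token": 7})
)

# Invalid params raise ValueError.
try:
    BanTokenValidatorSketch.validate_params(
        SamplingParamsSketch(extra_args={"banned_token": -1})
    )
    rejected = False
except ValueError:
    rejected = True
```

Because `validate_params` is a classmethod, it can be called without constructing a processor instance.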
