vllm.model_executor.models.glmasr_utils
_calculate_conv_output_length(input_length, padding, kernel_size, stride)
Calculate the output length of a Conv1d layer using the standard formula.
Source code in vllm/model_executor/models/glmasr_utils.py
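For reference, the standard Conv1d output length with dilation 1 is floor((input_length + 2 * padding - kernel_size) / stride) + 1. The sketch below illustrates that formula on plain integers. It is not the vLLM source: the function name `conv1d_output_length` and the example values are assumptions chosen for illustration.

```python
def conv1d_output_length(
    input_length: int, padding: int, kernel_size: int, stride: int
) -> int:
    """Standard Conv1d output length, assuming dilation = 1 (hypothetical helper)."""
    return (input_length + 2 * padding - kernel_size) // stride + 1


# Illustrative values only: 3000 input frames, padding=1, kernel_size=3.
print(conv1d_output_length(3000, 1, 3, 1))  # stride 1 keeps the length: 3000
print(conv1d_output_length(3000, 1, 3, 2))  # stride 2 halves it: 1500
```

Because the expression uses only `+`, `-`, and `//`, the same arithmetic should also apply elementwise to an integer tensor of lengths.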
_get_audio_output_lengths_for_tower(audio_tower, audio_lengths, merge_factor, conv_params)
Calculate the output lengths after audio processing.
The output length accounts for:
1. Convolution layers (downsampling)
2. Merge factor (further downsampling during projection)
Parameters:
- audio_tower (Module) – The audio encoder module
- audio_lengths (Tensor) – Input feature lengths [batch_size]
- merge_factor (int) – Factor for merging adjacent features
- conv_params (list[tuple[int, int, int]]) – List of (padding, kernel_size, stride) for each conv layer
Returns:
- Tensor – Output lengths after all processing [batch_size]
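The two downsampling stages described for this function can be sketched on plain Python integers as follows. This is a simplified illustration, not the vLLM implementation. The function name `sketch_audio_output_lengths` is hypothetical, the `audio_tower` argument is left out, and the merge step is assumed to be a floor division by `merge_factor`. The real function may round differently or defer to the encoder itself. The `conv_params` values are made up for the example.

```python
def sketch_audio_output_lengths(
    audio_lengths: list[int],
    merge_factor: int,
    conv_params: list[tuple[int, int, int]],
) -> list[int]:
    """Hypothetical sketch: chain the conv layers, then apply the merge factor."""
    output_lengths = []
    for length in audio_lengths:
        # 1. Convolution layers: standard Conv1d length formula (dilation = 1).
        for padding, kernel_size, stride in conv_params:
            length = (length + 2 * padding - kernel_size) // stride + 1
        # 2. Merge factor. ASSUMPTION: adjacent features are merged with
        #    floor division; this is not confirmed by the documentation.
        output_lengths.append(length // merge_factor)
    return output_lengths


# Illustrative values only: two conv layers, the second with stride 2.
example_conv_params = [(1, 3, 1), (1, 3, 2)]
print(sketch_audio_output_lengths([3000, 1000], 4, example_conv_params))
# [375, 125]
```

Under these assumptions, 3000 input frames stay at 3000 after the first layer, drop to 1500 after the stride-2 layer, and become 375 after merging groups of 4.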