vllm.model_executor.layers.fused_moe.all2all_utils
Functions:
- maybe_roundup_layer_hidden_size – Given layer hidden size and MoE configurations, round up hidden_size if necessary.
maybe_roundup_layer_hidden_size(hidden_size, act_dtype, moe_parallel_config)
Given layer hidden size and MoE configurations, round up hidden_size if necessary.
Parameters:
- hidden_size (int) – Layer hidden size.
- act_dtype (dtype) – Data type of the layer activations.
- moe_parallel_config (FusedMoEParallelConfig) – Fused MoE parallelization strategy configuration.
Returns:
- int – The rounded-up hidden_size if the configs and the all2all backend require rounding up; otherwise the original hidden_size.
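
The "round up only if necessary" behaviour described above can be illustrated with a small, self-contained sketch. It does not call vLLM and is not its implementation: `round_up`, `sketch_maybe_roundup`, and the `required_multiple` parameter are hypothetical names introduced here, and the alignment value of 256 is an arbitrary assumption standing in for whatever a given all2all backend might require.

```python
from typing import Optional


def round_up(value: int, multiple: int) -> int:
    """Return the smallest multiple of `multiple` that is >= `value`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return ((value + multiple - 1) // multiple) * multiple


def sketch_maybe_roundup(hidden_size: int, required_multiple: Optional[int]) -> int:
    """Hypothetical stand-in for the documented function.

    `required_multiple` is an assumption: it models a backend-specific
    alignment requirement. `None` means no rounding is needed.
    """
    if required_multiple is None:
        # No alignment constraint: return the original hidden size.
        return hidden_size
    return round_up(hidden_size, required_multiple)


if __name__ == "__main__":
    print(sketch_maybe_roundup(2880, 256))   # 3072 (padded to the next multiple)
    print(sketch_maybe_roundup(4096, 256))   # 4096 (already aligned)
    print(sketch_maybe_roundup(2880, None))  # 2880 (no constraint)
```

In the real function the decision is driven by `act_dtype` and `moe_parallel_config` rather than an explicit multiple; the sketch only shows the shape of the contract, where callers always get back a size that is greater than or equal to the one they passed in.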