Use active MPU for AutoEP sequence-parallel size #24

Merged
tohtana merged 1 commit into tohtana/add_autoep from tohtana/fix-autoep-sp-size-before-groups-mpu
May 14, 2026

@tohtana tohtana commented May 14, 2026

Summary

  • Addresses the Codex bot review comment on deepspeedai/DeepSpeed#7938 (Add AutoEP).
  • Reads AutoEP sequence-parallel size from the active engine MPU when it exposes get_sequence_parallel_world_size, before groups.mpu is populated.
  • Keeps the existing groups._get_sequence_parallel_world_size() fallback for non-MPU paths.
  • Adds unit coverage for MPU-first AutoEP validation/group creation and the groups-helper fallback.
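
The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `resolve_sequence_parallel_size`, `FakeMPU`, and `LegacyMPU` are hypothetical names, and the `groups_fallback` callable stands in for `groups._get_sequence_parallel_world_size()` so the sketch runs without DeepSpeed installed.

```python
from typing import Callable, Optional


def resolve_sequence_parallel_size(mpu: Optional[object],
                                   groups_fallback: Callable[[], int]) -> int:
    """Hypothetical helper: prefer the active MPU, else use the groups helper."""
    # Only trust the MPU if it actually exposes the sequence-parallel getter.
    getter = getattr(mpu, "get_sequence_parallel_world_size", None)
    if callable(getter):
        return getter()
    # Non-MPU path (or an MPU without sequence-parallel support).
    return groups_fallback()


class FakeMPU:
    """Stand-in for an engine MPU that reports a sequence-parallel size."""

    def __init__(self, sp_size: int) -> None:
        self._sp_size = sp_size

    def get_sequence_parallel_world_size(self) -> int:
        return self._sp_size


class LegacyMPU:
    """Stand-in for an MPU that lacks the sequence-parallel getter."""

    def get_model_parallel_world_size(self) -> int:
        return 1


# MPU-first: the MPU's value wins over the fallback.
sp_from_mpu = resolve_sequence_parallel_size(FakeMPU(4), lambda: 1)
# Fallback: no MPU, or an MPU without the getter, defers to the groups helper.
sp_no_mpu = resolve_sequence_parallel_size(None, lambda: 2)
sp_legacy = resolve_sequence_parallel_size(LegacyMPU(), lambda: 2)
```

Checking for the getter with `getattr` rather than testing the MPU's type keeps the sketch compatible with any MPU object, which matches the "when it exposes" wording in the summary.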

Intended Target

This PR is intended to merge into tohtana/add_autoep, the branch behind upstream PR deepspeedai/DeepSpeed#7938.

Tests

Environment: /mnt/local_storage/autoep_transformers_matrix_20260513_fixed/venv-5.8.1
Transformers: 5.8.1

  • timeout 300 /mnt/local_storage/autoep_transformers_matrix_20260513_fixed/venv-5.8.1/bin/python -m pytest -q tests/unit/moe/test_autoep_unit.py::TestAutoEPConfig::test_configure_expert_parallel_uses_engine_mpu_sequence_parallel_size tests/unit/moe/test_autoep_unit.py::TestAutoEPConfig::test_autoep_sequence_parallel_size_falls_back_to_groups_helper
  • timeout 900 /mnt/local_storage/autoep_transformers_matrix_20260513_fixed/venv-5.8.1/bin/python -m pytest -q tests/unit/moe/test_autoep_unit.py

Both passed.

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
@tohtana tohtana merged commit 04b8abd into tohtana/add_autoep May 14, 2026