fix: Fix AutoEP DeepSeekV3 preset for split ModuleList experts #7

Open: nathon-lee wants to merge 1 commit into tohtana:tohtana/add_autoep from nathon-lee:nathon_autoep_feature1

Conversation

nathon-lee commented Apr 17, 2026

Hi @tohtana, could you please take a look at this PR when you have a chance? I updated the DeepSeekV3 AutoEP preset to match the actual HuggingFace expert layout and added regression coverage for the detection and repacking paths.

I found the mismatch by instantiating the HuggingFace DeepSeekV3 model skeleton from config and observing that its experts are ModuleList entries with split gate_proj, up_proj, and down_proj layers rather than fused gate_up_proj weights.

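# Stub out is_torch_fx_available before the remote modeling code is loaded
# (presumably a compatibility workaround for the installed transformers version).
import transformers.utils.import_utils
transformers.utils.import_utils.is_torch_fx_available = lambda: False

from transformers import AutoConfig, AutoModelForCausalLM
from accelerate import init_empty_weights

# Fetch only the config; no checkpoint weights are downloaded.
config = AutoConfig.from_pretrained("deepseek-ai/DeepSeek-V3", trust_remote_code=True)

# Build the model skeleton on the meta device so no real memory is allocated.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

# Print the module tree to inspect how the experts are laid out.
print(model)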

Title:

Fix DeepSeekV3 AutoEP preset mismatch

Body:

DeepSeekV3 experts in HuggingFace are stored as ModuleList entries with split gate_proj, up_proj, and down_proj layers, but the previous AutoEP preset assumed fused gate_up_proj expert weights. This mismatch could cause AutoEP to fail MoE layer detection.
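To illustrate why the fused assumption misses this layout, here is a minimal, standard-library-only sketch. The parameter names, the `FUSED_PRESET` / `SPLIT_PRESET` sets, and the `experts_matching` helper are all hypothetical stand-ins for illustration; they are not DeepSpeed's preset objects or detection code.

```python
import re

# Hypothetical parameter names mimicking one DeepSeekV3-style MoE layer with
# two experts stored as ModuleList entries (experts.0, experts.1).
PARAM_NAMES = [
    "model.layers.3.mlp.gate.weight",
    "model.layers.3.mlp.experts.0.gate_proj.weight",
    "model.layers.3.mlp.experts.0.up_proj.weight",
    "model.layers.3.mlp.experts.0.down_proj.weight",
    "model.layers.3.mlp.experts.1.gate_proj.weight",
    "model.layers.3.mlp.experts.1.up_proj.weight",
    "model.layers.3.mlp.experts.1.down_proj.weight",
]

# Illustrative presets only (assumption): the projection names each preset
# expects to find under every expert.
FUSED_PRESET = {"gate_up_proj", "down_proj"}
SPLIT_PRESET = {"gate_proj", "up_proj", "down_proj"}

# Captures the expert index and the projection name of an expert weight.
EXPERT_PARAM = re.compile(r"\.experts\.(\d+)\.(\w+)\.weight$")


def experts_matching(preset, names):
    """Return sorted indices of experts whose projections exactly match the preset."""
    found = {}
    for name in names:
        m = EXPERT_PARAM.search(name)
        if m:
            found.setdefault(int(m.group(1)), set()).add(m.group(2))
    return sorted(i for i, projs in found.items() if projs == preset)


fused_hits = experts_matching(FUSED_PRESET, PARAM_NAMES)
split_hits = experts_matching(SPLIT_PRESET, PARAM_NAMES)
print("fused preset matches experts:", fused_hits)
print("split preset matches experts:", split_hits)
```

Under these assumed names, the fused preset matches no experts while the split preset matches both, which mirrors the detection failure described above.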

This PR updates the deepseek_v3 preset to match the real model structure and adds regression tests for:

- DeepSeekV3-style MoE detection
- ModuleList expert repacking
- the old fused expert assumption failing on the same layout
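As a rough picture of what repacking ModuleList experts can look like, the sketch below stacks the per-expert `gate_proj`, `up_proj`, and `down_proj` weights into tensors with a leading expert dimension. `ToyExpert` and `repack_experts` are hypothetical names, and the stacked layout is an assumption made for illustration; the layout AutoEP actually produces is not described in this PR text.

```python
import torch
from torch import nn


class ToyExpert(nn.Module):
    """Hypothetical stand-in for one expert with split projections."""

    def __init__(self, hidden, inter):
        super().__init__()
        self.gate_proj = nn.Linear(hidden, inter, bias=False)
        self.up_proj = nn.Linear(hidden, inter, bias=False)
        self.down_proj = nn.Linear(inter, hidden, bias=False)


def repack_experts(experts):
    """Stack per-expert weights so the first dimension indexes the expert.

    nn.Linear stores its weight as [out_features, in_features], so the
    results are gate/up: [E, inter, hidden] and down: [E, hidden, inter].
    """
    gate = torch.stack([e.gate_proj.weight.detach() for e in experts])
    up = torch.stack([e.up_proj.weight.detach() for e in experts])
    down = torch.stack([e.down_proj.weight.detach() for e in experts])
    return gate, up, down


# Four toy experts held in a ModuleList, as in the split layout.
experts = nn.ModuleList([ToyExpert(hidden=8, inter=16) for _ in range(4)])
gate, up, down = repack_experts(experts)
print(gate.shape, up.shape, down.shape)
```

Stacking keeps each expert's weights recoverable by indexing the first dimension, which makes it straightforward to check the repacked tensors against the original modules in a regression test.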

Test Result

root@b949db7a21db:/workspace/DeepSpeed_woo# python3.13 -m pytest tests/unit/moe/test_autoep_unit.py -k "test_autodetect_deepseek_v3_module_list_experts or test_repack_module_list_separate_gate_and_up"
=================================================================== test session starts ===================================================================
platform linux -- Python 3.13.5, pytest-9.0.3, pluggy-1.6.0 -- /usr/bin/python3.13
cachedir: .pytest_cache
rootdir: /workspace/DeepSpeed_woo/tests
configfile: pytest.ini
plugins: anyio-4.13.0
collected 67 items / 65 deselected / 2 selected

tests/unit/moe/test_autoep_unit.py::TestMoEDetection::test_autodetect_deepseek_v3_module_list_experts PASSED                                        [ 50%]
tests/unit/moe/test_autoep_unit.py::TestWeightRepacking::test_repack_module_list_separate_gate_and_up PASSED                                        [100%]

==================================================================== warnings summary =====================================================================
../../usr/local/lib/python3.13/dist-packages/torch/jit/_script.py:365: 14 warnings
  /usr/local/lib/python3.13/dist-packages/torch/jit/_script.py:365: DeprecationWarning: `torch.jit.script_method` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

unit/moe/test_autoep_unit.py::TestMoEDetection::test_autodetect_deepseek_v3_module_list_experts
  /workspace/DeepSpeed_woo/tests/conftest.py:47: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
    warnings.warn(

unit/moe/test_autoep_unit.py::TestMoEDetection::test_autodetect_deepseek_v3_module_list_experts
  /workspace/DeepSpeed_woo/tests/conftest.py:54: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================================== slowest durations ====================================================================

(6 durations < 1s hidden.)
====================================================== 2 passed, 65 deselected, 16 warnings in 1.67s ======================================================
root@b949db7a21db:/workspace/DeepSpeed_woo#
root@b949db7a21db:/workspace/DeepSpeed_woo# python3.13 -m pytest tests/unit/moe/test_autoep_unit.py --collect-only -k "test_deepseek_v3_old_fused_preset_assumption_fails or test_deepseek_v3_split_expert_names_detect_successfully"
=================================================================== test session starts ===================================================================
platform linux -- Python 3.13.5, pytest-9.0.3, pluggy-1.6.0 -- /usr/bin/python3.13
cachedir: .pytest_cache
rootdir: /workspace/DeepSpeed_woo/tests
configfile: pytest.ini
plugins: anyio-4.13.0
collected 67 items / 65 deselected / 2 selected

<Dir tests>
  <Package unit>
    <Dir moe>
      <Module test_autoep_unit.py>
        Unit tests for AutoEP feature (all phases append test classes here).
        <Class TestMoEDetection>
          Phase 3 tests for MoE layer detection.
          <Function test_deepseek_v3_old_fused_preset_assumption_fails>
            A fused gate_up_proj assumption cannot detect DeepSeekV3 ModuleList experts.
          <Function test_deepseek_v3_split_expert_names_detect_successfully>
            Using split gate/up names detects the same DeepSeekV3 ModuleList experts.

==================================================================== warnings summary =====================================================================
../../usr/local/lib/python3.13/dist-packages/torch/jit/_script.py:365: 14 warnings
  /usr/local/lib/python3.13/dist-packages/torch/jit/_script.py:365: DeprecationWarning: `torch.jit.script_method` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================================================== 2/67 tests collected (65 deselected) in 1.84s ======================================================
root@b949db7a21db:/workspace/DeepSpeed_woo#

Signed-off-by: nathon <leejianwoo@gmail.com>
nathon-lee requested a review from tohtana as a code owner April 17, 2026 06:39
tohtana commented May 14, 2026

Hi @nathon-lee,
Thank you for catching this issue! Sorry I missed this PR for so long. Because I had made a lot of changes in the meantime, I opened and merged this fix as a separate PR.
I think the main AutoEP PR is almost ready.