[Sequence Parallelism] Add AutoSP scaffolding for multimodal models (ViT + LLM) #8

Draft
nathon-lee wants to merge 1 commit into multimodal-seq-parallel from multimodal-seq-parallel-draft

@nathon-lee
Owner

Summary
This PR introduces an initial AutoSP path for multimodal sequence parallelism in DeepSpeed. It adds model auto-detection and automatic attention wrapping for ViT and LLM branches, plus a fusion adapter skeleton for cross-modal sequence reshaping.

Motivation
Multimodal training produces much longer effective sequences than text-only training, because visual tokens are added to the text tokens, so sequence parallelism (SP) is critical for throughput and memory efficiency. Enabling SP across both the vision and language branches currently requires substantial manual engineering.

What is included

- AutoSP detector for multimodal architectures
  - detects ViT attention modules
  - detects LLM attention modules
  - detects vision-language projection module candidate
- ViT SP wrapper
  - adds UlyssesSPViTAttention
  - supports cls token replication behavior
  - preserves wrapped module tuple outputs
- AutoSP entrypoint
  - adds auto_wrap_model_for_sp(model, process_group)
  - performs in-place module wrapping
- Fusion adapter scaffold
  - adds ModalityFusionSPAdapter interface and SP gather/scatter flow
  - keeps token splicing architecture-specific via override hook
- Exports
  - exposes AutoSP APIs from deepspeed.sequence
- Tests
  - adds unit tests for detector/wrapper/auto-wrap behavior
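
The detect-then-wrap-in-place flow listed above can be pictured with a small, dependency-free sketch. This is not the PR's implementation: `Node`, `Attention`, and `SPWrapper` are hypothetical stand-ins for `torch.nn.Module`, an attention layer, and an SP wrapper, and the path-keyword heuristic in `classify` is only an assumption about how a branch might be recognized.

```python
# Hypothetical stand-ins; real code would walk a torch.nn.Module tree.
class Node:
    """A module with named children."""
    def __init__(self, **children):
        self.children = dict(children)


class Attention(Node):
    """Marks a module the detector should treat as attention."""


class SPWrapper(Node):
    """Wraps an attention module and records which branch it belongs to."""
    def __init__(self, inner, branch):
        super().__init__()  # exposes no children, so a second pass skips it
        self.inner = inner
        self.branch = branch


# Assumed naming conventions; real models name their towers differently.
BRANCH_KEYWORDS = {
    "vit": ("vision_tower", "visual", "vit"),
    "llm": ("language_model", "llm"),
}


def classify(path):
    """Return 'vit', 'llm', or None based on the dotted module path."""
    parts = path.split(".")
    for branch, keywords in BRANCH_KEYWORDS.items():
        if any(part in keywords for part in parts):
            return branch
    return None


def auto_wrap(root, prefix=""):
    """Replace detected attention modules in place; return what was wrapped."""
    wrapped = []
    for name, child in list(root.children.items()):
        path = f"{prefix}.{name}" if prefix else name
        branch = classify(path)
        if isinstance(child, Attention) and branch is not None:
            root.children[name] = SPWrapper(child, branch)
            wrapped.append((path, branch))
        else:
            wrapped.extend(auto_wrap(child, path))
    return wrapped


model = Node(
    vision_tower=Node(block0=Node(attn=Attention())),
    projector=Node(),
    language_model=Node(layer0=Node(self_attn=Attention())),
)
first_pass = auto_wrap(model)
second_pass = auto_wrap(model)  # nothing left to wrap
```

In this sketch the wrapper hides the original module from later traversals, so calling `auto_wrap` twice does not double-wrap. Whether the PR's entrypoint is idempotent in the same way is not stated in the description.
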
What is not included

- Architecture-specific visual token splice implementations (LLaVA/InternVL/Qwen2-VL) are not part of this PR and will be added in follow-up work.
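
To make the split of responsibilities concrete, here is a minimal single-process sketch of a gather, splice, scatter flow with an override hook. It is an illustration under assumptions, not the `ModalityFusionSPAdapter` code: the class names, the `<image>` placeholder rule, and the `<pad>` padding convention are all hypothetical, and Python lists stand in for sequence-sharded tensors and collective calls.

```python
class FusionAdapterSketch:
    """Owns the gather/scatter flow; subclasses supply the splice rule."""

    def __init__(self, world_size, pad_token="<pad>"):
        self.world_size = world_size
        self.pad_token = pad_token  # assumed padding convention

    def gather(self, shards):
        # Stand-in for an all-gather along the sequence dimension:
        # concatenate per-rank shards in rank order.
        return [token for shard in shards for token in shard]

    def scatter(self, sequence):
        # Pad to a multiple of world_size, then hand each rank one
        # contiguous chunk of equal length.
        remainder = len(sequence) % self.world_size
        if remainder:
            sequence = sequence + [self.pad_token] * (self.world_size - remainder)
        chunk = len(sequence) // self.world_size
        return [sequence[r * chunk:(r + 1) * chunk] for r in range(self.world_size)]

    def splice(self, visual_tokens, text_tokens):
        # The hook a model-specific adapter would override.
        raise NotImplementedError("splicing is architecture-specific")

    def fuse(self, visual_shards, text_shards):
        fused = self.splice(self.gather(visual_shards), self.gather(text_shards))
        return self.scatter(fused)


class PlaceholderSplice(FusionAdapterSketch):
    """Example override: expand each '<image>' placeholder into the visual tokens."""

    def splice(self, visual_tokens, text_tokens):
        fused = []
        for token in text_tokens:
            if token == "<image>":
                fused.extend(visual_tokens)
            else:
                fused.append(token)
        return fused


adapter = PlaceholderSplice(world_size=2)
shards = adapter.fuse(
    visual_shards=[["v0", "v1"], ["v2", "v3"]],
    text_shards=[["<bos>", "<image>"], ["describe", "this"]],
)
# shards[0] is rank 0's slice of the fused sequence, shards[1] is rank 1's.
```

Keeping `splice` as the only overridable step means the sharding logic is written once, while each architecture decides where its visual tokens land in the text sequence.
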
Compatibility and risk

- No behavior change unless users explicitly call auto_wrap_model_for_sp
- Current implementation is additive and isolated to new sequence modules
- Fusion logic remains opt-in and extensible
Validation

- Added unit tests in test_autosp.py
- Verified no API break in existing sequence module import paths
Follow-ups

- Add model-specific fusion splice adapters
- Add end-to-end multimodal SP integration tests
- Add benchmark report (throughput/memory/scaling)

Signed-off-by: leejane <121294318@qq.com>
@nathon-lee marked this pull request as draft April 23, 2026 06:07