[Sequence Parallelism] Add AutoSP scaffolding for multimodal models (ViT + LLM) by nathon-lee · Pull Request #8 · nathon-lee/DeepSpeed_woo

nathon-lee · 2026-04-23T04:44:13Z

Summary
This PR introduces an initial AutoSP path for multimodal sequence parallelism in DeepSpeed. It adds model auto-detection and automatic attention wrapping for ViT and LLM branches, plus a fusion adapter skeleton for cross-modal sequence reshaping.

Motivation
Multimodal training produces much longer effective sequence lengths than text-only training, because image patch tokens are added to the text sequence. This makes sequence parallelism (SP) critical for throughput and memory efficiency. Existing workflows require substantial manual engineering to enable SP across both the vision and language branches.

What is included

AutoSP detector for multimodal architectures
  detects ViT attention modules
  detects LLM attention modules
  detects candidate vision-language projection modules
ViT SP wrapper
  adds UlyssesSPViTAttention
  supports CLS token replication
  preserves the wrapped module's tuple outputs
AutoSP entrypoint
  adds auto_wrap_model_for_sp(model, process_group)
  performs in-place module wrapping
Fusion adapter scaffold
  adds the ModalityFusionSPAdapter interface and the SP gather/scatter flow
  keeps token splicing architecture-specific via an override hook
Exports
  exposes AutoSP APIs from deepspeed.sequence
Tests
  adds unit tests for detector, wrapper, and auto-wrap behavior
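
As a rough illustration of the detect-then-wrap flow listed above, the sketch below walks a PyTorch module tree, selects modules by class name, and swaps each one for a wrapper with setattr. Everything in it is assumed for illustration: SPAttentionWrapper, find_attention_modules, wrap_in_place, the name hints, and the toy model are hypothetical and are not the PR's UlyssesSPViTAttention or auto_wrap_model_for_sp. The wrapper only delegates to the inner module and performs no communication.

```python
import torch
import torch.nn as nn


class SPAttentionWrapper(nn.Module):
    """Hypothetical stand-in for an SP attention wrapper (not the PR's class)."""

    def __init__(self, inner: nn.Module):
        super().__init__()
        self.inner = inner

    def forward(self, *args, **kwargs):
        # A real wrapper would redistribute the sequence across ranks around
        # this call. Here we only delegate, so tuple outputs pass through as-is.
        return self.inner(*args, **kwargs)


def find_attention_modules(model: nn.Module, name_hints=("attn", "attention")):
    """Return (qualified_name, module) pairs whose class name matches a hint."""
    found, covered = [], []
    for name, module in model.named_modules():
        # Skip the root and anything nested under a selected or wrapped module.
        if not name or any(name.startswith(p + ".") for p in covered):
            continue
        if isinstance(module, SPAttentionWrapper):
            covered.append(name)
            continue
        if any(h in type(module).__name__.lower() for h in name_hints):
            found.append((name, module))
            covered.append(name)
    return found


def wrap_in_place(model: nn.Module) -> int:
    """Replace each detected attention module with a wrapper; return the count."""
    targets = find_attention_modules(model)
    for name, module in targets:
        parent_name, _, child_name = name.rpartition(".")
        parent = model.get_submodule(parent_name) if parent_name else model
        setattr(parent, child_name, SPAttentionWrapper(module))
    return len(targets)


class ToyAttention(nn.Module):
    """Toy attention-like module that returns a tuple, as many real ones do."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return self.proj(x), None


class ToyBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.attn = ToyAttention(dim)
        self.mlp = nn.Linear(dim, dim)

    def forward(self, x):
        attn_out, _ = self.attn(x)
        return x + self.mlp(attn_out)


model = nn.Sequential(ToyBlock(8), ToyBlock(8))
wrapped = wrap_in_place(model)  # number of attention modules replaced
```

Matching on class names is a heuristic chosen here for brevity. A second call to wrap_in_place on the same model finds nothing new, because modules that are already wrapped, and everything beneath them, are skipped.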
What is not included

Architecture-specific visual token splice implementations (LLaVA/InternVL/Qwen2-VL) are not part of this PR and will be added in follow-up work.
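
To make the division of labor concrete, here is a minimal pure-Python sketch of a gather, splice, scatter flow in which only the splice step is architecture-specific. It is not the PR's ModalityFusionSPAdapter: the class names, the fuse and splice methods, the "<image>" placeholder rule, and the use of plain lists in place of per-rank tensors and collectives are all assumptions made for illustration.

```python
from typing import List, Sequence

Token = str  # stand-in for an embedding vector


def scatter(tokens: Sequence[Token], world_size: int) -> List[List[Token]]:
    """Split a full sequence into contiguous per-rank shards (last may be shorter)."""
    chunk = -(-len(tokens) // world_size)  # ceiling division
    return [list(tokens[r * chunk:(r + 1) * chunk]) for r in range(world_size)]


def gather(shards: Sequence[Sequence[Token]]) -> List[Token]:
    """Concatenate per-rank shards back into the full sequence, in rank order."""
    return [tok for shard in shards for tok in shard]


class FusionAdapterSketch:
    """Hypothetical base class: gather -> splice -> scatter, with splice as the hook."""

    def __init__(self, world_size: int):
        self.world_size = world_size

    def splice(self, text: List[Token], vision: List[Token]) -> List[Token]:
        # Architecture-specific step, left to subclasses.
        raise NotImplementedError("override in a model-specific subclass")

    def fuse(self, text_shards, vision_shards) -> List[List[Token]]:
        text = gather(text_shards)      # full text sequence
        vision = gather(vision_shards)  # full vision sequence
        fused = self.splice(text, vision)
        return scatter(fused, self.world_size)


class PlaceholderSplice(FusionAdapterSketch):
    """Toy rule: replace the first "<image>" placeholder with all vision tokens."""

    def splice(self, text, vision):
        i = text.index("<image>")
        return text[:i] + list(vision) + text[i + 1:]


adapter = PlaceholderSplice(world_size=2)
fused_shards = adapter.fuse(
    text_shards=[["hi", "<image>"], ["what", "?"]],
    vision_shards=[["v0", "v1"], ["v2", "v3"]],
)
# fused_shards == [["hi", "v0", "v1", "v2"], ["v3", "what", "?"]]
```

Keeping the splice rule in a subclass means the shared gather and scatter logic does not need to change when a new model family is added. The base class raises NotImplementedError until a model-specific rule is supplied.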
Compatibility and risk

No behavior change unless users explicitly call auto_wrap_model_for_sp
Current implementation is additive and isolated to new sequence modules
Fusion logic remains opt-in and extensible
Validation

Added unit tests in test_autosp.py
Verified no API break in existing sequence module import paths
Follow-ups

Add model-specific fusion splice adapters
Add end-to-end multimodal SP integration tests
Add benchmark report (throughput/memory/scaling)

Signed-off-by: leejane <121294318@qq.com>

feat: Add AutoSP scaffolding for multimodal sequence parallelism

d5c63a8

Signed-off-by: leejane <121294318@qq.com>

nathon-lee marked this pull request as draft April 23, 2026 06:07

nathon-lee wants to merge 1 commit into multimodal-seq-parallel from multimodal-seq-parallel-draft
