[megatron] support gemma4 megatron by Jintao-Huang · Pull Request #9296 · modelscope/ms-swift

Jintao-Huang · 2026-05-09T07:51:30Z

No description provided.

gemini-code-assist

Code Review

This pull request updates the documentation to reflect support for Gemma 4 models and refactors embedding handling in Megatron utilities to support multiple modules during device conversion. In swift/model/models/gemma.py, a suggestion was made to use inputs_embeds.device instead of multimodal_mask.device when moving the pad_embedding tensor to ensure better robustness and consistency across operands in the subsequent torch.where call.

gemini-code-assist · 2026-05-09T07:53:12Z


        if self.config.get_text_config().hidden_size_per_layer_input:
            pad_embedding = self.language_model.embed_tokens.weight[self.config.text_config.pad_token_id, :]
+            pad_embedding = pad_embedding.to(multimodal_mask.device)


Using inputs_embeds.device as the target for the .to() call is generally more robust than multimodal_mask.device. Since inputs_embeds is the primary tensor representing the hidden states in this operation, it serves as the most reliable reference for the execution device, ensuring consistency across all operands in the subsequent torch.where call.

Suggested change

-            pad_embedding = pad_embedding.to(multimodal_mask.device)
+            pad_embedding = pad_embedding.to(inputs_embeds.device)
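
For readers who want to see the pattern in isolation, here is a minimal sketch of the device alignment discussed in this review. It is not the code from swift/model/models/gemma.py: the helper name replace_multimodal_with_pad, the tensor shapes, and the exact form of the torch.where call are assumptions made for illustration. The mask is also moved to the same device here, since torch.where generally expects its tensor operands to share a device.

```python
import torch


def replace_multimodal_with_pad(inputs_embeds, multimodal_mask, embed_weight, pad_token_id):
    """Hypothetical helper: overwrite multimodal positions with the pad-token embedding."""
    # Row of the embedding table for the pad token, shape (hidden,)
    pad_embedding = embed_weight[pad_token_id, :]
    # Use the hidden states as the device reference, as the review suggests
    pad_embedding = pad_embedding.to(inputs_embeds.device)
    # Assumption: the mask is (batch, seq) bool; expand to (batch, seq, 1) so it broadcasts
    mask = multimodal_mask.to(inputs_embeds.device)[..., None]
    return torch.where(mask, pad_embedding, inputs_embeds)


# Toy data on CPU: vocab=4, hidden=3, batch=1, seq=2
embed_weight = torch.arange(12, dtype=torch.float32).reshape(4, 3)
inputs_embeds = torch.ones(1, 2, 3)
multimodal_mask = torch.tensor([[True, False]])

out = replace_multimodal_with_pad(inputs_embeds, multimodal_mask, embed_weight, pad_token_id=0)
```

In this toy run, the first position is marked as multimodal and takes the pad row of the embedding table, while the second position keeps its original hidden state. When the embedding table lives on a different device from the hidden states (for example under model sharding), the .to() call is what keeps the operands on one device.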

support gemma4 megatron

152593c

gemini-code-assist Bot reviewed May 9, 2026


Jintao-Huang added 3 commits May 11, 2026 14:27

Merge branch 'main' into support_gemma4_megatron

5086cdb

Merge branch 'main' into support_gemma4_megatron

5c9f86b

Merge branch 'main' into support_gemma4_megatron

e732b71

hjh0119 approved these changes May 18, 2026


Jintao-Huang added 2 commits May 19, 2026 01:59

Merge branch 'main' into support_gemma4_megatron

f37f967

update

d7c2f28


Jintao-Huang wants to merge 6 commits into modelscope:main from Jintao-Huang:support_gemma4_megatron
