Fix resized LM head weights being overwritten by post_init #45079
Merged
ArthurZucker merged 1 commit into huggingface:main on Apr 2, 2026
Conversation
When `tie_word_embeddings=False`, `_get_resized_lm_head()` creates a new `nn.Linear` without `_is_hf_initialized`, causing `post_init()` to reinitialize its weights. Set the flag after weight copying is done. Fixes huggingface#35141
7b10fd4 to e59da7d (Compare)
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
ArthurZucker
approved these changes
Mar 31, 2026
Collaborator
ArthurZucker
left a comment
Good catch! thanks 🤗
marvinzh pushed a commit to marvinzh/transformers that referenced this pull request on Apr 3, 2026
…ce#45079) When `tie_word_embeddings=False`, `_get_resized_lm_head()` creates a new `nn.Linear` without `_is_hf_initialized`, causing `post_init()` to reinitialize its weights. Set the flag after weight copying is done. Fixes huggingface#35141
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request on Apr 4, 2026
…ce#45079) When `tie_word_embeddings=False`, `_get_resized_lm_head()` creates a new `nn.Linear` without `_is_hf_initialized`, causing `post_init()` to reinitialize its weights. Set the flag after weight copying is done. Fixes huggingface#35141
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request on Apr 18, 2026
…ce#45079) When `tie_word_embeddings=False`, `_get_resized_lm_head()` creates a new `nn.Linear` without `_is_hf_initialized`, causing `post_init()` to reinitialize its weights. Set the flag after weight copying is done. Fixes huggingface#35141
What does this PR do?
Fixes #35141
When `tie_word_embeddings=False`, calling `resize_token_embeddings()` then `post_init()` overwrites the LM head weights with random values. This happens because `_get_resized_lm_head()` returns a new `nn.Linear` without setting `_is_hf_initialized`, so `post_init()` treats it as uninitialized.

The fix adds `new_lm_head._is_hf_initialized = True` at the end of `_get_resized_lm_head()`, after all weight copying is done.

`_get_resized_embeddings()` doesn't need this because it reuses the original module object.
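For reference, a minimal sketch of the change, assuming a simplified shape of `_get_resized_lm_head()` (the real method in `modeling_utils.py` also handles transposed heads and other cases omitted here; variable names are illustrative):

```python
from torch import nn

def _get_resized_lm_head(self, old_lm_head, new_num_tokens=None):
    # Simplified sketch; the actual implementation handles more cases.
    if new_num_tokens is None:
        return old_lm_head

    # A brand-new nn.Linear is allocated for the resized head ...
    new_lm_head = nn.Linear(
        old_lm_head.in_features,
        new_num_tokens,
        bias=old_lm_head.bias is not None,
        device=old_lm_head.weight.device,
        dtype=old_lm_head.weight.dtype,
    )

    # ... and the existing rows (and bias entries) are copied over.
    num_to_copy = min(old_lm_head.out_features, new_num_tokens)
    new_lm_head.weight.data[:num_to_copy, :] = old_lm_head.weight.data[:num_to_copy, :]
    if old_lm_head.bias is not None:
        new_lm_head.bias.data[:num_to_copy] = old_lm_head.bias.data[:num_to_copy]

    # Fix: mark the module as already initialized so a later post_init()
    # does not re-run weight init on it and overwrite the copied values.
    new_lm_head._is_hf_initialized = True
    return new_lm_head
```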
Test added to `ModelTesterMixin` in `test_modeling_common.py`, following the existing `test_resize_embeddings_untied` pattern (a rough sketch of the check is shown below). Verified on GPT2, LLaMA, and OPT.

Supersedes #36221 (stale since Feb 2025, reviewer feedback never addressed).
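As a standalone illustration only (the helper name and structure below are assumptions, not the code added by this PR), a check along these lines would catch the regression:

```python
import torch

def check_resized_lm_head_survives_post_init(model_class, config):
    # Untie the embeddings so the LM head is its own nn.Linear.
    config.tie_word_embeddings = False
    model = model_class(config)

    # Grow the vocabulary; the resized head should keep the copied weights.
    model.resize_token_embeddings(config.vocab_size + 10)
    lm_head_before = model.get_output_embeddings().weight.clone()

    # Before this fix, post_init() re-initialized the freshly created head,
    # replacing the copied weights with random values.
    model.post_init()
    lm_head_after = model.get_output_embeddings().weight

    torch.testing.assert_close(lm_head_before, lm_head_after)
```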
Who can review?
@Rocketknight1 @ArthurZucker