Align default pipeline generation parameters with python transformers#1633

Closed
sroussey wants to merge 1 commit into huggingface:main from sroussey:fix-1632
Conversation

@sroussey
Contributor

@sroussey sroussey commented Apr 3, 2026

Enhance pipeline configurations by setting default parameters for max_new_tokens, num_beams, do_sample, and temperature in Automatic Speech Recognition, Document Question Answering, Text Generation, and Text2Text Generation pipelines.

Closes #1632
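The intent of the change can be sketched as follows. This is a hypothetical illustration, not the actual Transformers.js internals: the object keys match the parameters named in the description, but the helper name and the default values shown are illustrative.

```javascript
// Illustrative sketch of pipeline-level generation defaults
// (names and values are examples, not the library's real code).
const DEFAULT_GENERATION_PARAMS = {
  max_new_tokens: 256, // example value; the PR sets per-pipeline defaults
  num_beams: 1,
  do_sample: false,
  temperature: 1.0,
};

// Defaults apply first; anything the caller passes wins.
function withDefaults(userOptions = {}) {
  return { ...DEFAULT_GENERATION_PARAMS, ...userOptions };
}
```

With this merge order, `withDefaults({ temperature: 0.7 })` keeps the user's temperature while still filling in `num_beams` and the rest.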

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@nico-martin
Collaborator

There are a couple of failing tests. But beyond that, changing the default behaviour (especially for do_sample and temperature) is definitely a breaking change. Although I really like this change, @xenova, are we sure we want it in a minor or even patch release?

@xenova
Collaborator

xenova commented Apr 10, 2026

Hmm, yeah, I didn't think about the do_sample: true default; this could be a bit confusing. The max_new_tokens default I think is pretty safe, because the current approach (inherited from the Python version) is to use max_length: 20, which doesn't account for the current sequence length: it could generate only one token and then stop, because the prompt is already longer than 20 tokens.
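The arithmetic behind this point can be sketched like so. The helper is hypothetical, written only to show the difference between a total-length budget and a new-token budget:

```javascript
// Illustrative only: max_length is a total budget that includes the
// prompt, while max_new_tokens budgets only the generated continuation.
function remainingTokens(promptLength, { max_length, max_new_tokens }) {
  if (max_new_tokens !== undefined) return max_new_tokens;
  // max_length counts the prompt; at least one token is still emitted
  // before generation stops (matching the behaviour described above).
  return Math.max(1, max_length - promptLength);
}

remainingTokens(50, { max_length: 20 });     // 1 — stops almost immediately
remainingTokens(50, { max_new_tokens: 128 }); // 128
```

So under the old `max_length: 20` default, any prompt longer than 20 tokens effectively kills generation, which is why switching to `max_new_tokens` is the safer default.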

@xenova
Collaborator

xenova commented Apr 10, 2026

if anything, this should have been part of v4 😅 (and I forgot/delayed)

many models these days ship their own generation config (sometimes with do_sample: true), so the behaviour should only be considered "fully specified" if the user is the one setting the property
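The precedence described above can be sketched with plain object spreads. The function name is hypothetical, not the library's API; only the ordering matters: library defaults, then the model's own generation config, then the user's explicit options.

```javascript
// Hypothetical helper showing config precedence (lowest to highest):
// library defaults < model's generation_config.json < user options.
function resolveGenerationConfig(libraryDefaults, modelConfig, userOptions) {
  return { ...libraryDefaults, ...modelConfig, ...userOptions };
}

const resolved = resolveGenerationConfig(
  { do_sample: false, temperature: 1.0 }, // library default
  { do_sample: true, temperature: 0.6 },  // shipped with the model
  { temperature: 0.2 },                   // explicitly set by the user
);
// resolved: { do_sample: true, temperature: 0.2 }
```

The user's `temperature: 0.2` wins, but `do_sample: true` from the model's config survives because the user never touched it.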

Collaborator

@xenova xenova left a comment


Thanks for the PR! I think the main fix we need is max_new_tokens, and although do_sample and temperature are less important, we can still include them for consistency. Marking as v4.1 is okay, as this probably should have been done in v4.0.

to better align with the python library (see PR)

maybe we should introduce _default_generation_config on the class itself and use it when we construct the generation config that's actually used.
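A minimal sketch of that suggestion, assuming hypothetical class names and illustrative default values (this is not the actual Transformers.js implementation): each pipeline class declares its own static `_default_generation_config`, and a shared method merges it under whatever the caller passes.

```javascript
// Hypothetical base class: subclasses override the static defaults,
// and user options always take precedence over them.
class Pipeline {
  static _default_generation_config = {};
  buildGenerationConfig(userOptions = {}) {
    return { ...this.constructor._default_generation_config, ...userOptions };
  }
}

class TextGenerationPipeline extends Pipeline {
  static _default_generation_config = {
    max_new_tokens: 256, // illustrative value
    num_beams: 1,
    do_sample: false,
  };
}

const pipe = new TextGenerationPipeline();
const config = pipe.buildGenerationConfig({ do_sample: true });
// config keeps num_beams: 1 and max_new_tokens: 256, with do_sample: true
```

Keeping the defaults on the class rather than hard-coding them in each call site means every pipeline type can state its own defaults in one place.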

@huggingface huggingface deleted a comment from sroussey Apr 15, 2026
@xenova xenova changed the title Fix Issue 1632 Align default pipeline generation parameters with python transformers Apr 15, 2026
@xenova
Collaborator

xenova commented Apr 15, 2026

merged via #1649

@xenova xenova closed this Apr 15, 2026