support qwen3 on nvidia by icyxp · Pull Request #3302 · huggingface/text-generation-inference

icyxp · 2025-07-23T08:04:05Z

Support Qwen3 on Nvidia

NikiBase · 2025-07-28T11:09:53Z

Wrong URL: https://huggingface.co/collections/Qwen/qwen3-67c6c6f89c4f76621268bb6d
I think it should be this one: https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f

fix qwen3 url

NikiBase · 2025-07-29T08:30:11Z

Hi @icyxp, I tried your pull request locally using the image with the llamacpp backend, and it does not work:
https://huggingface.co/docs/text-generation-inference/main/en/backends/llamacpp#build-docker-image
I updated the llamacpp version in Dockerfile_llamacpp to the latest version, but it still does not work for the Qwen3 model. Maybe I need to change something else, but I am not sure. Could you please explain how you run your pull request?
Thank you in advance.

icyxp · 2025-07-29T09:36:55Z

@NikiBase Please use the Dockerfile; this is not tested on llamacpp. Maybe you can check out this: https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html

NikiBase · 2025-07-29T10:43:25Z

I am trying to run the Docker build from the base Dockerfile, and I got the following error at this stage:

ERROR [base 10/26] RUN cd server &&  uv sync --frozen --extra gen --extra bnb --extra accelerate --extra compressed-tensors --extra quantize --extra peft --extra outlines --extra t

...

583.8 requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/resolve-cache/models/kernels-community/moe/e3efab933893cde20c5417ba185fa3b7cc811b24/build%2Ftorch27-cxx11-cu128-x86_64-linux%2Fmoe%2Fconfigs%2FE%3D8%2CN%3D8192%2Cdevice_name%3DAMD_Instinct_MI325X%2Cdtype%3Dfp8_w8a8.json

Did you experience this issue during the build phase?

I am running it again to see if it is a temporary connection issue; I will let you know if that resolves it.

Thank you!
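For reference, the "run it again" approach to an HTTP 429 can be sketched generically as retry with exponential backoff. This is only an illustration under stated assumptions: `fetch_with_backoff` and its parameters are hypothetical names, not part of text-generation-inference, `uv`, or the `kernels` package. In the build above the download happens inside `uv sync`, so in practice the retry means re-running `docker build`.

```python
import random
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, max_attempts=5, base_delay=1.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`, retrying only on HTTP 429 (hypothetical helper).

    `opener` and `sleep` are injectable so the logic can be exercised
    without network access or real waiting.
    """
    for attempt in range(max_attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not rate limiting, or the final failure.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            # Honor a numeric Retry-After header if the server sent one.
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after and retry_after.isdigit():
                delay = float(retry_after)
            else:
                # Exponential backoff (1x, 2x, 4x, ... base_delay) plus jitter.
                delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            sleep(delay)
```

If the 429 persists across several spaced-out attempts, it is more likely a rate limit on the client than a transient connection problem.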

support qwen3 on nvidia

bb61a23

Update __init__.py

5d44bdd

fix qwen3 url

icyxp wants to merge 2 commits into huggingface:main from icyxp:main
