This repository was archived by the owner on Mar 21, 2026. It is now read-only.

support qwen3 on nvidia #3302

Open
icyxp wants to merge 2 commits into huggingface:main from icyxp:main


icyxp commented Jul 23, 2025

Support Qwen3 on Nvidia

@NikiBase

Wrong URL: https://huggingface.co/collections/Qwen/qwen3-67c6c6f89c4f76621268bb6d
I think it should be this one: https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f

fix qwen3 url
@NikiBase

Hi @icyxp, I tried your pull request locally using the image with the llamacpp backend, and it does not work:
https://huggingface.co/docs/text-generation-inference/main/en/backends/llamacpp#build-docker-image
I updated the llamacpp version in Dockerfile_llamacpp to the latest version, but it still does not work for the Qwen3 model. Maybe I need to change something else, but I am not sure what. Could you please explain how you run your pull request?
Thank you in advance.


icyxp commented Jul 29, 2025

@NikiBase Please use the Dockerfile; I have not tested this on llamacpp. You could also check out this guide: https://qwen.readthedocs.io/en/latest/run_locally/llama.cpp.html


NikiBase commented Jul 29, 2025

I am trying to run the docker build from the base Dockerfile, and I get the following error at this stage:

ERROR [base 10/26] RUN cd server &&  uv sync --frozen --extra gen --extra bnb --extra accelerate --extra compressed-tensors --extra quantize --extra peft --extra outlines --extra t

...

583.8 requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/resolve-cache/models/kernels-community/moe/e3efab933893cde20c5417ba185fa3b7cc811b24/build%2Ftorch27-cxx11-cu128-x86_64-linux%2Fmoe%2Fconfigs%2FE%3D8%2CN%3D8192%2Cdevice_name%3DAMD_Instinct_MI325X%2Cdtype%3Dfp8_w8a8.json
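The file path in the failing URL is percent-encoded, which makes it hard to see which file triggered the 429. A minimal sketch for decoding it with the Python standard library follows; the variable names are illustrative, and reading the path segments before the file name as a repository name and a commit hash is an assumption based on how the URL looks, not something confirmed in this thread.

```python
from urllib.parse import unquote, urlsplit

# URL copied verbatim from the build log above.
url = (
    "https://huggingface.co/api/resolve-cache/models/kernels-community/moe/"
    "e3efab933893cde20c5417ba185fa3b7cc811b24/"
    "build%2Ftorch27-cxx11-cu128-x86_64-linux%2Fmoe%2Fconfigs%2F"
    "E%3D8%2CN%3D8192%2Cdevice_name%3DAMD_Instinct_MI325X%2Cdtype%3Dfp8_w8a8.json"
)

# urlsplit leaves the path percent-encoded, so the last literal "/" separates
# the (assumed) repo/revision prefix from the encoded file path.
path = urlsplit(url).path
prefix, _, encoded_file = path.rpartition("/")

# Decode %2F -> "/", %3D -> "=", %2C -> "," in the file part only.
decoded_file = unquote(encoded_file)

print(prefix)
print(decoded_file)
```

The decoded file name mentions `AMD_Instinct_MI325X`, which suggests (but does not prove) that the step downloads config files for devices other than the one being targeted, so the number of requests may be larger than expected.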

Did you experience this issue during the build phase?

I am running it again to see whether it is a temporary connection issue; I will let you know if that resolves it.

Thank you!
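For reference, an HTTP 429 is usually a transient rate limit, so retrying after a growing delay is the common remedy. The sketch below shows generic exponential backoff with jitter; `RateLimited` and `retry_with_backoff` are hypothetical names introduced for illustration and are not part of text-generation-inference, `uv`, or any Hugging Face library. In a script that uses `requests`, the caller would need to translate a 429 `HTTPError` into `RateLimited` itself.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker raised by the caller when a request returns HTTP 429."""


def retry_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the rate-limit error
            # Delay doubles on every attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Inside a `docker build`, the simplest equivalent is to rerun the build after waiting a while, as described above.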
