This repository was archived by the owner on Mar 21, 2026. It is now read-only.
Pull requests: huggingface/text-generation-inference
Pull requests list
Remove dead bitsandbytes CxB code from 8-bit inference path
#3356
opened Feb 16, 2026 by
TimDettmers
1 task
test(config): add comprehensive tests for router config utilities
#3349
opened Dec 27, 2025 by
yurekami
2 tasks
fix: don't use kernel layernorm on Blackwell architecture to avoid "no kernel image" error
#3343
opened Dec 11, 2025 by
AdamPalaxo
1 of 5 tasks
feat: expose GPU energy consumption (mJ) in responses
#3315
opened Aug 28, 2025 by
JulienDelavande
2 of 5 tasks
Add dedicated CPU-only Dockerfile and update documentation for CPU/…
#3310
opened Aug 7, 2025 by
jakubgajski
2 of 5 tasks
Retrieve the correct cached model batch size in Neuron config checker for Neuron Backend
#3300
opened Jul 19, 2025 by
jimburtoft
3 tasks
Set uv UV_PYTHON_INSTALL_DIR explicitly
#3197
opened Apr 27, 2025 by
sebastianliebscher
1 of 5 tasks
Fix flashinfer plan call to use positional arguments for #3165
#3166
opened Apr 11, 2025 by
ruckc
2 of 5 tasks