Pull requests: vllm-project/vllm

Pull requests list

[Bugfix] Fix load balancer waiting count
Labels: bug, documentation, nvidia, performance, v1
#40634 opened Apr 22, 2026 by Himan-D
[Feature] Triton INT4 / INT2 per-token-head KV cache quantization
Labels: documentation, v1
#40633 opened Apr 22, 2026 by JartX Contributor
[Refactor] Unify 2D/3D kernels in triton_unified_attention
Labels: v1
#40631 opened Apr 22, 2026 by JartX Contributor
[Bugfix] Include inductor and functorch configs in compilation cache key
Labels: bug
#40627 opened Apr 22, 2026 by zou3519 Collaborator
Pr 39921
Labels: documentation, needs-rebase, nvidia, performance
#40626 opened Apr 22, 2026 by Himan-D
4 tasks
[CI] Split disaggregated tests into own test-area
Labels: ci/build, ready
#40623 opened Apr 22, 2026 by NickLucche Collaborator
Fix FireRedASR2 hallucination on non-speech audio
#40619 opened Apr 22, 2026 by Virtuoso461
4 tasks
[Bugfix] Quiet weight prefetch logs when executor is shutting down
Labels: bug
#40615 opened Apr 22, 2026 by zxuhan
[SpecDecode] Fix async proposer synchronization
Labels: v1
#40610 opened Apr 22, 2026 by voipmonitor Contributor Draft
[Core] Enable FP8 KV cache with DCP for MLA
#40609 opened Apr 22, 2026 by voipmonitor Contributor Draft
docs: remove outdated LD_PRELOAD instructions [CPU-Backend]
Labels: cpu, documentation
#40603 opened Apr 22, 2026 by specapoorv
Refactor INC quantization into package with INCScheme orchestrator
#40601 opened Apr 22, 2026 by yiliu30 Contributor
[WIP] Support ViT full CUDA graph for Kimi K2.5
Labels: nvidia, v1
#40600 opened Apr 22, 2026 by gty111 Contributor
4 tasks
fix: correct typo 'Hyrbid' to 'Hybrid' in test-amd.yaml
Labels: ci/build, rocm
#40598 opened Apr 22, 2026 by TheodorePTP
[Bugfix] Close ApiServer ZMQ bind race with wildcard bind + pipe-back
Labels: bug, v1
#40596 opened Apr 22, 2026 by jing-4369
Memshare
Labels: needs-rebase
#40595 opened Apr 22, 2026 by nancynigam Draft
4 tasks