Pull requests: microsoft/onnxruntime-genai
Pull requests list
Auto-detect fixed kv-cache shape in DefaultKeyValueCache
0.14.0
#2166
opened May 18, 2026 by
akholodnamdcom
Contributor
Add text-only mode support for Qwen 3.5 model builder
0.14.0
#2157
opened May 12, 2026 by
apsonawane
Contributor
Add per-run profiling config for fine-grained Run() profiling
#2152
opened May 9, 2026 by
xiaofeihan1
Contributor
Draft
Add VideoChat-Flash (OpenGVLab) language model support
0.14.0
#2147
opened May 8, 2026 by
anilmartha
Contributor
4 tasks
Expose mutable sampling parameters on live Generator
#2145
opened May 8, 2026 by
qjia7
Contributor
4 tasks done
Add HunYuan Dense V1 (hunyuan_v1_dense) model support
0.14.0
#2144
opened May 8, 2026 by
anilmartha
Contributor
fix: enable Generator.rewind_to(0) for multimodal models
#2141
opened May 8, 2026 by
justinchuby
Contributor
Draft
[Qwen3] Allow packed QKV MatMul under QK-Norm via post-MatMul Split
0.14.0
#2137
opened May 7, 2026 by
xiaofeihan1
Contributor
Fix multimodal CUDA pipeline: embedding output persistence causes shape mismatch
#2123
opened May 5, 2026 by
justinchuby
Contributor
Pipeline-as-Config: Declarative model dispatch replacing model_type string registry
#2115
opened May 2, 2026 by
justinchuby
Contributor
Draft
Fix AppendNextTokensToSequences heap overflow
#2111
opened Apr 30, 2026 by
apsonawane
Contributor