feat: MLA absorption for DeepSeek V2/V3 — fuse low-rank Q/K/V into standard dense tensors #96
Annotations
1 warning
Check vs baseline (PRs + manual)
no criterion baseline cached; skipping regression check