Precompute scale factor once and reuse inside kernel. #19

Open
shalinib-ibm wants to merge 1 commit into master from q4_gemm_precompute_scale

Conversation

@shalinib-ibm
Owner

No significant performance difference: throughput goes from 4.3 t/s (base) to 4.1 t/s with this change (llama-bench, Q4 model, -p 128 -n 1 -t 1).
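
For readers unfamiliar with the idea named in the title, here is a minimal sketch of precomputing a scale factor once and reusing it inside the inner loop. It is not the code from this PR: the `Block` struct, `kBlockSize`, and both function names are hypothetical, and the layout is a simplified stand-in rather than ggml's actual Q4 block format.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical simplified block: one float scale shared by 32 int8 values.
// This is NOT ggml's real Q4 layout; it only illustrates the idea.
constexpr int kBlockSize = 32;

struct Block {
    float d;                             // per-block scale factor
    std::array<int8_t, kBlockSize> qs;   // quantized values
};

// Baseline: the scale product is recomputed and applied for every element.
float dot_scale_per_element(const std::vector<Block>& a, const std::vector<Block>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (int j = 0; j < kBlockSize; ++j) {
            sum += (a[i].d * b[i].d) * static_cast<float>(a[i].qs[j] * b[i].qs[j]);
        }
    }
    return sum;
}

// Variant: compute the scale product once per block pair, accumulate the
// products in integers, then apply the precomputed scale a single time.
float dot_scale_precomputed(const std::vector<Block>& a, const std::vector<Block>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float scale = a[i].d * b[i].d;   // computed once, reused below
        int32_t isum = 0;
        for (int j = 0; j < kBlockSize; ++j) {
            isum += static_cast<int32_t>(a[i].qs[j]) * static_cast<int32_t>(b[i].qs[j]);
        }
        sum += scale * static_cast<float>(isum);
    }
    return sum;
}
```

Whether hoisting the scale helps in practice depends on the compiler and the target; an optimizing compiler may already hoist the loop-invariant product, which would be consistent with a benchmark that shows little change.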

Signed-off-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>
@github-actions github-actions Bot added the ggml label Oct 8, 2025