[codex] Add speculative draft Prometheus metrics by nycdubliner · Pull Request #22 · AtomicBot-ai/atomic-llama-cpp-turboquant

nycdubliner · 2026-05-31T22:27:47Z

Summary

- export speculative draft acceptance counters from the existing llama-server /metrics endpoint
- label speculative counters by spec_type using the implementation type already tracked by common_speculative_state
- document metric names, Prometheus/Grafana expressions, and curl verification
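
As a rough illustration of how labeled counters like these could be consumed, the sketch below parses Prometheus text exposition output and computes an acceptance ratio per spec_type. The metric names and the "mtp" label value are placeholders invented for this example, since the PR description does not list the real names; substitute whatever /metrics actually prints.

```python
import re
from collections import defaultdict

# Hypothetical metric names: the PR text does not list the real ones.
GENERATED = "llamacpp:speculative_draft_generated_total"
ACCEPTED = "llamacpp:speculative_draft_accepted_total"

# Matches a sample line of the form: name{label="value",...} number
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        yield name, labels, float(value)


def acceptance_by_spec_type(text):
    """Return {spec_type: accepted / generated} for types with generated > 0."""
    generated = defaultdict(float)
    accepted = defaultdict(float)
    for name, labels, value in parse_samples(text):
        spec_type = labels.get("spec_type", "unknown")
        if name == GENERATED:
            generated[spec_type] += value
        elif name == ACCEPTED:
            accepted[spec_type] += value
    return {t: accepted[t] / g for t, g in generated.items() if g > 0}


# Made-up sample output using the placeholder names above.
SAMPLE = """\
# TYPE llamacpp:speculative_draft_generated_total counter
llamacpp:speculative_draft_generated_total{spec_type="mtp"} 120
# TYPE llamacpp:speculative_draft_accepted_total counter
llamacpp:speculative_draft_accepted_total{spec_type="mtp"} 90
"""

print(acceptance_by_spec_type(SAMPLE))  # {'mtp': 0.75}
```

In a real deployment the same ratio would normally be computed in Prometheus or Grafana from the scraped counters; the parsing here only shows what the spec_type label makes possible.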

Validation

- cmake --build build-hip-rocwmma --target llama-server -j "$(nproc)"
- ran constrained Gemma 4 26B-A4B MTP server on 127.0.0.1:8084
- verified curl -s http://127.0.0.1:8084/metrics | rg "speculative|draft" before and after one chat completion; counters increased from zero to generated/accepted values
- git diff --check
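
The before/after curl check could also be scripted. The sketch below is one possible approach, not part of the PR: it keeps only sample lines matching the same speculative|draft pattern and reports how much each series grew between two snapshots. It assumes sample lines end with the value (no trailing timestamp), and the series names in the usage example are made up.

```python
import re
import urllib.request

# Same filter as the rg pattern used in the manual check.
PATTERN = re.compile(r"speculative|draft")


def fetch_metrics(url="http://127.0.0.1:8084/metrics"):
    """Download the raw metrics text (same URL as the curl check)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def matching_samples(text):
    """Map each matching series (name plus labels) to its numeric value."""
    samples = {}
    for line in text.splitlines():
        if line.startswith("#") or not PATTERN.search(line):
            continue
        # Assumes the value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        try:
            samples[series] = float(value)
        except ValueError:
            continue
    return samples


def counter_deltas(before, after):
    """Return how much each matching series grew between two snapshots."""
    old = matching_samples(before)
    new = matching_samples(after)
    return {series: value - old.get(series, 0.0) for series, value in new.items()}


# Usage with made-up snapshots; against a live server, call fetch_metrics()
# before and after sending one chat completion instead.
BEFORE = 'example_draft_generated{spec_type="x"} 0\nother_metric 5\n'
AFTER = 'example_draft_generated{spec_type="x"} 42\nother_metric 9\n'
print(counter_deltas(BEFORE, AFTER))
```

A non-zero delta for every matching series after one completion would correspond to the "increased from zero" observation above.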

Notes

The exported counters use the same in-memory source counters as the statistics <type> server log line. No log parsing or separate metrics endpoint is added.

Add speculative draft Prometheus metrics

ca97dde

github-actions bot added the examples and server labels on May 31, 2026

nycdubliner wants to merge 1 commit into AtomicBot-ai:feature/turboquant-kv-cache from nycdubliner:codex/speculative-draft-metrics
