fix(kv-cache): boost eviction score for cold/evict/shutdown checkpoints#177
Open
unsaltedbutter-ai wants to merge 1 commit into
Open
Conversation
The eviction score (hits+1) * tokens / size treats explicit checkpoints (cold, evict, shutdown) the same as continued snapshots. When a freshly written cold or evict snapshot competes against fresh hits=0 continued entries from a concurrently prefilling workload, a tiny tokens-per-byte difference is enough to make the explicit checkpoint the lowest-scoring victim, even though it represents an intentional save that the next same-prompt request might need. The protected_sha DBL_MAX gives one round of safety; after that the checkpoint is on its own. In production this kills a saved evict snapshot within minutes during a single competing prefill. A 1 GiB live-state save can be unlinked before the request that needs it ever arrives, forcing a full re-prefill from zero of a context that was sitting on disk verbatim. Bump the base term from 1.0 to 4.0 for DS4_KVSTORE_REASON_COLD, DS4_KVSTORE_REASON_EVICT, and DS4_KVSTORE_REASON_SHUTDOWN. Continued and unknown stay at 1.0. Effect: a fresh explicit checkpoint is scored as if it had three accumulated hits already, so a continued entry needs to actually serve three hits before it outscores a fresh same-shape checkpoint. The existing hit decay still applies on top, so a long-unused checkpoint slides back toward parity as it ages out. The boost is intrinsic to the file (driven by the reason byte in the header), so it persists across conversations, restarts, and any number of eviction rounds; it is not a one-shot protection like protected_sha. Adds two unit tests demonstrating cold and evict each surviving a denser fresh continued under a one-entry-fits budget.
cf4ca7b to
f6bd3ee
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #176.
kv_entry_eviction_scoreuses(hits+1) * tokens / sizeregardless ofreason, so a fresh cold or evict checkpoint loses the eviction race to a competing workload's fresh continued snapshots on density alone. Theprotected_shaDBL_MAX only protects the just-stored entry for one round; after that, explicit checkpoints have nothing distinguishing them from routine continued traffic. In production a 1.3 GiB live-state save died within ~5 min while a single competing prefill ran (issue #176 has the log).Change
Bump the base from
1.0to4.0forKV_REASON_COLD | EVICT | SHUTDOWN. Continued and unknown stay at1.0. The boost is intrinsic to the file (reasonbyte in the header) so it persists across conversations and restarts. Existing hit-decay logic still applies.Test plan
./ds4_test --serverpasses (existing eviction tests untouched)test_kv_cache_eviction_boosts_cold_over_continued: a fresh cold survives a denser fresh continued under a budget that fits exactly onetest_kv_cache_eviction_boosts_evict_over_continued: same for evict