DOC-6740 Claude Code post-tool hook to check shortcodes #3477
Merged: andy-stark-redis merged 4 commits into main from DOC-6740-investigate-claude-post-tool-hook-for-shortcode-checks on Jun 10, 2026 (+328 −0)
4 commits
a0cbb0c DOC-6740 /bugbot command (andy-stark-redis)
70b3f8a DOC-6740 add shared settings file to enable bugbot command (andy-stark-redis)
0698bd8 DOC-6740 added post-edit hook to catch bad shortcode file references (andy-stark-redis)
7788ea9 DOC-6740 Bugbot fixes (andy-stark-redis)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| --- | ||
| description: Triage and fix valid Cursor Bugbot comments on the current branch's PR | ||
| argument-hint: (no args — uses current branch's PR) | ||
| --- | ||
|
|
||
| You're going to review the comments left by **Cursor Bugbot** on the pull | ||
| request for the current branch, decide which raise genuine problems, and fix | ||
| the ones that do. | ||
|
|
||
| 1. **Find the branch and its PR.** Run `git branch --show-current`. Then find | ||
| the open PR for that branch: | ||
|
|
||
| ``` | ||
| gh pr view --json number,url,title,headRefName | ||
| ``` | ||
|
|
||
| If there's no PR for this branch, stop and tell me — Bugbot only comments on | ||
| PRs, so there's nothing to triage. | ||
|
|
||
| 2. **Collect Bugbot's comments.** Bugbot posts as the `cursor[bot]` app, and its | ||
| feedback shows up in two places, so check both: | ||
|
|
||
| - **Inline review comments** (tied to specific lines): | ||
| ``` | ||
| gh api repos/{owner}/{repo}/pulls/<number>/comments --paginate | ||
| ``` | ||
| - **Top-level PR comments** (the summary): | ||
| ``` | ||
| gh api repos/{owner}/{repo}/issues/<number>/comments --paginate | ||
| ``` | ||
|
|
||
| Filter both for comments whose author `login` is `cursor` / `cursor[bot]` | ||
| (match case-insensitively, and also treat any comment whose body mentions | ||
| "Bugbot" as a candidate). For each one, capture the body, and for inline | ||
| comments the `path` and line so you know exactly what code it's flagging. | ||
|
|
||
| If there are no Bugbot comments, stop and tell me — nothing to do. | ||
|
|
||
| 3. **Triage each issue.** For every distinct issue Bugbot raises, read the | ||
| actual code it points at (don't trust the comment blindly — open the file) | ||
| and decide for yourself whether it's valid. Bugbot has false positives, so | ||
| be critical. Classify each as one of: | ||
|
|
||
| - **Valid** — a real bug, broken link/shortcode, factual error, or anything | ||
| that would genuinely hurt the docs or a reader. | ||
| - **Invalid / won't-fix** — a false positive, a deliberate choice, or out of | ||
| scope for this branch. | ||
|
|
||
| This is a docs repo, so weigh issues the way they matter here: broken | ||
| `relref`/`image`/`embed-md` paths, wrong commands or code samples, incorrect | ||
| technical claims, and US-vs-UK spelling all count as real; stylistic nits | ||
| that don't affect correctness usually don't. | ||
|
|
||
| 4. **Show me the triage before changing anything.** Present a short table: | ||
| each issue, the file/line, your verdict (valid / invalid), and a one-line | ||
| reason. Don't edit yet. | ||
|
|
||
| 5. **Fix the valid ones.** Apply the fixes for everything you marked valid, | ||
| keeping each change minimal and matching the surrounding style. For issues | ||
| you marked invalid, leave the code alone — just note why in your summary so | ||
| I can sanity-check your reasoning. | ||
|
|
||
| 6. **Wrap up.** Summarise what you changed (with file paths) and what you | ||
| deliberately skipped and why. Don't commit, push, or reply to the PR | ||
| comments unless I ask you to. |
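
Step 2 of the command above asks for both comment feeds to be filtered for Bugbot. The following is a minimal Python sketch of that filter, not part of this PR. It assumes the `gh api` output has already been parsed into a list of dicts, and that the `user.login`, `body`, `path`, and `line` fields follow GitHub's REST comment objects. The helper names `is_bugbot_comment` and `collect_bugbot` are hypothetical.

```python
def is_bugbot_comment(comment):
    """Return True if a comment dict looks like it came from Cursor Bugbot."""
    # Assumed shape: {"user": {"login": ...}, "body": ...}
    login = ((comment.get("user") or {}).get("login") or "").lower()
    body = (comment.get("body") or "").lower()
    # Match the author case-insensitively, or fall back to a body mention.
    return login in ("cursor", "cursor[bot]") or "bugbot" in body


def collect_bugbot(comments):
    """Keep only Bugbot candidates, reduced to the fields the triage needs."""
    picked = []
    for c in comments:
        if is_bugbot_comment(c):
            picked.append({
                "body": c.get("body", ""),
                "path": c.get("path"),  # absent on top-level PR comments
                "line": c.get("line"),
            })
    return picked


if __name__ == "__main__":
    # Illustrative data only; these are not real comments from this PR.
    sample = [
        {"user": {"login": "cursor[bot]"}, "body": "Possible bug", "path": "a.md", "line": 3},
        {"user": {"login": "alice"}, "body": "LGTM"},
    ]
    for item in collect_bugbot(sample):
        print(item)
```

Matching on the body as well as the author is deliberately loose, as the command text requests, so a human reply that mentions "Bugbot" is also picked up as a candidate and needs to be discarded during triage.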
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,234 @@ | ||
| #!/usr/bin/env python3 | ||
| """PostToolUse hook: validate that Hugo shortcode file references point at real | ||
| project files, so hallucinated paths are caught at edit time rather than during | ||
| a doc build. | ||
|
|
||
| Two modes: | ||
| * Hook mode (default): reads the PostToolUse JSON blob on stdin, pulls | ||
| tool_input.file_path, validates that one markdown file. On a hard failure it | ||
| prints to stderr and exits 2, which feeds the message back to Claude. | ||
| * Scan mode (--scan [paths...] | --scan --all): validates the given files (or | ||
| every content/**/*.md when --all) and prints a report. Used for dry-runs. | ||
|
|
||
| Severity: | ||
| * hard — image/image-card(image)/embed-code/embed-yaml/embed-md. These point | ||
| at concrete files (static/ or content/embeds/); a miss is a real bug. | ||
| * soft — relref/image-card(url). Resolution is fuzzier (Hugo's GetPage does | ||
| section/anchor/relative magic), so misses are warnings only. | ||
| """ | ||
|
|
||
| import sys | ||
| import os | ||
| import re | ||
| import json | ||
| import glob | ||
|
|
||
| # (name, regex capturing the path in group 1, resolver key) | ||
| HARD_RULES = [ | ||
| ("image", re.compile(r'\{\{[<%]\s*image\s+[^>]*?\bfilename\s*=\s*["\']([^"\']+)["\']'), "static"), | ||
| ("image-card", re.compile(r'\{\{[<%]\s*image-card\s+[^>]*?\bimage\s*=\s*["\']([^"\']+)["\']'), "static"), | ||
| ("embed-code", re.compile(r'\{\{[<%]\s*embed-code\s+["\']([^"\']+)["\']'), "static-code"), | ||
| ("embed-yaml", re.compile(r'\{\{[<%]\s*embed-yaml\s+["\']([^"\']+)["\']'), "embeds"), | ||
| ("embed-md", re.compile(r'\{\{[<%]\s*embed-md\s+["\']([^"\']+)["\']'), "embeds"), | ||
| ] | ||
| # relref/image-card url resolution can't be replicated faithfully without Hugo | ||
| # (global lookup, aliases, generated /commands pages), and Hugo itself only WARNs | ||
| # on broken relrefs (config.toml: refLinksErrorLevel = "WARNING"). So these are | ||
| # OFF by default to keep the hook false-positive-free. Flip CHECK_RELREF to True | ||
| # (or set CHECK_SHORTCODE_RELREF=1 in the env) to surface them as non-blocking notes. | ||
| CHECK_RELREF = os.environ.get("CHECK_SHORTCODE_RELREF") == "1" | ||
| SOFT_RULES = [ | ||
| ("relref", re.compile(r'\{\{[<%]\s*relref\s+["\']([^"\']+)["\']'), "relref"), | ||
| ("image-card-url", re.compile(r'\{\{[<%]\s*image-card\s+[^>]*?\burl\s*=\s*["\']([^"\']+)["\']'), "relref"), | ||
| ] | ||
|
|
||
|
|
||
| def find_root(start): | ||
| """Walk up from a file/dir until we find the Hugo root (has content + layouts).""" | ||
| d = os.path.abspath(start) | ||
| if os.path.isfile(d): | ||
| d = os.path.dirname(d) | ||
| while True: | ||
| if os.path.isdir(os.path.join(d, "content")) and os.path.isdir(os.path.join(d, "layouts")): | ||
| return d | ||
| parent = os.path.dirname(d) | ||
| if parent == d: | ||
| return None | ||
| d = parent | ||
|
|
||
|
|
||
| def exists_any(*paths): | ||
| return any(os.path.exists(p) for p in paths) | ||
|
|
||
|
|
||
| def _strip_frag(ref): | ||
| return re.split(r"[#?]", ref, maxsplit=1)[0] | ||
|
|
||
|
|
||
| def _strip_dots(rel): | ||
| parts = rel.split("/") | ||
| while parts and parts[0] in (".", ".."): | ||
| parts.pop(0) | ||
| return "/".join(parts) | ||
|
|
||
|
|
||
| def resolve_static(root, ref, src_file): | ||
| ref = _strip_frag(ref) | ||
| rel = ref.lstrip("/") | ||
| stripped = _strip_dots(rel) | ||
| cur = os.path.dirname(os.path.abspath(src_file)) | ||
| return exists_any( | ||
| os.path.join(root, "static", stripped), | ||
| os.path.join(root, "assets", stripped), | ||
| os.path.join(root, "content", stripped), # page-bundle resource | ||
| os.path.join(root, "static", rel), | ||
| os.path.join(root, "content", rel), | ||
| os.path.normpath(os.path.join(cur, ref)), # page-relative | ||
| ) | ||
|
|
||
|
|
||
| def resolve_static_code(root, ref, _src): | ||
| rel = _strip_dots(_strip_frag(ref).lstrip("/")) | ||
| return exists_any(os.path.join(root, "static", "code", rel), | ||
| os.path.join(root, "static", "code", os.path.basename(rel))) | ||
|
|
||
|
|
||
| def resolve_embeds(root, ref, _src): | ||
| rel = _strip_frag(ref).lstrip("/") | ||
| base = os.path.basename(rel) | ||
| cands = [] | ||
| for c in (os.path.join(root, "content", "embeds", base), | ||
| os.path.join(root, "content", "embeds", rel), | ||
| os.path.join(root, "content", rel)): | ||
| cands.append(c) | ||
| if not c.endswith(".md"): | ||
| cands.append(c + ".md") # GetPage resolves extensionless refs | ||
| return exists_any(*cands) | ||
|
|
||
|
|
||
| def resolve_relref(root, ref, src_file): | ||
| # strip anchor / query, then trailing slash | ||
| path = _strip_frag(ref).rstrip("/") | ||
| if not path: | ||
| return True # pure anchor ref to current page | ||
| if path.startswith("/"): | ||
| bases = [os.path.join(root, "content", path.lstrip("/"))] | ||
| else: | ||
| cur = os.path.dirname(os.path.abspath(src_file)) | ||
| bases = [os.path.join(cur, path), os.path.join(root, "content", path)] | ||
| candidates = [] | ||
| for b in bases: | ||
| if b.endswith(".md"): | ||
| candidates.append(b) # ref already carried the .md extension | ||
| else: | ||
| candidates += [b + ".md", os.path.join(b, "_index.md"), os.path.join(b, "index.md")] | ||
| if os.path.isdir(b): | ||
| candidates.append(b) # section dir served without an _index.md | ||
| if exists_any(*candidates): | ||
| return True | ||
| # case-insensitive fallback (Hugo is lenient about case) | ||
| lc = {c.lower() for c in candidates} | ||
| for b in bases: | ||
| parent = os.path.dirname(b) | ||
| if os.path.isdir(parent): | ||
| for entry in os.listdir(parent): | ||
| if os.path.join(parent, entry).lower() in lc: | ||
| return True | ||
| return False | ||
|
|
||
|
|
||
| RESOLVERS = { | ||
| "static": resolve_static, | ||
| "static-code": resolve_static_code, | ||
| "embeds": resolve_embeds, | ||
| "relref": resolve_relref, | ||
| } | ||
|
|
||
|
|
||
| def check_file(path, root): | ||
| """Return (hard_misses, soft_misses) as lists of (shortcode, ref).""" | ||
| try: | ||
| with open(path, encoding="utf-8") as f: | ||
| text = f.read() | ||
| except (OSError, UnicodeDecodeError): | ||
| return [], [] | ||
| hard, soft = [], [] | ||
| for name, rx, key in HARD_RULES: | ||
| for m in rx.finditer(text): | ||
| ref = m.group(1) | ||
| if not RESOLVERS[key](root, ref, path): | ||
| hard.append((name, ref)) | ||
| if CHECK_RELREF: | ||
| for name, rx, key in SOFT_RULES: | ||
| for m in rx.finditer(text): | ||
| ref = m.group(1) | ||
| if ref.startswith(("http://", "https://", "//", "mailto:")): | ||
| continue | ||
| if not RESOLVERS[key](root, ref, path): | ||
| soft.append((name, ref)) | ||
| return hard, soft | ||
|
|
||
|
|
||
| def run_scan(argv): | ||
| if "--all" in argv: | ||
| root = find_root(os.getcwd()) or os.getcwd() | ||
| files = glob.glob(os.path.join(root, "content", "**", "*.md"), recursive=True) | ||
| else: | ||
| files = [a for a in argv if a != "--scan"] | ||
| root = (find_root(files[0]) if files else find_root(os.getcwd())) or os.getcwd() | ||
| total_hard = total_soft = files_with_hard = 0 | ||
| sample_hard, sample_soft = [], [] | ||
| by_type = {} | ||
| for fp in files: | ||
| hard, soft = check_file(fp, root) | ||
|
|
||
| if hard: | ||
| files_with_hard += 1 | ||
| total_hard += len(hard) | ||
| for sc, ref in hard: | ||
| by_type[sc] = by_type.get(sc, 0) + 1 | ||
| if len(sample_hard) < 40: | ||
| sample_hard.append(f"{os.path.relpath(fp, root)}: {sc} -> {ref}") | ||
| if soft: | ||
| total_soft += len(soft) | ||
| for sc, ref in soft: | ||
| if len(sample_soft) < 40: | ||
| sample_soft.append(f"{os.path.relpath(fp, root)}: {sc} -> {ref}") | ||
| print(f"Scanned {len(files)} files under {root}") | ||
| print(f"HARD misses: {total_hard} across {files_with_hard} files {dict(sorted(by_type.items()))}") | ||
| print(f"SOFT misses: {total_soft}") | ||
| if sample_hard: | ||
| print("\n--- sample HARD misses (these would block an edit) ---") | ||
| print("\n".join(sample_hard)) | ||
| if sample_soft: | ||
| print("\n--- sample SOFT misses (warn only) ---") | ||
| print("\n".join(sample_soft)) | ||
| return 0 | ||
|
|
||
|
|
||
| def run_hook(): | ||
| try: | ||
| data = json.load(sys.stdin) | ||
| except (json.JSONDecodeError, ValueError): | ||
| return 0 | ||
| fp = (data.get("tool_input") or {}).get("file_path", "") | ||
| if not fp or not fp.endswith(".md"): | ||
| return 0 | ||
| root = find_root(fp) | ||
| if not root or "/content/" not in os.path.abspath(fp).replace(os.sep, "/") + "/": | ||
| return 0 | ||
| hard, soft = check_file(fp, root) | ||
| if not hard and not soft: | ||
| return 0 | ||
| lines = [f"Shortcode reference check for {os.path.relpath(fp, root)}:"] | ||
| for sc, ref in hard: | ||
| lines.append(f" [broken] {sc} points at a file that does not exist: {ref}") | ||
| for sc, ref in soft: | ||
| lines.append(f" [warn] {sc} could not be resolved (verify it exists): {ref}") | ||
| sys.stderr.write("\n".join(lines) + "\n") | ||
| return 2 if hard else 0 | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| if "--scan" in sys.argv: | ||
| sys.exit(run_scan(sys.argv[1:])) | ||
| sys.exit(run_hook()) | ||
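
To make the `HARD_RULES` patterns in the hook above easier to follow, here is a small standalone sketch. It copies the `image` and `embed-md` regexes from the hook and runs them over a made-up snippet. The `extract_refs` helper and the sample paths are hypothetical and serve only as illustration.

```python
import re

# Copied from the hook's HARD_RULES; group 1 captures the referenced path.
IMAGE_RX = re.compile(r'\{\{[<%]\s*image\s+[^>]*?\bfilename\s*=\s*["\']([^"\']+)["\']')
EMBED_MD_RX = re.compile(r'\{\{[<%]\s*embed-md\s+["\']([^"\']+)["\']')


def extract_refs(text):
    """Return (shortcode, ref) pairs in rule order, as check_file iterates them."""
    refs = []
    for name, rx in (("image", IMAGE_RX), ("embed-md", EMBED_MD_RX)):
        refs.extend((name, m.group(1)) for m in rx.finditer(text))
    return refs


# Hypothetical markdown content; none of these paths come from the repo.
SAMPLE = """
{{< image filename="/images/example.png" alt="Example" >}}
{{< embed-md "example-embed.md" >}}
{{< image-card image="images/card.png" title="Card" >}}
"""

if __name__ == "__main__":
    for shortcode, ref in extract_refs(SAMPLE):
        print(f"{shortcode} -> {ref}")
```

The `image` pattern requires whitespace straight after the shortcode name, so it does not fire on `image-card`. The hook handles that shortcode with its own rule.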
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| { | ||
| "permissions": { | ||
| "allow": [ | ||
| "Bash(gh pr view *)", | ||
| "Bash(gh api repos/*/pulls/*/comments*)", | ||
| "Bash(gh api repos/*/issues/*/comments*)" | ||
|
|
||
| ], | ||
| "additionalDirectories": [ | ||
| "/tmp" | ||
| ] | ||
| }, | ||
| "sandbox": { | ||
| "excludedCommands": ["gh"] | ||
| }, | ||
| "hooks": { | ||
| "PostToolUse": [ | ||
| { | ||
| "matcher": "Edit|Write|MultiEdit", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "python3 \"$CLAUDE_PROJECT_DIR/.claude/hooks/check_shortcode_paths.py\"", | ||
| "statusMessage": "Checking Hugo shortcode paths" | ||
| } | ||
| ] | ||
| } | ||
| ] | ||
| } | ||
| } | ||
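
The `PostToolUse` entry above runs the hook script after each `Edit`, `Write`, or `MultiEdit`. As the script's docstring describes, it reads a JSON blob on stdin and uses exit code 2 to feed a message back to Claude. The sketch below mirrors only that decision flow so the contract can be read in isolation. `hook_exit_code` and the `broken_refs_for` callback are hypothetical stand-ins for `run_hook` and `check_file`. The `tool_name` field in the example payload is an assumption, as is the `isinstance` guard, which the original does not have.

```python
import json


def hook_exit_code(stdin_text, broken_refs_for):
    """Return 0 to let the edit pass silently, or 2 to block and report."""
    try:
        data = json.loads(stdin_text)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return 0        # unreadable payload: never block the edit
    if not isinstance(data, dict):  # extra guard, not in the original hook
        return 0
    fp = (data.get("tool_input") or {}).get("file_path", "")
    if not fp or not fp.endswith(".md"):
        return 0        # only markdown files are checked
    # broken_refs_for is a hypothetical callback returning the hard misses.
    return 2 if broken_refs_for(fp) else 0


if __name__ == "__main__":
    # Illustrative payload; only tool_input.file_path is taken from the source.
    payload = json.dumps({
        "tool_name": "Edit",
        "tool_input": {"file_path": "content/example.md"},
    })
    print(hook_exit_code(payload, lambda fp: ["missing.png"]))
    print(hook_exit_code(payload, lambda fp: []))
```

Returning 0 on any unreadable input keeps the hook fail-open, in line with the original's choice to exit quietly when the payload is not valid JSON.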