improve search_code query parameter description (fixes #2390) #2442
jluocsa wants to merge 1 commit into
The current `query` description for `search_code` gives the model little useful guidance on GitHub code search syntax, leading to repeated 422 ERROR_TYPE_QUERY_PARSING_FATAL responses from agents that guess at plausible-looking but invalid syntax (e.g. regex without slashes, `filename:` instead of `path:`, `OR` between qualifiers without parentheses). Replace the description with one that explicitly enumerates qualifiers, boolean operators, regex form (slashes), and path globs, plus realistic examples covering each. Updates the search_code toolsnap and README accordingly. Fixes github#2390.
SamMorrowDrums
left a comment
Reviewing this from the perspective of being the consumer of this description (an LLM tool-caller): this is a clear, meaningful upgrade.
The current text is hand-wavy ("powerful... and more"), which forces guessing. The new version teaches things that materially change which query I'd construct:
- `symbol:`: I would default to `content:` for finding a function/type definition. Knowing `symbol:` exists changes the tool I reach for.
- Regex with `/.../`: without this, I'd fall back to multiple OR'd queries or multiple tool calls. `/GetAttributes|SetAttributes/` is exactly the right shape.
- Explicit boolean + parens: confirms `(Foo OR Bar) path:src` is supported rather than guessing.
- Glob in `path:`: disambiguates `path:*.ts` vs `path:**/*.ts`.
- Realistic combined examples: the previous examples were toy-shaped; these look like actual queries.
Token cost is ~3× longer, but search_code is a high-value, frequently-misused tool, so the trade is worth it.
Two tiny polish suggestions if you fancy a follow-up (not blocking):
- Case inconsistency: `language:Go` (capital) in the qualifiers list vs `language:go` (lowercase) in the examples. Code search is case-insensitive on language, but the inconsistency reads as a smell.
- `is:archived|fork`: the pipe is ambiguous (literal? alternation?). Splitting to two qualifiers (`is:archived`, `is:fork`) would be clearer.
Approving — thanks for the contribution! 🙏
Reopened in #2513 with you (@jluocsa) and @danmoseley credited as co-authors, plus a tighter pass that fills a few gaps and, importantly, corrects against what this tool's endpoint actually supports.

While verifying the proposed examples against the live API, I found that this tool calls go-github's

The version in #2513 documents only what's actually supported by legacy

Thanks for kicking this off. Your version got us most of the way there. 🙏 Closing in favor of #2513.
The current `search_code` query description is hand-wavy and gives the model little usable guidance on GitHub code search syntax, which (per analysis in #2390 across thousands of agent sessions) leads to repeated 422 ERROR_TYPE_QUERY_PARSING_FATAL responses from agents that guess at plausible-but-invalid syntax.

Re-applies the spirit of #2442 by @jluocsa, originally suggested by @danmoseley in #2390, but corrected against the actual endpoint this tool calls.

Critically, this tool uses go-github's `client.Search.Code`, which hits the legacy REST `/search/code` endpoint, NOT the new code search ("Blackbird"). Verified against the live API:

- `symbol:WithContext repo:github/github-mcp-server` -> 0
- `/Get|Set/ repo:github/github-mcp-server` -> 0
- `path:**/*.go func repo:github/github-mcp-server` -> 0
- `filename:*.md repo:github/github-mcp-server` -> 0
- `(Foo OR Bar) -path:vendor language:go` -> 422

So `symbol:`, `/regex/`, path globs, filename globs, and parenthesized boolean groups (features the proposal in #2442 listed) silently return zero or fail. Documenting them would teach the model syntax that doesn't work on this endpoint.

The new description focuses on what's actually supported by legacy code search and the real bugs observed in #2390:

- `path:dir` is a prefix, NOT a glob (displaces `path:**/*.ts` guesses).
- `filename:exact.ext` is exact, NOT a glob (displaces `filename:*.md`).
- `/regex/` and `\|` inside quotes don't work: call this out so the model stops generating them.
- `symbol:` doesn't work on this endpoint: call this out.
- Parenthesized boolean groups 422: call this out so the model stops wrapping `OR` chains in parens.
- Adds `extension:`, `in:file`, `in:path`, `size:`, `filename:`, `user:` qualifiers that the previous text omitted.
- Implicit AND, `OR`, `NOT`, and `"quoted phrase"` for exact match are documented positively.
- 256-char query limit.

All four examples in the new description are verified against the live GitHub API and return non-zero results.

Co-authored-by: jluocsa <103165870+jluocsa@users.noreply.github.com>
Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Fixes #2390.
The current `query` parameter description for the `search_code` tool gives the model very little useful guidance on GitHub code search syntax, which (per analysis across thousands of agent sessions in #2390) leads to repeated `422 ERROR_TYPE_QUERY_PARSING_FATAL` responses from agents that guess at plausible-looking but invalid syntax, for example:

- `\"foo\\|bar\"` instead of regex `/foo|bar/`
- `filename:*review*.md` instead of `path:*review*.md`
- `path:docs OR path:.github` without parentheses (surprising precedence)

This PR replaces the description with one that explicitly enumerates the qualifiers, boolean operators, regex form (slashes), and path globs, plus realistic examples covering each. The new wording is the one suggested by the issue author.
Before
After
Files changed (3 files / 3 lines)
- `pkg/github/search.go`: the description string
- `pkg/github/__toolsnaps__/search_code.snap`: regenerated via `UPDATE_TOOLSNAPS=true go test ./pkg/github -run Test_SearchCode`
- `README.md`: regenerated via `go run ./cmd/github-mcp-server generate-docs`

Validation
- `Test_SearchCode` (and all subtests) pass
- `go test ./...` passes for all 22 packages

Note: I was unable to run `script/lint` locally (no `golangci-lint` available in my environment) or `-race` (no CGO toolchain). CI should validate both. Happy to iterate if it surfaces anything.