fix: inline local $ref in tool inputSchema for LLM consumption#1563
fix: inline local $ref in tool inputSchema for LLM consumption#1563gogakoreli wants to merge 4 commits intomodelcontextprotocol:mainfrom
Conversation
@modelcontextprotocol/client
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
cc0d216 to
925ab05
Compare
🦋 Changeset detectedLatest commit: b8a39a9 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
I also have a detailed ADR documenting the investigation, alternatives analysis, and design rationale. Happy to include it as |
There was a problem hiding this comment.
Thanks for working on this, appreciate the thorough work here.
-
Please rebase as we've moved to supporting standardSchema with #1689 so you'll need to refactor your PR here a bit.
dereferenceLocalRefs()should hook intostandardSchemaToJsonSchema()inpackages/core/src/util/standardSchema.ts. -
Instead of leaving
$refI think we should error when we detect cycles. There's no way to represent them and the LLMs can't handle them anyway AFAIK. The Go SDK throws "cycle detected for type X" in the same situation for example.
925ab05 to
e7895ce
Compare
|
Thanks for the review @felixweinberger — both items addressed. 1. Rebased and hooked into
2. Cycles throw instead of Fails at Cross-SDK note: Neither the Go nor Python SDK dereferences |
Tool schemas containing $ref cause LLM failures across multiple MCP clients. LLMs cannot resolve JSON Schema $ref pointers — they serialize referenced parameters as strings instead of objects. While $ref was always possible in tool schemas, modelcontextprotocol#1460's switch from zod-to-json-schema to z.toJSONSchema() widened the blast radius: registered types (z.globalRegistry) and recursive types (z.lazy) now produce $ref on common patterns that previously rarely triggered it. Adds dereferenceLocalRefs() which inlines all local $ref pointers, wired into standardSchemaToJsonSchema() so all tool schemas are self-contained and LLM-consumable regardless of schema library. Recursive schemas throw at tools/list time — they cannot be represented without $ref and LLMs cannot handle them. Fixes: modelcontextprotocol#1562
35edf9b to
9b42858
Compare
|
@claude review |
1 similar comment
|
@claude review |
Add 3 tests covering real-world Zod v4 output patterns: - $ref with multiple metadata siblings (.meta() pattern) - $ref with default value sibling (.default() pattern) - $def referencing another $def (nested registered types) These verify the sibling merge path handles all real schema generator output correctly. Exhaustive cross-library testing (Zod v4, ArkType, Valibot) confirmed no generator produces $ref with siblings containing nested $ref.
|
@claude review |
| // Merge sibling keywords onto the resolved schema | ||
| if (hasSiblings && resolved !== null && typeof resolved === 'object' && !Array.isArray(resolved)) { | ||
| return { ...(resolved as Record<string, unknown>), ...siblings }; | ||
| } | ||
| return resolved; |
There was a problem hiding this comment.
🔴 The sibling merge path in dereferenceLocalRefs fails to filter $defs/definitions before spreading them onto the resolved schema, so $defs can survive in the output -- directly undermining the PR goal of removing all $defs from tool schemas. Fix by also destructuring those keys out of siblings: const { $ref: _ref, $defs: _defs, definitions: _definitions, ...siblings } = obj.
Extended reasoning...
Bug: $defs leaks through the sibling merge path
In packages/core/src/util/schema.ts, the resolve() function has two paths for visiting object nodes: a $ref branch and a for-loop branch. The for-loop branch explicitly skips $defs/definitions:
for (const [key, value] of Object.entries(obj)) {
if (key === "$defs" || key === "definitions") continue; // correctly filtered
result[key] = resolve(value, stack);
}
But the $ref branch collects siblings via:
const { $ref: _ref, ...siblings } = obj; // captures EVERY key except $ref
and later spreads them unfiltered:
return { ...(resolved as Record<string, unknown>), ...siblings }; // $defs not filtered\!
If $defs or definitions are present as sibling keys alongside $ref, they survive verbatim in the merged output.
Concrete trigger
standardSchemaToJsonSchema builds the input to dereferenceLocalRefs as:
return dereferenceLocalRefs({ type: "object", ...result });
Consider a Zod type registered in z.globalRegistry and used directly as inputSchema, whose JSON Schema output is:
{ "$ref": "#/$defs/RootType", "$defs": { "RootType": { "type": "object", "properties": { "name": { "type": "string" } } } } }
The spread produces:
{ "type": "object", "$ref": "#/$defs/RootType", "$defs": { "RootType": { ... } } }
When resolve() processes this root node:
- It sees $ref: "#/$defs/RootType"
- Destructures: siblings = { type: "object", $defs: { RootType: {...} } }
- Resolves the ref: resolved = { type: "object", properties: { name: { type: "string" } } }
- hasSiblings is true, so it returns { ...resolved, type: "object", $defs: { RootType: {...} } }
- $defs is present in the final output -- not removed.
Why existing code does not prevent it
The asymmetry is a simple oversight: the for-loop path was written with an explicit guard for $defs/definitions, but the sibling-merge path was not. There is no other mechanism that strips these keys after resolve() returns.
Impact
This directly violates the PR core invariant: that $defs should be absent from the tool inputSchema delivered to LLM clients. A tool whose root inputSchema is a globally-registered Zod type (a legitimate and documented Zod feature) would still expose $defs in the schema, leaving the original problem unfixed for that scenario. The MCP client receives a schema that still contains $defs, undermining the PR goal.
Fix
Simply destructure the offending keys out of siblings before they can be spread:
const { $ref: _ref, $defs: _defs, definitions: _definitions, ...siblings } = obj;
This mirrors the intent of the for-loop guard and is a one-line change.
There was a problem hiding this comment.
Investigated — the trigger scenario (registered type as root producing { $ref, $defs } as siblings) doesn't happen with Zod v4. When a registered type is used as root, Zod inlines it (no $ref). When nested, the $ref is inside properties, not at the root alongside $defs.
That said, the sibling merge path now resolves all sibling values (the defensive fix from the previous comment), which would handle this if it ever occurred.
…nitions`/`$defs` Address review feedback from PR modelcontextprotocol#1563: 1. Defensive: resolve sibling values through resolve() before merging onto the resolved $ref schema. No known generator triggers this (Zod/ArkType/Valibot only produce metadata siblings), but it makes the sibling-merge and object-traversal paths consistent. 2. Bug fix: only strip $defs/definitions keys at the root schema level. Previously the filter fired at every depth, silently dropping any property named 'definitions' or '$defs' from nested objects. Tests added for both fixes.
|
@claude review |
1 similar comment
|
@claude review |
There was a problem hiding this comment.
All previously raised issues have been addressed — LGTM.
Extended reasoning...
Overview
This PR adds dereferenceLocalRefs() (~95 lines) to inline local $ref pointers in tool inputSchema before they reach LLM clients. It hooks into standardSchemaToJsonSchema() in standardSchema.ts (2-line change) and ships 14 tests (9 unit + 5 integration).
Security risks
No security-sensitive code paths (auth, crypto, permissions) are touched. The function operates purely on JSON Schema objects in memory.
Level of scrutiny
Moderate — this is a critical code path (all Zod-based tool schemas pass through it), but the scope is well-bounded. The PR underwent multiple review rounds and the author responded with detailed investigation to each concern.
Prior feedback resolution
All three issues from prior rounds are fixed in the committed code:
- Sibling keywords containing nested
$ref— siblings are now passed throughresolve()viaObject.fromEntries(Object.entries(siblings).map(([k, v]) => [k, resolve(v, stack)])). $defsleaking through sibling merge path — covered by fix #1 (resolved siblings cannot contain raw$defs).- Nested property named
definitions/$defssilently dropped — guarded withobj === schemaso only root-level JSON Schema keyword containers are stripped.
Tests for both the property-name collision cases and the defensive sibling resolution path are present. No new bugs were identified by the bug hunting system.
There was a problem hiding this comment.
Thanks for the rework and pointing out the layer where the Go SDK throws on cycle detection - that's actually a good point.
Doing this on registerTool might be a bigger refactor with the current shape of the SDK, but would probably be the better place to do this 🤔 maybe worth a follow up.
Regarding throwing - on second thought a graceful degradation is probably the safer approach, to avoid completely breaking servers with cycles in their schemas that might still "work" in a degraded way currently. Instead of just replacing with object though I think it would be better we try some kind of "best effort" dereferencing. Something like:
- Inline every
$refthat isn't part of a cycle - For the ones that are, leave the
$refin place and keep only those$defsentries. - That means non-recursive schemas get fully inlined, while recursive ones behave like today (no regression).
- Potential reference approach here: https://github.com/PrefectHQ/fastmcp/blob/main/src/fastmcp/utilities/json_schema.py#L103
Apologies for the back and forth here but would you be open to updating that approach here?
Filed a follow up for this to consider doing this during the |
Inline local
$refin toolinputSchemafor LLM consumptionFixes: #1562
Related: anthropics/claude-code#18260
Problem
Tool
inputSchemacontaining$refpointers causes LLM failures across multiple MCP clients. LLMs cannot resolve JSON Schema$ref— they treat referenced parameters as untyped and serialize objects as string literals:$refin tool schemas has always been possible (non-Zod servers, and the oldzod-to-json-schemawith its default$refStrategy: "root"for identity-based deduplication). However, #1460's switch toz.toJSONSchema()significantly widened the blast radius — registered types (z.globalRegistry) now produce$refeven on first and only use, and all recursive types (z.lazy) produce$ref. The old library only triggered on the second encounter of the same JS object reference.Confirmed across Claude Code (#18260) and Kiro CLI (independently).
Solution
Add
dereferenceLocalRefs()toschemaToJson()— inlines all local$refpointers (#/$defs/...,#/definitions/...) so tool schemas are self-contained.~95 lines of implementation, zero external dependencies.
Behavior:
$ref→ inlined,$defs/definitionsstripped$ref(URLs) → left as-is{ type: "object" }$ref→ preserved per JSON Schema 2020-12dereferenceLocalRefsis@internal— not intended as public APIAlternatives considered
zod-to-json-schemadereference-json-schemanpm packagereused: "inline"option$refregardless (verified)overridecallback$defsaren't fully built at override time; would need two-pass generationz.globalRegistry"globalRegistryis a legitimate Zod featureLimitations
schemaToJson(). Non-Zod servers sending raw JSON Schema with$refare not affected — their schemas don't hit this code path.{ type: "object" }vs the full type).Test plan
pnpm lint:allpassespnpm test:allpasses (all existing tests + 14 new)pnpm build:allpasses9 unit tests (
packages/core/test/schema.test.ts) — testdereferenceLocalRefsdirectly with crafted JSON Schema:$ref), registered types, recursive types, diamond references, non-existent$def(left as-is), external$ref(left as-is), sibling keyword preservation,$ref: "#"root self-reference, registry cleanup5 integration tests (
test/integration/test/server/mcp.test.ts) — full server→client pipeline:discriminatedUnion+ registry, mixed$ref+ inline params (Notion repro),$reftooneOfunion, recursive types$ref) AND runtimecallToolround-tripAll integration tests clean up
z.globalRegistryviaafterEach.Files changed
packages/core/src/util/schema.tsdereferenceLocalRefs(), modifyschemaToJson()to call itpackages/core/test/schema.test.tstest/integration/test/server/mcp.test.tsdescribeblock.changeset/inline-ref-in-tool-schema.md