fix(developer-hub): address agent-friendly docs check failures #3704
aditya520 wants to merge 2 commits into
- Rewrite llms.txt to use the llmstxt.org link-list format so afdocs can parse links.
- Add a visually-hidden /llms.txt directive (both `<link rel="alternate">` and an a11y-hidden anchor) to every page rendered by the root layout.
- Prepend a blockquote pointing at /llms.txt to every markdown response so the directive is also present in the markdown variant.
- Add a Next.js proxy that honors `Accept: text/markdown` by rewriting doc paths to the existing /mdx/[...slug] route, with `Vary: Accept` so CDNs don't mix variants.
- Drop /llms.txt cache lifetime from 1d to 1h to clear the cache-header-hygiene warning.
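For orientation, the link-list format and the markdown directive described in this commit could look roughly like the sketch below. Every name here (`DocLink`, `DocSection`, `renderLlmsTxt`, `withLlmsDirective`) and the directive wording are hypothetical; the PR's actual generator and `get-llm-text.ts` are not shown in this conversation.

```typescript
// Hypothetical types and helpers for illustration only.
interface DocLink {
  name: string;
  url: string;
  description?: string;
}

interface DocSection {
  heading: string;
  links: DocLink[];
}

// Render an llms.txt body: an H1 title, a blockquote summary, then one H2
// per section containing `- [name](url): description` entries.
function renderLlmsTxt(title: string, summary: string, sections: DocSection[]): string {
  const lines: string[] = [`# ${title}`, "", `> ${summary}`, ""];
  for (const section of sections) {
    lines.push(`## ${section.heading}`, "");
    for (const link of section.links) {
      const suffix = link.description ? `: ${link.description}` : "";
      lines.push(`- [${link.name}](${link.url})${suffix}`);
    }
    lines.push("");
  }
  return lines.join("\n");
}

// Prepend a blockquote pointing at /llms.txt to a markdown response.
// The exact wording of the directive is an assumption.
function withLlmsDirective(markdown: string): string {
  return `> Documentation index: /llms.txt\n\n${markdown}`;
}
```

Keeping the renderer a pure function of its inputs makes it easy to check the output against a parser such as afdocs without running the site.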
🚩 No test coverage for new middleware content negotiation
The new middleware (apps/developer-hub/src/middleware.ts) introduces non-trivial content negotiation logic including Accept header parsing, quality factor comparison, path skip lists, and URL rewriting. Per REVIEW.md, new functionality should have tests. The prefersMarkdown function in particular has enough edge cases (multiple entries, quality factors, missing entries, malformed input) that unit tests would be valuable. The AGENTS.md file notes that linting and type-checking are the primary gate, but the middleware logic is complex enough that tests would help catch regressions.
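To make the edge cases concrete, here is a minimal sketch of what a testable `prefersMarkdown` could look like. This is a hypothetical re-implementation, not the PR's code: the tie-breaking rule (markdown wins when its quality is at least that of `text/html`), the handling of malformed q-values, and the decision to ignore wildcards are all assumptions.

```typescript
// Hypothetical re-implementation for illustration; the PR's actual
// prefersMarkdown is not shown in this conversation.
type AcceptEntry = { type: string; q: number };

function parseAccept(header: string): AcceptEntry[] {
  return header
    .split(",")
    .map((part) => {
      const [rawType = "", ...params] = part.split(";");
      let q = 1; // a media range without an explicit q defaults to 1
      for (const param of params) {
        const [key = "", value = ""] = param.split("=");
        if (key.trim().toLowerCase() === "q") {
          const parsed = Number.parseFloat(value.trim());
          // Assumption: malformed or out-of-range q-values count as 0.
          q = Number.isFinite(parsed) && parsed >= 0 && parsed <= 1 ? parsed : 0;
        }
      }
      return { type: rawType.trim().toLowerCase(), q };
    })
    .filter((entry) => entry.type !== "");
}

// Highest q among entries that name the type exactly; 0 if it is absent.
function qualityOf(entries: AcceptEntry[], type: string): number {
  return Math.max(0, ...entries.filter((e) => e.type === type).map((e) => e.q));
}

// Assumption: only an explicit text/markdown entry counts, so browsers
// sending */* keep getting HTML; ties with text/html go to markdown.
function prefersMarkdown(accept: string | null): boolean {
  if (!accept) return false;
  const entries = parseAccept(accept);
  const markdown = qualityOf(entries, "text/markdown");
  const html = qualityOf(entries, "text/html");
  return markdown > 0 && markdown >= html;
}
```

A unit test file would then pin down cases such as a missing header, a browser-style header with `*/*;q=0.8`, competing quality factors, and malformed `q` parameters.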
Rename src/proxy.ts to src/middleware.ts and rename the exported `proxy` function to `middleware` so Next.js App Router picks it up. Without this, the Accept-header content negotiation referenced in Root/index.tsx was a no-op.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
export function middleware(request: NextRequest) {
  if (request.method !== "GET" && request.method !== "HEAD") {
    return NextResponse.next();
  }
  if (!prefersMarkdown(request.headers.get("accept"))) {
    return NextResponse.next();
  }

  const { pathname } = request.nextUrl;
  if (SKIP_EXACT.has(pathname)) return NextResponse.next();
  if (SKIP_PREFIXES.some((p) => pathname === p || pathname.startsWith(`${p}/`))) {
    return NextResponse.next();
  }
  // Already a .md/.mdx URL; let the existing rewrite handle it.
  if (/\.[a-z0-9]+$/i.test(pathname)) return NextResponse.next();

  const url = request.nextUrl.clone();
  url.pathname = `/mdx${pathname}`;
  const response = NextResponse.rewrite(url);
  response.headers.set("Vary", "Accept");
  return response;
🟡 Missing Vary: Accept on non-rewritten responses breaks HTTP caching for content-negotiated paths
The middleware sets Vary: Accept only on responses that are rewritten to /mdx/... (src/middleware.ts:61), but does NOT set it on the pass-through NextResponse.next() responses for the same content-negotiable paths. Per HTTP spec (RFC 7231 §7.1.4), when a URL can return different representations based on request headers, ALL responses—including the "default" HTML one—must include Vary: Accept so that intermediate caches (CDN, forward proxy, browser cache) know to key on the Accept header. Without it, a cached HTML response could be served to a client requesting text/markdown, or vice versa.
On Vercel the impact is mitigated because middleware runs before CDN cache lookup, and rewrites use a different internal URL. But on any other deployment platform or when an external CDN sits in front, this is a correctness bug.
Prompt for agents
In middleware.ts, the Vary: Accept header is only set on the NextResponse.rewrite() path (line 61), but NOT on the NextResponse.next() returns for paths that are subject to content negotiation (i.e., paths that pass all skip checks but the client doesn't prefer markdown).
The fix: for the code path after all SKIP_EXACT / SKIP_PREFIXES / file-extension checks (i.e., after line 56, when the path IS content-negotiable), if prefersMarkdown returns false, you should still return a NextResponse.next() with Vary: Accept set. Currently, the prefersMarkdown check and the skip checks are interleaved — the prefersMarkdown check happens first (line 46) and returns NextResponse.next() before we even know if the path is content-negotiable.
Suggested restructuring: move the prefersMarkdown check AFTER the skip checks, so you can distinguish between 'skipped path' (no Vary needed) and 'content-negotiable path where client wants HTML' (Vary: Accept needed). For the latter, do:
const response = NextResponse.next();
response.headers.set('Vary', 'Accept');
return response;
Alternatively, add Vary: Accept to the headers config in next.config.js for the doc page paths, but per-path middleware logic is cleaner.
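The suggested ordering can be sketched as a framework-free decision function, which also makes it unit-testable. The skip lists below are assumptions based on the PR description rather than the real `SKIP_EXACT` / `SKIP_PREFIXES` constants, and `Decision`, `isNegotiable`, and `decide` are hypothetical names. In the real middleware, a `pass` result with `vary: true` would correspond to the `NextResponse.next()` snippet above with `Vary: Accept` set.

```typescript
// Assumed skip lists; the real constants are not shown in this conversation.
const SKIP_EXACT = new Set<string>(["/"]);
const SKIP_PREFIXES = ["/api", "/playground"];

type Decision =
  | { kind: "pass"; vary: boolean }
  | { kind: "rewrite"; target: string; vary: true };

// A path is content-negotiable only if it survives every skip check.
function isNegotiable(method: string, pathname: string): boolean {
  if (method !== "GET" && method !== "HEAD") return false;
  if (SKIP_EXACT.has(pathname)) return false;
  if (SKIP_PREFIXES.some((p) => pathname === p || pathname.startsWith(`${p}/`))) {
    return false;
  }
  // Paths ending in a file extension are left to the existing routes.
  if (/\.[a-z0-9]+$/i.test(pathname)) return false;
  return true;
}

// Skip checks run first, so "skipped path" (no Vary) is distinguishable from
// "negotiable path where the client wants HTML" (Vary: Accept still needed).
function decide(method: string, pathname: string, wantsMarkdown: boolean): Decision {
  if (!isNegotiable(method, pathname)) return { kind: "pass", vary: false };
  if (!wantsMarkdown) return { kind: "pass", vary: true };
  return { kind: "rewrite", target: `/mdx${pathname}`, vary: true };
}
```

Here `wantsMarkdown` stands in for the result of `prefersMarkdown(request.headers.get("accept"))`, so both representations of a negotiable URL carry the same `Vary` header.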
Summary
Brings the developer-hub up to passing on the agent-friendly docs check.

The site was failing four checks reported by `afdocs check`:
- `llms-txt-valid`: links not parseable.
- `llms-txt-directive-html`: no page advertised `/llms.txt`.
- `content-negotiation`: `Accept: text/markdown` was ignored.
- `page-size-html`: agents had to chew through 100K+ of HTML to get to docs.

…plus a `cache-header-hygiene` warning on `/llms.txt`.

What changed
- `/llms.txt` rewritten in the llmstxt.org link-list format (`- [name](url): description` under headings) so links parse cleanly. Cache dropped from 1 d to 1 h.
- `<link rel="alternate" type="text/plain" href="/llms.txt">` in `<head>` plus a visually-hidden anchor in `<body>` of the root layout, so every page advertises the index.
- `get-llm-text.ts` now prepends a blockquote pointing at `/llms.txt` to every markdown response.
- `src/proxy.ts` (Next.js 16 proxy/middleware) parses `Accept`, rewrites doc URLs to the existing `/mdx/[...slug]` route when the client prefers markdown, and sets `Vary: Accept` so CDNs don't mix HTML/markdown for the same URL. Skips `/`, `/api`, `/playground`, asset paths, and the static `/llms-*.txt` routes.

Verification
Ran `npx afdocs check http://localhost:3627/price-feeds/core/getting-started --fixes` against the local production build:
- `llms-txt-valid`
- `llms-txt-directive-html`
- `llms-txt-directive-md`
- `content-negotiation`
- `markdown-url-support`
- `page-size-html`
- `markdown-content-parity`

Manual spot-checks:
`pnpm build` and `pnpm exec tsc --noEmit` both pass.

Out of scope
- `content-start-position` (a new ⚠ warning, not in the failing-checks list): would need a Fumadocs layout reshuffle.
- `s-maxage=365d` on doc pages: that's the Next.js ISR default; the user's report only flagged `/llms.txt` for cache, which is now 1 h.

Test plan
- `npx afdocs check https://docs.pyth.network/ --fixes` and confirm the four checks above flip to ✓ and Agent Score climbs above 75/100
- `curl -H 'Accept: text/markdown' https://docs.pyth.network/price-feeds/core/getting-started` returns markdown
- `curl https://docs.pyth.network/llms.txt` returns the new link-list format with `Cache-Control: public, max-age=3600`