Add repo-controlled robots.txt allowing AI crawlers #43
Draft
TaprootFreak wants to merge 1 commit into
Add a version-controlled robots.txt that serves as the authoritative crawl policy for the JuiceDollar documentation site (docs.juicedollar.com). The file explicitly welcomes both search engines and AI agents to crawl, index, and learn from this public documentation.

- Wildcard group allows all user-agents and sets a Content-Signal granting search, AI input / RAG, and AI training.
- Major AI crawlers (ClaudeBot, GPTBot, Google-Extended, CCBot, Bytespider, Amazonbot, Applebot-Extended, meta-externalagent) are additionally listed by name, since some honor only their own record.

The file lives in src/.vuepress/public/, which VuePress copies verbatim to the published site root (dist/robots.txt). No Sitemap directive is included because the site does not currently publish a sitemap.xml.
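To make the policy described above concrete, here is a minimal sketch in Python that embeds a robots.txt of that shape and checks it with the standard library's urllib.robotparser. The file text is reconstructed from the description, not copied from the PR: grouping all named crawlers into one record, the line order, and the comment are assumptions, and both SomeOtherBot and the /some/page URL are hypothetical. Note that urllib.robotparser simply ignores the Content-Signal line, so this only exercises the Allow rules.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the described policy; the real file's
# layout, grouping, and comments may differ.
ROBOTS_TXT = """\
# Crawl policy for docs.juicedollar.com (illustrative reconstruction)
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=yes
Allow: /

User-agent: ClaudeBot
User-agent: GPTBot
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
User-agent: Amazonbot
User-agent: Applebot-Extended
User-agent: meta-externalagent
Allow: /
"""

# Crawler names as listed in the description above.
AI_CRAWLERS = [
    "ClaudeBot", "GPTBot", "Google-Extended", "CCBot",
    "Bytespider", "Amazonbot", "Applebot-Extended", "meta-externalagent",
]


def allowed_agents(robots_txt, agents, url):
    """Map each user-agent to whether the given robots.txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # "SomeOtherBot" is a hypothetical agent that falls back to the wildcard group.
    url = "https://docs.juicedollar.com/some/page"  # hypothetical page
    for agent, ok in allowed_agents(ROBOTS_TXT, AI_CRAWLERS + ["SomeOtherBot"], url).items():
        print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

A check like this could run in CI to catch an accidental Disallow before it reaches the published site, though whether each crawler actually honors the file is outside what a parser can verify.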
Summary
Adds a version-controlled robots.txt that serves as the authoritative crawl policy for the JuiceDollar documentation site (docs.juicedollar.com). The site is public documentation, and we explicitly want both search engines and AI agents to crawl, index, and learn from it.

What it does
The wildcard group allows all user-agents (Allow: /) and sets a Content-Signal granting search=yes, ai-input=yes (AI input / RAG), and ai-train=yes (AI training). We deliberately do not signal ai-train=no.

Placement
The file lives in src/.vuepress/public/, which VuePress copies verbatim to the published site root, so it is served at /robots.txt. Verified with a local npm run build: the file appears byte-identical at src/.vuepress/dist/robots.txt.

Sitemap
No Sitemap: directive is included: the build does not generate a sitemap.xml (no sitemap plugin is configured) and https://docs.juicedollar.com/sitemap.xml currently returns 404. The directive can be added later if a sitemap is introduced.

Notes
The site is published via Cloudflare Pages, which serves repository static assets as-is; no platform-specific configuration is required for this file.
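The byte-identical check mentioned under Placement can be repeated with a few lines of Python. This is only a sketch of one way to do it, not the procedure the author used; the two paths come from the description above, and the helper name copied_verbatim is hypothetical.

```python
import filecmp
from pathlib import Path

# Paths from the description above, relative to the repository root.
SOURCE = Path("src/.vuepress/public/robots.txt")
BUILT = Path("src/.vuepress/dist/robots.txt")


def copied_verbatim(source: Path, built: Path) -> bool:
    """Return True only if both files exist and contain identical bytes."""
    if not (source.is_file() and built.is_file()):
        return False
    # shallow=False compares file contents rather than trusting os.stat() data.
    return filecmp.cmp(source, built, shallow=False)


if __name__ == "__main__":
    # Run from the repository root after `npm run build`.
    print("byte-identical" if copied_verbatim(SOURCE, BUILT) else "missing or different")
```

Because the hosting platform is described as serving static assets as-is, comparing the source file with the build output should be enough to know what will be published.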