SiteMCP

Warning

This repository is archived. Please read the notice below before using this project.

Archival Notice

SiteMCP was created on 8 April 2025, in the very early days of the Model Context Protocol (original tweet, introduction post, Japanese). At the time, MCP was a brand-new concept, and passing documentation to AI assistants via an MCP server was one of the few practical options available. Claude Code was available only under a usage-based plan and had not yet reached wide adoption, and features like skills did not yet exist. (For that reason, all the demos in this repository were recorded with Claude Desktop.)

Back then, coding agents were far less capable and LLM-native web search was unreliable at best, so fetching an entire website and serving it through MCP was genuinely useful. SiteMCP was arguably one of the earliest and most practical ways to provide documentation to a local AI assistant.

The landscape has since changed significantly. MCP itself went through a period of rapid growth followed by a correction, and the community converged on simpler, more efficient alternatives for static documentation: publishing docs as downloadable Markdown files, distributing them via npm packages, or falling back to web search as a last resort. In my view, using MCP to serve static documentation has become an anti-pattern — the overhead is simply not worth it. I wrote about this in detail on my blog:

One year after its initial release, I have decided to archive SiteMCP. If you need to provide documentation to an AI assistant, please follow the guidance in my blog posts above and fetch documentation as statically as possible.

Fetch an entire site and use it as an MCP Server

Demo: svelte-claude-en-m.mov

Demo in Japanese: svelte-claude-m.mov

Note

sitemcp is a fork of sitefetch by @egoist

Install

One-off usage (choose one of the following):

bunx sitemcp
npx sitemcp
pnpx sitemcp

Install globally (choose one of the following):

bun i -g sitemcp
npm i -g sitemcp
pnpm i -g sitemcp

Usage

sitemcp https://daisyui.com

# or with higher concurrency
sitemcp https://daisyui.com --concurrency 10

Tool Name Strategy

Use -t, --tool-name-strategy to specify the tool name strategy (default: domain). The name derived by the strategy is used as the MCP server name, and the comments below show the resulting tool names.

sitemcp https://vite.dev -t domain # indexOfVite / getDocumentOfVite
sitemcp https://react-tweet.vercel.app/ -t subdomain # indexOfReactTweet / getDocumentOfReactTweet
sitemcp https://ryoppippi.github.io/vite-plugin-favicons/ -t pathname # indexOfVitePluginFavicons / getDocumentOfVitePluginFavicons
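The mapping from URL to tool names in the comments above can be approximated as follows. This is only a sketch inferred from those three examples, not sitemcp's actual implementation; the names `baseName`, `toPascalCase`, and `toolNames` are hypothetical, and the `domain` case naively takes the label before the last dot, so it does not handle multi-part suffixes such as `.co.uk`.

```typescript
type Strategy = "domain" | "subdomain" | "pathname";

// "react-tweet" -> "ReactTweet", "vite-plugin-favicons" -> "VitePluginFavicons"
function toPascalCase(input: string): string {
  return input
    .split(/[^a-zA-Z0-9]+/)
    .filter(Boolean)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

// Pick the part of the URL that the chosen strategy is assumed to use.
function baseName(url: string, strategy: Strategy): string {
  const { hostname, pathname } = new URL(url);
  const labels = hostname.split(".");
  switch (strategy) {
    case "subdomain":
      // first host label: react-tweet.vercel.app -> react-tweet
      return labels[0];
    case "pathname": {
      // first path segment: /vite-plugin-favicons/ -> vite-plugin-favicons
      const segments = pathname.split("/").filter(Boolean);
      return segments.length > 0 ? segments[0] : labels[0];
    }
    default:
      // "domain": label before the last dot (naive): vite.dev -> vite
      return labels.length >= 2 ? labels[labels.length - 2] : labels[0];
  }
}

// Hypothetical helper: build the two tool names shown in the comments above.
function toolNames(url: string, strategy: Strategy = "domain") {
  const name = toPascalCase(baseName(url, strategy));
  return { index: `indexOf${name}`, getDocument: `getDocumentOf${name}` };
}

console.log(toolNames("https://vite.dev", "domain"));
console.log(toolNames("https://react-tweet.vercel.app/", "subdomain"));
console.log(toolNames("https://ryoppippi.github.io/vite-plugin-favicons/", "pathname"));
```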

Max Length of Content

Use -l, --max-length to specify the maximum content length (default: 2000 characters). This is useful for sites with long content, such as blogs or documentation. The acceptable content length depends on your MCP client, so check its documentation for details. Feel free to open an issue if you have any questions.

sitemcp https://vite.dev -l 10000

Match specific pages

Use the -m, --match flag to specify the pages you want to fetch:

sitemcp https://vite.dev -m "/blog/**" -m "/guide/**"

The match pattern is tested against the pathname of each target page. Matching is powered by micromatch; see its documentation for all the supported matching features.
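To illustrate what "tested against the pathname" means, here is a minimal sketch that uses a simplified glob-to-RegExp conversion as a stand-in for micromatch. It supports only `*` (within one path segment) and `**` (across segments), and its edge-case behaviour may differ from micromatch. The helper names `globToRegExp` and `shouldFetch` are hypothetical.

```typescript
// Simplified stand-in for micromatch: supports only "*" and "**".
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters, leaving "*" for the glob handling below.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // temporary placeholder for "**"
    .replace(/\*/g, "[^/]*") // "*" stays within a single path segment
    .replace(/\u0000/g, ".*"); // "**" may span several segments
  return new RegExp(`^${pattern}$`);
}

// A page is fetched if its pathname matches at least one pattern.
function shouldFetch(pageUrl: string, patterns: string[]): boolean {
  const { pathname } = new URL(pageUrl);
  return patterns.some((p) => globToRegExp(p).test(pathname));
}

const patterns = ["/blog/**", "/guide/**"];
console.log(shouldFetch("https://vite.dev/blog/announcing-vite6", patterns)); // true
console.log(shouldFetch("https://vite.dev/config/", patterns)); // false
```

Only the pathname takes part in the comparison, so the same patterns apply regardless of the host.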

Content selector

We use mozilla/readability to extract readable content from each web page, but on some pages it might return irrelevant content. In that case, you can specify a CSS selector so we know where to find the readable content:

sitemcp https://vite.dev --content-selector ".content"

How to configure with MCP Client

You can run the server from your MCP client (e.g. Claude Desktop).

Below is an example configuration for Claude Desktop:

{
  "mcpServers": {
    "daisy-ui": {
      "command": "npx",
      "args": [
        "-y",
        "sitemcp",
        "https://daisyui.com",
        "-m",
        "/components/**"
      ]
    }
  }
}
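If you register several sites, the same structure can be generated rather than written by hand. The following is a small sketch that rebuilds the configuration shown above. The function `buildSitemcpConfig` and the two interfaces are hypothetical helpers, not part of sitemcp, and only the flags documented in this README are used.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: build one "mcpServers" entry that runs sitemcp via npx.
function buildSitemcpConfig(
  serverName: string,
  siteUrl: string,
  matchPatterns: string[] = [],
): McpConfig {
  const args = ["-y", "sitemcp", siteUrl];
  for (const pattern of matchPatterns) {
    args.push("-m", pattern); // one -m flag per pattern
  }
  return { mcpServers: { [serverName]: { command: "npx", args } } };
}

const config = buildSitemcpConfig("daisy-ui", "https://daisyui.com", [
  "/components/**",
]);
console.log(JSON.stringify(config, null, 2));
```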

Tips

Some sites have a lot of pages, so it is better to run sitemcp once before registering the server with your MCP client. sitemcp caches the pages in ~/.cache/sitemcp by default. You can disable caching with the --no-cache flag.

License

MIT.
