| 1 | +--- |
| 2 | +title: "I built an open protocol that makes any website readable by AI. Here's why." |
| 3 | +published: false |
| 4 | +description: "robots.txt tells machines to stay away. FlyWeb tells them: here's what I have." |
| 5 | +tags: ai, webdev, opensource, protocol |
| 6 | +cover_image: |
| 7 | +--- |
| 8 | + |
| 9 | +Every AI agent on the internet right now is doing the same thing: scraping HTML, guessing what's content, and hallucinating when it gets it wrong. |
| 10 | + |
| 11 | +Think about it. When ChatGPT, Claude, or Perplexity tries to answer a question using your website, here's what actually happens: |
| 12 | + |
| 13 | +1. It fetches your HTML |
| 14 | +2. It parses through `<div class="mx-4 flex items-center gap-2">` |
| 15 | +3. It guesses which part is the article and which part is the sidebar ad |
| 16 | +4. It generates an answer |
| 17 | +5. It gives you **zero credit** |
| 18 | + |
| 19 | +There's no standard way for a machine to ask: *"What content do you have?"* |
| 20 | + |
| 21 | +Every API is bespoke. Every integration is custom. There's no universal protocol. |
| 22 | + |
| 23 | +**Until now.** |
| 24 | + |
| 25 | +## Introducing FlyWeb |
| 26 | + |
| 27 | +FlyWeb is an open protocol — one JSON file at `/.well-known/flyweb.json` — that lets any website describe its content in a way machines can understand. |
| 28 | + |
| 29 | +Think of it as the inverse of `robots.txt`. Instead of "stay away," it says **"here's what I have."** |
| 30 | + |
| 31 | +```json |
| 32 | +{ |
| 33 | + "flyweb": "1.0", |
| 34 | + "entity": "My Tech Blog", |
| 35 | + "type": "blog", |
| 36 | + "attribution": { |
| 37 | + "required": true, |
| 38 | + "must_link": true |
| 39 | + }, |
| 40 | + "resources": { |
| 41 | + "posts": { |
| 42 | + "path": "/.flyweb/posts", |
| 43 | + "format": "jsonl", |
| 44 | + "fields": ["title", "author", "date", "tags", "content", "url"], |
| 45 | + "access": "free", |
| 46 | + "query": "?tag={tag}&limit={n}" |
| 47 | + } |
| 48 | + } |
| 49 | +} |
| 50 | +``` |
| 51 | + |
| 52 | +That's the entire config. One file. Any AI agent that finds it instantly knows: |
| 53 | +- What content you have (posts, products, articles, etc.) |
| 54 | +- Where to get it (clean JSON, not HTML soup) |
| 55 | +- How to query it (standard URL params) |
| 56 | +- How to credit you (attribution requirements are declared in the config itself) |
| 57 | + |
| 58 | +## The Three Layers |
| 59 | + |
| 60 | +FlyWeb works in three layers: |
| 61 | + |
| 62 | +### 1. Discovery |
| 63 | +AI agents fetch `/.well-known/flyweb.json`. It's the first thing they check — like how crawlers check `robots.txt`, but for structured data. |
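The discovery step can be sketched in a few lines of TypeScript. This is a minimal illustration, not the official client: the `FlyWebConfig` and `FlyWebResource` interfaces, `discoveryUrl`, and `parseConfig` are hypothetical names, and the shape check covers only the fields shown in the example config above. The spec may require more.

```typescript
// Hypothetical minimal types, limited to the fields shown in the example config.
interface FlyWebResource {
  path: string;
  format: string;
  fields?: string[];
  access?: string;
  query?: string;
}

interface FlyWebConfig {
  flyweb: string;
  entity: string;
  type: string;
  attribution?: { required?: boolean; must_link?: boolean; license?: string };
  resources: Record<string, FlyWebResource>;
}

// Build the well-known discovery URL for a site origin, tolerating trailing slashes.
function discoveryUrl(origin: string): string {
  return origin.replace(/\/+$/, "") + "/.well-known/flyweb.json";
}

// Parse a response body; return null if it is not JSON or lacks the basic shape.
function parseConfig(body: string): FlyWebConfig | null {
  let data: unknown;
  try {
    data = JSON.parse(body);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) return null;
  const candidate = data as Partial<FlyWebConfig>;
  if (typeof candidate.flyweb !== "string") return null;
  const resources = candidate.resources;
  if (typeof resources !== "object" || resources === null || Array.isArray(resources)) return null;
  return candidate as FlyWebConfig;
}

// Usage: the body would normally come from an HTTP GET of discoveryUrl(origin).
const exampleConfig = parseConfig(
  '{"flyweb":"1.0","entity":"My Tech Blog","type":"blog","resources":{"posts":{"path":"/.flyweb/posts","format":"jsonl"}}}'
);
const exampleResourceNames = exampleConfig ? Object.keys(exampleConfig.resources) : [];
```

Keeping the parsing separate from the network call means an agent can treat "no file", "invalid JSON", and "wrong shape" the same way: fall back to whatever it did before.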
| 64 | + |
| 65 | +### 2. Structure |
| 66 | +Content is served as clean JSON or JSONL at the paths you define. No parsing HTML. No guessing. Just data. |
| 67 | + |
| 68 | +``` |
| 69 | +GET /.flyweb/posts |
| 70 | +``` |
| 71 | + |
| 72 | +Returns: |
| 73 | +```json |
| 74 | +{"title": "Why AI Needs Structure", "author": "Sarah Chen", "date": "2026-02-15", "content": "..."} |
| 75 | +{"title": "The Future of Web Protocols", "author": "Sarah Chen", "date": "2026-02-10", "content": "..."} |
| 76 | +``` |
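Because the response is JSONL (one JSON object per line), a consumer needs only a line splitter and `JSON.parse`. The sketch below assumes a hypothetical `PostRecord` type based on the fields in the example config; `parseJsonl` is an illustrative helper, not part of any FlyWeb package.

```typescript
// Hypothetical record shape, based on the fields listed in the example config.
interface PostRecord {
  title: string;
  author: string;
  date: string;
  content: string;
  tags?: string[];
  url?: string;
}

// Split a JSONL body into records: one JSON object per non-empty line.
function parseJsonl(body: string): PostRecord[] {
  return body
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as PostRecord);
}

// Usage with the two sample lines shown above.
const sampleBody = [
  '{"title": "Why AI Needs Structure", "author": "Sarah Chen", "date": "2026-02-15", "content": "..."}',
  '{"title": "The Future of Web Protocols", "author": "Sarah Chen", "date": "2026-02-10", "content": "..."}',
].join("\n");

const examplePosts = parseJsonl(sampleBody);
const exampleTitles = examplePosts.map((post) => post.title);
```

A malformed line makes `JSON.parse` throw here; a more forgiving consumer might skip bad lines instead.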
| 77 | + |
| 78 | +### 3. Query |
| 79 | +Standard URL parameters. No SDK. No API key. No OAuth dance. |
| 80 | + |
| 81 | +``` |
| 82 | +GET /.flyweb/posts?tag=ai&limit=5 |
| 83 | +``` |
| 84 | + |
| 85 | +Any AI agent can construct this. It's just a URL. |
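To make "it's just a URL" concrete, here is one way an agent might expand the `query` template from the config (`?tag={tag}&limit={n}`) into a request URL. The helper names `fillQuery` and `buildResourceUrl` are hypothetical, and the rule that placeholders without a supplied value are dropped is an assumption of this sketch, not something the post specifies.

```typescript
// Fill a template such as "?tag={tag}&limit={n}" with URL-encoded values.
// Assumption: a parameter whose placeholder has no supplied value is omitted.
function fillQuery(template: string, values: Record<string, string | number>): string {
  const filled: string[] = [];
  for (const pair of template.replace(/^\?/, "").split("&")) {
    const eq = pair.indexOf("=");
    if (eq < 0) continue; // skip anything that is not key=value
    const key = pair.slice(0, eq);
    const match = /^\{(\w+)\}$/.exec(pair.slice(eq + 1));
    if (match === null) {
      filled.push(pair); // literal value, keep unchanged
      continue;
    }
    const placeholder = match[1] ?? "";
    if (Object.prototype.hasOwnProperty.call(values, placeholder)) {
      filled.push(`${key}=${encodeURIComponent(String(values[placeholder]))}`);
    }
  }
  return filled.length > 0 ? `?${filled.join("&")}` : "";
}

// Join origin, resource path, and the filled query string.
function buildResourceUrl(
  origin: string,
  path: string,
  template: string,
  values: Record<string, string | number>
): string {
  return origin.replace(/\/+$/, "") + path + fillQuery(template, values);
}

// Usage: values are keyed by placeholder name ("tag", "n"), not by parameter name.
const exampleQueryUrl = buildResourceUrl(
  "https://example.com",
  "/.flyweb/posts",
  "?tag={tag}&limit={n}",
  { tag: "ai", n: 5 }
);
```

With these inputs the result is `https://example.com/.flyweb/posts?tag=ai&limit=5`, matching the request shown above.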
| 86 | + |
| 87 | +## Before & After |
| 88 | + |
| 89 | +**Without FlyWeb**, an AI agent sees this: |
| 90 | + |
| 91 | +```html |
| 92 | +<div class="post-container mx-4"> |
| 93 | + <div class="flex items-center gap-2"> |
| 94 | + <img src="/avatars/sarah.jpg" /> |
| 95 | + <span class="text-sm">Sarah Chen</span> |
| 96 | + </div> |
| 97 | + <h2 class="font-bold mt-4"> |
| 98 | + AI Agents Need Structure |
| 99 | + </h2> |
| 100 | + <div class="prose mt-2"> |
| 101 | + <p>The web was built for...</p> |
| 102 | + <div class="ad-banner">BUY NOW</div> |
| 103 | + </div> |
| 104 | +</div> |
| 105 | +``` |
| 106 | + |
| 107 | +Which part is the article? Where does content end? Is "Sarah Chen" the author or a commenter? The AI has to *guess*. |
| 108 | + |
| 109 | +**With FlyWeb**, the same AI agent gets: |
| 110 | + |
| 111 | +```json |
| 112 | +{ |
| 113 | + "title": "AI Agents Need Structure", |
| 114 | + "author": "Sarah Chen", |
| 115 | + "date": "2026-02-15", |
| 116 | + "tags": ["ai", "web"], |
| 117 | + "content": "The web was built for...", |
| 118 | + "url": "https://example.com/posts/42" |
| 119 | +} |
| 120 | +``` |
| 121 | + |
| 122 | +Clean. Structured. Zero guessing. |
| 123 | + |
| 124 | +## Attribution Is Non-Negotiable |
| 125 | + |
| 126 | +Here's what makes FlyWeb different from just "another API": |
| 127 | + |
| 128 | +```json |
| 129 | +"attribution": { |
| 130 | + "required": true, |
| 131 | + "license": "CC-BY-4.0", |
| 132 | + "must_link": true |
| 133 | +} |
| 134 | +``` |
| 135 | + |
| 136 | +This is declared at the protocol level. When a compliant AI agent reads your content through FlyWeb, it knows it **must** credit you. Must link back. Non-negotiable. |
| 137 | + |
| 138 | +You can set price to zero. You can never set attribution to zero. |
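On the consuming side, honoring these fields could look something like the sketch below. The `Attribution` type mirrors the three keys shown above; `creditLine` and its output format are purely illustrative assumptions, since the post does not say how a credit should be rendered.

```typescript
// Mirrors the attribution keys shown in the config snippet above.
interface Attribution {
  required?: boolean;
  license?: string;
  must_link?: boolean;
}

// Build a credit line for a piece of content.
// The source is always named; the link and license are added when the config asks for them.
function creditLine(entity: string, url: string, attribution: Attribution): string {
  const parts: string[] = [`Source: ${entity}`];
  if (attribution.must_link) parts.push(`(${url})`);
  if (attribution.license) parts.push(`[${attribution.license}]`);
  return parts.join(" ");
}

// Usage with the values from the examples in this post.
const exampleCredit = creditLine("My Tech Blog", "https://example.com/posts/42", {
  required: true,
  license: "CC-BY-4.0",
  must_link: true,
});
```

An agent would append a line like this to any answer that draws on the fetched content.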
| 139 | + |
| 140 | +## 5 Minutes to Add It |
| 141 | + |
| 142 | +### Option 1: Framework plugin |
| 143 | + |
| 144 | +```bash |
| 145 | +# Next.js |
| 146 | +npm i next-flyweb |
| 147 | + |
| 148 | +# Astro |
| 149 | +npm i astro-flyweb |
| 150 | + |
| 151 | +# SvelteKit |
| 152 | +npm i sveltekit-flyweb |
| 153 | + |
| 154 | +# Nuxt |
| 155 | +npm i nuxt-flyweb |
| 156 | + |
| 157 | +# Express / Node.js |
| 158 | +npm i express-flyweb |
| 159 | +``` |
| 160 | + |
| 161 | +### Option 2: CLI |
| 162 | + |
| 163 | +```bash |
| 164 | +npx flyweb init |
| 165 | +``` |
| 166 | + |
| 167 | +This generates a `flyweb.json` template. Fill in your entity name, type, and resources. Done. |
| 168 | + |
| 169 | +### Option 3: WordPress |
| 170 | + |
| 171 | +Install the [FlyWeb WordPress plugin](https://github.com/flywebprotocol/flyweb/tree/master/wordpress-flyweb). It auto-generates `flyweb.json` from your posts, pages, and WooCommerce products. |
| 172 | + |
| 173 | +### Validate it |
| 174 | + |
| 175 | +```bash |
| 176 | +npx flyweb check https://your-site.com |
| 177 | +``` |
| 178 | + |
| 179 | +``` |
| 180 | +✓ Found /.well-known/flyweb.json |
| 181 | +✓ Valid FlyWeb v1.0 config |
| 182 | +✓ Attribution: required |
| 183 | +✓ Resources: posts, products |
| 184 | +All checks passed! |
| 185 | +``` |
| 186 | + |
| 187 | +## For AI Developers: Client SDK |
| 188 | + |
| 189 | +If you're building AI agents that *consume* web content, FlyWeb gives you structured data instead of HTML scraping: |
| 190 | + |
| 191 | +```typescript |
| 192 | +import { discover, fetchResource } from 'flyweb/client'; |
| 193 | + |
| 194 | +// Discover what a site exposes |
| 195 | +const site = await discover('https://example.com'); |
| 196 | +console.log(site.config.resources); |
| 197 | +// { articles: { path: "/.flyweb/articles", format: "jsonl", ... } } |
| 198 | + |
| 199 | +// Fetch structured data |
| 200 | +const articles = await fetchResource( |
| 201 | + 'https://example.com', |
| 202 | + site.config.resources.articles, |
| 203 | + { params: { tag: 'ai' }, limit: 10 } |
| 204 | +); |
| 205 | +// Clean JSON array. No scraping. No hallucination. |
| 206 | +``` |
| 207 | + |
| 208 | +## For AI Coding Tools: MCP Server |
| 209 | + |
| 210 | +If you use Claude Code, Cursor, or any MCP-compatible tool, add FlyWeb to your config: |
| 211 | + |
| 212 | +```json |
| 213 | +{ |
| 214 | + "mcpServers": { |
| 215 | + "flyweb": { |
| 216 | + "command": "npx", |
| 217 | + "args": ["-y", "flyweb-mcp"] |
| 218 | + } |
| 219 | + } |
| 220 | +} |
| 221 | +``` |
| 222 | + |
| 223 | +Your AI assistant can now: |
| 224 | +- `flyweb_discover` — Check if any site has FlyWeb |
| 225 | +- `flyweb_fetch` — Pull structured data from FlyWeb sites |
| 226 | +- `flyweb_validate` — Validate a flyweb.json you're writing |
| 227 | +- `flyweb_generate` — Generate a flyweb.json for any project |
| 228 | + |
| 229 | +## Why This Matters Now |
| 230 | + |
| 231 | +We're at an inflection point. AI agents are becoming a primary way people find information. If your content isn't machine-readable, it's invisible to this entire new layer of the internet. |
| 232 | + |
| 233 | +SEO took roughly two decades to become standard practice. The AI equivalent, making your site structured for machines, needs to happen now. Not in 5 years, when every AI company has built its own proprietary scraping pipeline. Now, while we can still build an **open standard**. |
| 234 | + |
| 235 | +The flywheel is simple: |
| 236 | + |
| 237 | +``` |
| 238 | +AI agents build sites with FlyWeb |
| 239 | + ↓ |
| 240 | +Other AI agents read structured data from those sites |
| 241 | + ↓ |
| 242 | +FlyWeb proves its value (clean data > scraping) |
| 243 | + ↓ |
| 244 | +More AI agents add it by default |
| 245 | + ↓ |
| 246 | +The web becomes machine-readable |
| 247 | +``` |
| 248 | + |
| 249 | +## The Protocol Is Open |
| 250 | + |
| 251 | +- **Spec**: [github.com/flywebprotocol/flyweb/SPEC.md](https://github.com/flywebprotocol/flyweb/blob/master/SPEC.md) |
| 252 | +- **Website**: [flyweb.io](https://flyweb.io) |
| 253 | +- **Docs**: [flyweb.io/docs](https://flyweb.io/docs) |
| 254 | +- **npm**: [npmjs.com/package/flyweb](https://www.npmjs.com/package/flyweb) |
| 255 | +- **GitHub**: [github.com/flywebprotocol/flyweb](https://github.com/flywebprotocol/flyweb) |
| 256 | +- **MCP Server**: [npmjs.com/package/flyweb-mcp](https://www.npmjs.com/package/flyweb-mcp) |
| 257 | + |
| 258 | +MIT licensed. No vendor lock-in. No payment required. Just an open protocol for a machine-readable web. |
| 259 | + |
| 260 | +--- |
| 261 | + |
| 262 | +**Add FlyWeb to your site in 5 minutes.** `npx flyweb init` and your content stops being invisible to AI. |
| 263 | + |
| 264 | +If you think the web should be readable by machines — not just humans — [star the repo](https://github.com/flywebprotocol/flyweb) and add it to your next project. |
| 265 | + |
| 266 | +*The web was built for human eyes. It's time to give machines a structured way in.* |