Date: February 2026 Status: Strategic Development Roadmap Scope: Phases 3 through 6 (Months 1-18) Audience: Founding team, engineering leads, investors Builds on: PRODUCT_STRATEGY.md (Phases 1-2 complete), LANDING_PAGE_PLAN.md, COMPETITOR_LANDING_ANALYSIS.md
BetterCodeWiki has completed its foundational phases: a redesigned UI with 3D landing page, Ask AI, dependency graphs, global search, multi-format export, and reading mode. The product is a compelling single-player web application. The next 18 months must transform it from a web tool into a platform -- one that lives inside developers' workflows, enables team collaboration, and builds network effects that create an uncatchable lead.
The thesis for Phase 3+ is:
The winner in AI code understanding will be the tool that becomes ambient -- always present in the IDE, always current with the code, always improving from team knowledge. BetterCodeWiki must go from "a place you visit" to "a layer that surrounds your code."
The plan is structured in four phases:
| Phase | Timeline | Theme | Key Outcome |
|---|---|---|---|
| Phase 3 | Months 1-3 | IDE & Real-Time | VS Code extension + GitHub App + live docs |
| Phase 4 | Months 3-6 | Collaboration & Teams | Multi-user workspaces, real-time editing, integrations |
| Phase 5 | Months 6-12 | Platform & Ecosystem | Plugin marketplace, API platform, enterprise |
| Phase 6 | Months 12-18 | Market Domination | Network effects, verticals, international, acquisitions |
Revenue target: $0 -> $50K MRR (Month 6) -> $250K MRR (Month 12) -> $1M+ MRR (Month 18)
Core conviction: GitHub Copilot proved developers will pay $20/month for AI that saves time in the IDE. Cursor proved they will pay $20/month for a better AI-IDE experience. BetterCodeWiki will prove they will pay $19-29/month for AI that eliminates the 60% of developer time spent reading and understanding code.
Market size and growth:
- The global developer tools market reached approximately $22-24 billion in 2025, growing at 18-22% CAGR.
- AI-powered developer tools specifically are growing at 25-40% CAGR, the fastest-growing sub-segment.
- Code understanding and documentation tools represent an estimated $500M-$1B sub-segment, projected to reach $2-3B by 2028.
Key market shifts observed in 2025:
- AI moved from "assistant" to "agent." GitHub Copilot Workspace, Cursor Composer, Claude Code, and Devin showed that AI is moving from suggesting code snippets to executing multi-step development tasks autonomously. Documentation generation fits naturally into this agent paradigm.
- IDE-first became mandatory. The success of Cursor (which reached $100M+ ARR by late 2025) and the growth of Continue.dev (20K+ GitHub stars, 500K+ VS Code installs) proved that developer tools must live in the IDE. Web-only tools are increasingly seen as friction.
- RAG for code matured. Retrieval-augmented generation for codebases -- using vector embeddings of code files for context-aware AI responses -- moved from experimental to production-ready. BetterCodeWiki's existing FAISS-based RAG system (in
api/rag.py) is a strong foundation, but competitors are adopting more sophisticated chunking strategies (AST-aware, semantic), re-ranking models, and hybrid search. - Multi-modal AI became real. Claude 3.5, GPT-4o, and Gemini 1.5/2.0 can now process images, diagrams, and code simultaneously. This unlocks analyzing architecture diagrams, Figma designs, and whiteboard photos alongside code -- a feature no competitor offers yet.
- Enterprise AI governance emerged. Large organizations now require audit trails for AI interactions, data residency controls, and model selection policies. BetterCodeWiki's multi-provider architecture and self-hosting option are perfectly positioned for this.
How successful IDE extensions work:
| Extension | Architecture | Key Pattern | Stars/Installs |
|---|---|---|---|
| GitHub Copilot | Cloud-only; thin VS Code client sends context to GitHub servers; inline completions via ghost text | Language Server Protocol (LSP) for deep editor integration; sidebar chat panel for Q&A | 15M+ installs |
| Cursor | Forked VS Code entirely; custom AI pipeline replaces built-in features; Composer for multi-file edits | Full IDE fork gives maximum control but limits distribution to Cursor users only | 2M+ users |
| Continue.dev | Open-source VS Code/JetBrains extension; connects to any LLM provider; sidebar chat + inline edits | Extension-only approach (no fork); modular provider system; local model support via Ollama | 20K+ stars, 500K+ installs |
| Cody (Sourcegraph) | VS Code extension + web app; uses Sourcegraph's code graph for context; chat + autocomplete + commands | Combines code search index with LLM; "codebase-aware" context through pre-indexed code graph | 10K+ stars |
| Mintlify Doc Writer | VS Code extension that generates docstrings for functions; lightweight, single-purpose | Simple, focused value prop; generates JSDoc/docstring on command; no sidebar, no chat | 500K+ installs |
Critical lesson: The most successful extensions combine (a) a sidebar panel for exploration/chat, (b) inline editor integration (hover tooltips, code actions), and (c) a connection to a cloud backend that provides the heavy computation. BetterCodeWiki should follow Continue.dev's open architecture rather than Cursor's fork approach.
What is now possible that was not possible 12 months ago:
-
1M+ token context windows. Gemini 1.5 Pro (2M tokens) and Claude 3.5 (200K tokens, with retrieval) can now ingest entire medium-sized codebases in a single prompt. This changes the architecture: instead of chunking + RAG for small repos, we can do full-context analysis. RAG remains essential for large repos (>500 files) but the threshold has moved dramatically.
-
AST-aware code embeddings. Research from Microsoft (CodeBERT, GraphCodeBERT) and open-source projects (Voyage AI's code embeddings, Jina's code models) show that embeddings that understand code structure (not just text) improve retrieval accuracy by 15-30% for code Q&A. BetterCodeWiki's current embedding approach (in
api/data_pipeline.py) uses general-purpose text embeddings -- upgrading to code-specific embeddings is a significant quality improvement. -
Agentic documentation generation. Instead of single-pass "generate docs for this file," agent frameworks (LangGraph, CrewAI, AutoGen) enable multi-step flows: (a) analyze repo structure, (b) identify architectural patterns, (c) generate docs for each module, (d) cross-reference and validate, (e) generate diagrams. BetterCodeWiki's websocket-based wiki generation (in
api/websocket_wiki.py) already does a version of this; the opportunity is to make it more sophisticated. -
Real-time incremental analysis. Instead of regenerating entire wikis, diff-based analysis can update only the sections affected by a code change. This requires: git diff parsing, file-to-wiki-section mapping, and incremental regeneration. No competitor does this well yet.
-
Multi-modal code analysis. AI can now analyze:
- Screenshots of UIs and map them to component code
- Architecture diagrams (hand-drawn or digital) and validate them against actual code structure
- Figma designs and generate component documentation
- Database schema diagrams and map them to ORM code This is completely uncharted territory for documentation tools.
| Competitor | Strengths | Weaknesses | Threat Level |
|---|---|---|---|
| Google CodeWiki | Google brand, free, Gemini integration | Closed source, Gemini-only, no IDE extension, no community, "Google Graveyard" risk | High (brand) / Medium (product) |
| Swimm | Team-focused, keeps docs in code, IDE extension exists | Closed source, no AI generation (manual docs), limited to inline comments, $30/seat | Medium |
| ReadMe | Beautiful API docs, developer hub, custom domains | Focused on API docs only (not codebase understanding), no AI generation, expensive ($99+/mo) | Low |
| Mintlify | Fast doc generation, CLI-based, nice templates | Documentation site builder (not codebase wiki), manual content, no AI code analysis | Low |
| GitBook | Established, team collaboration, good UX | General-purpose docs (not code-specific), no AI analysis, slow to adopt AI | Low-Medium |
| Notion AI | Massive user base, AI features, team collaboration | Not code-specific, no repo integration, no diagrams, general-purpose | Low |
| Sourcegraph Cody | Deep code search, code graph, IDE extension | Pivoting away from enterprise search; uncertain product direction; complex setup | Medium |
| GitHub Copilot Chat | GitHub integration, massive distribution, free for OSS | Not documentation-focused, no wiki generation, no diagrams, no export | High (adjacent threat) |
Key insight: No competitor combines all of: AI-generated wiki + IDE extension + real-time sync + multi-provider AI + self-hosting + open source. BetterCodeWiki can be first to market with this complete package.
The collaboration layer is where the highest willingness-to-pay resides:
- Linear ($8-12/seat/month) proved developers want collaboration tools that feel as good as consumer apps
- Notion ($8-18/seat/month) proved that collaborative knowledge bases command premium pricing
- Figma ($12-75/seat/month) proved that real-time collaborative editing creates massive switching costs
- Slack ($7.25-12.50/seat/month) proved that team communication tools become essential infrastructure
Pattern: Tools that start as single-player (free) and add collaboration (paid) achieve 3-5x higher ARPU than tools that stay single-player. BetterCodeWiki must follow this trajectory.
What works for product-led growth in dev tools:
- "Badge in README" distribution (Shields.io model) -- a "Docs powered by BetterCodeWiki" badge in GitHub READMEs creates viral awareness. Every README view is an impression.
- Public wiki directory as SEO engine -- hosting AI-generated docs for popular OSS projects creates a searchable index. When developers Google "how does React's reconciler work," BetterCodeWiki's wiki page should rank.
- GitHub Action / App for frictionless adoption -- install once, get auto-generated docs on every push. No ongoing effort required from the team.
- IDE extension as daily touchpoint -- web apps get visited occasionally; IDE extensions are used every day. Daily active usage drives conversion.
- "Invite your team" as upgrade trigger -- the single most effective PLG conversion moment is when a user wants to share their wiki with colleagues.
- Developer content marketing -- blog posts like "Understanding the React codebase with BetterCodeWiki" or "How Kubernetes networking actually works" drive organic traffic and demonstrate value.
Freemium conversion benchmarks for dev tools:
- Free -> Paid conversion: 2-5% is typical; best-in-class (Cursor, Vercel) achieve 5-10%
- Monthly churn for paid: 3-7% for Pro; <2% for Team/Enterprise
- Time to conversion: median 14-30 days from first use
- Expansion revenue: 20-40% of revenue growth from existing customers upgrading tiers
Product vision: A sidebar panel in VS Code that shows the BetterCodeWiki documentation for whatever code the developer is currently reading or editing. The wiki follows the developer's cursor.
VS Code Extension (TypeScript)
├── extension.ts # Activation, command registration
├── providers/
│ ├── SidebarProvider.ts # WebView panel showing wiki content
│ ├── HoverProvider.ts # Tooltip wiki summaries on hover
│ ├── CodeLensProvider.ts # Inline "View Docs" links above functions
│ └── TreeDataProvider.ts # Wiki structure in Explorer sidebar
├── services/
│ ├── ApiClient.ts # HTTP/WS client to BetterCodeWiki backend
│ ├── ContextTracker.ts # Tracks active file, function, symbol
│ ├── CacheManager.ts # Local SQLite cache for wiki content
│ └── AuthManager.ts # Token/API key management
├── webview/
│ ├── WikiPanel.tsx # React app rendered in WebView
│ ├── AskPanel.tsx # Ask AI interface in sidebar
│ └── DiagramPanel.tsx # Mermaid diagram viewer
└── package.json # Extension manifest, contributes, activation events
Key technical decisions:
-
WebView-based sidebar (not native VS Code UI). This allows us to reuse BetterCodeWiki's existing React components (Markdown renderer, Mermaid diagrams, Ask AI) inside the extension. The tradeoff is slightly more memory usage vs. significant code reuse.
-
Context tracking via VS Code API. The extension listens to
vscode.window.onDidChangeActiveTextEditorandvscode.window.onDidChangeTextEditorSelectionto detect which file and function the developer is looking at. It then fetches the corresponding wiki page from the BetterCodeWiki backend. -
Tiered connection model:
- Cloud mode: Extension connects to BetterCodeWiki's hosted API (for Pro/Team users)
- Local mode: Extension connects to a locally-running BetterCodeWiki instance (for self-hosted/free users)
- Offline mode: Extension uses cached wiki content when no connection is available
-
Incremental wiki generation. The extension does not require the entire wiki to be pre-generated. When a developer opens a file that has no wiki page, the extension can generate it on-demand (single-file analysis) and cache the result.
| Feature | Free Tier | Pro Tier | Description |
|---|---|---|---|
| Wiki sidebar panel | View-only (cached) | Full interactive | Shows wiki content for current file/function |
| Hover documentation | -- | Yes | Hover over any symbol for a wiki tooltip |
| CodeLens links | -- | Yes | "View Docs" / "Ask AI" links above functions |
| Ask AI in sidebar | 5 questions/day | Unlimited | Chat with AI about the current code context |
| Diagram viewer | -- | Yes | View Mermaid diagrams inline in the sidebar |
| Wiki tree in Explorer | Yes | Yes | Full wiki structure in VS Code's Explorer panel |
| Generate wiki | 1 repo | Unlimited | Trigger wiki generation from the extension |
| Offline cache | -- | Yes | Browse wiki content without internet connection |
Sprint 1 (Weeks 1-2): Foundation
- Set up VS Code extension project with TypeScript + WebView
- Implement
extension.tswith activation on workspace open - Build
ApiClient.tswith authentication (API key) and HTTP/WS support - Build
ContextTracker.tsto detect active file and cursor position - Create basic
SidebarProvider.tswith a static "Hello World" WebView
Sprint 2 (Weeks 3-4): Wiki Panel
- Build
WikiPanel.tsxReact app for WebView (port existing Markdown renderer) - Implement file-to-wiki-page mapping (match file paths to wiki page
filePathsfield) - Add
TreeDataProvider.tsto show wiki structure in Explorer - Implement
CacheManager.tswith SQLite for offline wiki storage - Wire up context tracking -> API fetch -> sidebar update pipeline
Sprint 3 (Weeks 5-6): Intelligence Features
- Build
HoverProvider.tsfor wiki tooltips on symbol hover - Build
CodeLensProvider.tsfor inline "View Docs" links - Port Ask AI component to
AskPanel.tsxin the sidebar - Implement on-demand single-file wiki generation
- Add settings UI (API key, server URL, offline mode toggle)
Sprint 4 (Weeks 7-8): Polish & Launch
- Performance optimization (lazy loading, debounced context tracking)
- Cross-platform testing (macOS, Windows, Linux)
- Write extension README and marketplace page
- Record demo video and screenshots
- Publish to VS Code Marketplace
- Announce on Twitter/X, Hacker News, Reddit r/vscode, r/programming
Estimated effort: 2 engineers, 8 weeks Dependencies: Backend API must support single-file wiki generation endpoint (new)
New backend endpoints needed for IDE extension support:
# New endpoint: Generate wiki for a single file
POST /api/wiki/file
{
"repo_url": "https://github.com/owner/repo",
"file_path": "src/auth/login.ts",
"provider": "openai",
"model": "gpt-4o",
"token": "ghp_xxx" # optional, for private repos
}
# Returns: { "page_id": "...", "title": "...", "content": "...", "filePaths": [...] }
# New endpoint: Get wiki page by file path
GET /api/wiki/page?repo_url=...&file_path=src/auth/login.ts
# Returns: Cached wiki page content if available, 404 if not generated
# New endpoint: Get wiki structure for a repo
GET /api/wiki/structure?repo_url=...
# Returns: WikiStructure JSON (sections, pages, relationships)
# New endpoint: Symbol-level documentation
POST /api/wiki/symbol
{
"repo_url": "https://github.com/owner/repo",
"file_path": "src/auth/login.ts",
"symbol_name": "authenticateUser",
"symbol_type": "function",
"line_number": 42
}
# Returns: { "summary": "...", "parameters": [...], "related_pages": [...] }Product vision: When code changes (push, PR merge, branch update), the wiki automatically updates. Developers see "living documentation" that is always current.
Code Change Event Flow:
[Git Push] ──> [GitHub Webhook] ──> [BetterCodeWiki Webhook Receiver]
│
▼
[Diff Analyzer]
"What files changed?"
│
▼
[Impact Mapper]
"Which wiki pages are affected?"
(file_path -> wiki_page mapping)
│
▼
[Incremental Generator]
"Regenerate only affected pages"
│
┌───────────┴───────────┐
▼ ▼
[Update Wiki Cache] [Notify Subscribers]
(overwrite pages) (WebSocket push to
web app + IDE ext)
# New file: api/webhooks.py
@app.post("/webhooks/github")
async def github_webhook(request: Request):
"""Handle GitHub push/PR webhooks for auto-documentation."""
payload = await request.json()
event_type = request.headers.get("X-GitHub-Event")
if event_type == "push":
# Extract changed files from push payload
changed_files = extract_changed_files(payload)
repo_url = payload["repository"]["html_url"]
# Find affected wiki pages
affected_pages = map_files_to_wiki_pages(repo_url, changed_files)
if affected_pages:
# Queue incremental regeneration
await queue_regeneration(
repo_url=repo_url,
page_ids=affected_pages,
trigger="push",
commit_sha=payload["after"]
)
elif event_type == "pull_request" and payload["action"] == "closed" and payload["pull_request"]["merged"]:
# PR merged -- regenerate affected pages
changed_files = await get_pr_changed_files(payload)
# ... same flow as push
return {"status": "queued"}
@app.post("/webhooks/gitlab")
async def gitlab_webhook(request: Request):
"""Handle GitLab push webhooks."""
# Similar implementation for GitLab
@app.post("/webhooks/bitbucket")
async def bitbucket_webhook(request: Request):
"""Handle Bitbucket push webhooks."""
# Similar implementation for BitbucketThe key technical challenge: mapping file changes to wiki page updates without regenerating the entire wiki.
Solution: File-to-Page Index
When a wiki is first generated, build an index mapping each source file to the wiki pages that reference it (using the existing filePaths field in WikiPage):
# file_page_index example:
{
"src/auth/login.ts": ["page_auth_overview", "page_login_flow"],
"src/auth/jwt.ts": ["page_auth_overview", "page_jwt_tokens"],
"src/api/routes.ts": ["page_api_reference", "page_routing"],
# ...
}When files change, look up the index, and regenerate only those pages. The regeneration prompt includes the git diff as additional context:
You previously generated this documentation for the auth module:
[previous content]
The following files have changed since then:
[git diff for src/auth/login.ts]
Please update the documentation to reflect these changes.
Keep unchanged sections as-is. Highlight what changed.
Freshness indicators on the frontend:
// New component: FreshnessIndicator.tsx
interface FreshnessProps {
lastVerified: Date; // When the page was last checked against code
lastUpdated: Date; // When the page content last changed
commitSha: string; // The commit this page is synced to
totalCommitsBehind: number; // How many commits since last sync
}
function FreshnessIndicator({ lastVerified, totalCommitsBehind }: FreshnessProps) {
if (totalCommitsBehind === 0) {
return <Badge variant="green">Up to date</Badge>
} else if (totalCommitsBehind < 5) {
return <Badge variant="yellow">{totalCommitsBehind} commits behind</Badge>
} else {
return <Badge variant="red">May be outdated ({totalCommitsBehind} commits behind)</Badge>
}
}Estimated effort: 1 backend engineer, 4 weeks Dependencies: GitHub/GitLab/Bitbucket webhook documentation; existing wiki cache system
Product vision: Install BetterCodeWiki as a GitHub App on your repository. Every push automatically generates or updates documentation. Zero ongoing effort.
| Field | Value |
|---|---|
| App name | BetterCodeWiki |
| Description | Automatically generates and maintains AI-powered documentation for your repository |
| Permissions | contents: read, pull_requests: read, metadata: read |
| Webhook events | push, pull_request (opened, synchronized, closed) |
| Setup URL | https://app.bettercodewiki.com/github/install |
| Callback URL | https://app.bettercodewiki.com/api/github/callback |
- Auto-generate wiki on first install. When a user installs the GitHub App on a repository, trigger a full wiki generation.
- Incremental updates on push. Every push to the default branch triggers the incremental regeneration engine (Section 3.2.3).
- PR documentation preview. When a PR is opened, generate a comment showing what documentation would change if the PR is merged. Example:
## BetterCodeWiki Documentation Preview
This PR would update the following documentation pages:
| Page | Change Type | Details |
|------|-------------|---------|
| Auth Module Overview | Updated | New OAuth2 flow added |
| API Reference | Updated | 2 new endpoints documented |
| Getting Started | No change | -- |
[View full documentation preview](https://app.bettercodewiki.com/preview/owner/repo/pr/42)-
Status checks. Add a GitHub status check that shows whether documentation is up-to-date. Can be configured as required (block merge if docs are stale).
-
Badge for README. Provide a dynamic badge:
[](https://app.bettercodewiki.com/owner/repo)Week 1-2: Register GitHub App, implement OAuth flow, set up webhook receiver Week 3: Implement push webhook -> incremental regeneration pipeline Week 4: Implement PR comment preview feature Week 5: Build badge endpoint, status check integration Week 6: Testing, documentation, launch
Estimated effort: 1 full-stack engineer, 6 weeks
| Milestone | Target Date | Success Metric |
|---|---|---|
| VS Code extension v0.1 (sidebar + tree) | Month 1, Week 4 | Internal dogfooding |
| VS Code extension v1.0 (hover + ask + cache) | Month 2, Week 4 | Published to Marketplace |
| GitHub App v1.0 (auto-doc-on-push) | Month 2, Week 2 | Available on GitHub Marketplace |
| Incremental regeneration engine | Month 2, Week 4 | <30s update time for single-file change |
| VS Code extension: 1,000 installs | Month 3 | Marketplace analytics |
| GitHub App: 100 installations | Month 3 | GitHub App dashboard |
| Real-time sync latency | Month 3 | <60s from push to updated wiki |
Product vision: Organizations can create a workspace where all their repositories' wikis are unified. Team members see a shared view with permissions, annotations, and activity feeds.
-- Core workspace tables (PostgreSQL)
CREATE TABLE workspaces (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(255) NOT NULL,
slug VARCHAR(255) UNIQUE NOT NULL,
owner_id UUID REFERENCES users(id),
plan VARCHAR(50) DEFAULT 'free', -- free, pro, team, enterprise
created_at TIMESTAMP DEFAULT NOW(),
settings JSONB DEFAULT '{}'
);
CREATE TABLE workspace_members (
workspace_id UUID REFERENCES workspaces(id),
user_id UUID REFERENCES users(id),
role VARCHAR(50) DEFAULT 'viewer', -- owner, admin, editor, viewer
invited_at TIMESTAMP DEFAULT NOW(),
accepted_at TIMESTAMP,
PRIMARY KEY (workspace_id, user_id)
);
CREATE TABLE workspace_repos (
workspace_id UUID REFERENCES workspaces(id),
repo_url VARCHAR(500) NOT NULL,
repo_type VARCHAR(50), -- github, gitlab, bitbucket, local
display_name VARCHAR(255),
auto_sync BOOLEAN DEFAULT true,
last_synced_at TIMESTAMP,
wiki_structure JSONB,
PRIMARY KEY (workspace_id, repo_url)
);
CREATE TABLE annotations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID REFERENCES workspaces(id),
wiki_page_id VARCHAR(255) NOT NULL,
repo_url VARCHAR(500) NOT NULL,
author_id UUID REFERENCES users(id),
content TEXT NOT NULL,
content_type VARCHAR(50) DEFAULT 'comment', -- comment, correction, context, question
target_selector JSONB, -- { "type": "text", "start": 142, "end": 198 } or { "type": "heading", "id": "h2-auth" }
resolved BOOLEAN DEFAULT false,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE annotation_replies (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
annotation_id UUID REFERENCES annotations(id),
author_id UUID REFERENCES users(id),
content TEXT NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE activity_feed (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID REFERENCES workspaces(id),
actor_id UUID REFERENCES users(id),
action VARCHAR(100) NOT NULL, -- wiki_generated, page_updated, annotation_added, member_invited
target_type VARCHAR(50), -- wiki_page, repo, annotation, member
target_id VARCHAR(255),
metadata JSONB,
created_at TIMESTAMP DEFAULT NOW()
);+------------------------------------------------------------------+
| BetterCodeWiki - Acme Corp Workspace [Settings] [Team] |
+------------------------------------------------------------------+
| |
| Repositories (4) Activity Feed |
| +---------------------------+ +-------------------------+ |
| | [*] acme/backend | | Sarah updated Auth docs | |
| | 23 pages, synced 2m | | 5 minutes ago | |
| | ago | | | |
| | [*] acme/frontend | | Mike added annotation | |
| | 18 pages, synced 1h | | on API Routes | |
| | ago | | 1 hour ago | |
| | [*] acme/mobile-app | | | |
| | 12 pages, synced 3h | | Auto-sync: 3 pages | |
| | ago | | updated in backend | |
| | [ ] acme/infra (paused) | | 2 hours ago | |
| +---------------------------+ +-------------------------+ |
| |
| Quick Stats |
| Total pages: 53 | Annotations: 127 | Team members: 8 |
| Freshness: 94% up-to-date | Avg. generation quality: 4.2/5 |
| |
+------------------------------------------------------------------+
| Permission | Owner | Admin | Editor | Viewer |
|---|---|---|---|---|
| Manage workspace settings | Yes | Yes | -- | -- |
| Invite/remove members | Yes | Yes | -- | -- |
| Add/remove repositories | Yes | Yes | Yes | -- |
| Trigger wiki regeneration | Yes | Yes | Yes | -- |
| Add annotations/comments | Yes | Yes | Yes | -- |
| View wiki content | Yes | Yes | Yes | Yes |
| Export wiki | Yes | Yes | Yes | Yes |
| Delete workspace | Yes | -- | -- | -- |
| Manage billing | Yes | -- | -- | -- |
Product vision: Multiple team members can view and annotate the same wiki page simultaneously. Annotations, corrections, and AI-generated content merge seamlessly.
CRDT-based real-time sync using Yjs (MIT-licensed collaborative editing framework):
Architecture:
[User A: Browser] ──WebSocket──> [Yjs Server (Hocuspocus)] <──WebSocket── [User B: Browser]
│ │ │
▼ ▼ ▼
[Yjs Document] [Persistence Layer] [Yjs Document]
(local state) (PostgreSQL + S3) (local state)
Why Yjs over Operational Transform (OT):
- CRDT (Conflict-free Replicated Data Types) handles offline edits gracefully -- essential for IDE extension offline mode
- Yjs is the industry standard (used by Notion, Figma, Linear, Hocuspocus)
- MIT licensed and actively maintained
- Supports awareness (cursor positions, selection) for presence indicators
- Sub-50ms merge latency for typical edits
-
Presence indicators. See who else is viewing the same wiki page. Show avatar + cursor position for users making annotations.
-
Inline annotations. Select any text in a wiki page and add a comment. Comments appear as highlighted regions with a thread in a sidebar panel.
-
AI content + human corrections merge. When AI regenerates a page, human annotations are preserved and repositioned using text anchoring (fuzzy matching of annotation target text).
-
Suggestion mode. Team members can propose edits to AI-generated content (similar to Google Docs' "Suggest" mode). Suggestions are reviewed and accepted/rejected by editors.
-
Version history. Full history of all changes to a wiki page, with diff view. Powered by Yjs's built-in versioning.
| Dependency | Package | Purpose |
|---|---|---|
yjs |
Real-time CRDT | Collaborative document state |
@hocuspocus/server |
WebSocket server | Multi-user sync relay |
@hocuspocus/extension-database |
Persistence | Store Yjs documents in PostgreSQL |
y-prosemirror or y-tiptap |
Rich text binding | Connect Yjs to a rich text editor for annotations |
Estimated effort: 2 engineers, 8 weeks
Product vision: Documentation changes (whether AI-generated or human-written) go through a review process before being published, ensuring quality and accuracy.
[Code push triggers regeneration]
│
▼
[AI generates updated content]
│
▼
[Draft created in workspace]
- Shows diff from previous version
- Highlights AI-generated vs. human-written sections
- Tags affected team members (based on file ownership / CODEOWNERS)
│
▼
[Review phase]
- Reviewers can approve, request changes, or comment
- AI-generated content can be edited before publishing
- Minimum 1 approval required (configurable)
│
▼
[Published to live wiki]
- Previous version archived
- Changelog entry generated automatically
- Notification sent to workspace
If the repository has a CODEOWNERS file, BetterCodeWiki can automatically assign documentation reviewers based on code ownership:
# CODEOWNERS
src/auth/ @sarah @mike
src/api/ @john @sarah
src/frontend/ @emma
# When src/auth/login.ts changes and the Auth Module wiki page is regenerated,
# Sarah and Mike are automatically requested to review the documentation update.
Product vision: Documentation updates, review requests, and annotation notifications are delivered to the team's existing communication channels.
| Event | Default Channel | Message Format |
|---|---|---|
| Wiki generated for new repo | #documentation | "Wiki generated for acme/backend -- 23 pages, 12 diagrams. [View Wiki]" |
| Pages updated (auto-sync) | #documentation | "3 pages updated in acme/backend wiki after push by @sarah. [View Changes]" |
| Annotation added | #doc-reviews | "@mike left a comment on Auth Module: 'The JWT expiry should be 24h, not 1h.' [View]" |
| Review requested | DM to reviewer | "Documentation review requested: Auth Module updated. 2 sections changed. [Review]" |
| Stale documentation alert | #documentation | "Warning: API Reference for acme/backend is 15 commits behind. [Regenerate]" |
- OAuth scopes:
chat:write,incoming-webhook,commands - Slash commands:
/wiki [repo-url]-- Generate a link to the wiki for a repo/wiki-ask [repo-url] [question]-- Ask a question about a codebase, get an inline answer/wiki-status [repo-url]-- Check documentation freshness
- Interactive messages: Buttons for "View Wiki," "Approve Review," "Regenerate"
Estimated effort: 1 engineer, 3 weeks per platform (Slack, Discord, Teams)
| Milestone | Target Date | Success Metric |
|---|---|---|
| Workspace creation + member invite flow | Month 3, Week 4 | Internal testing |
| Annotations system (inline comments) | Month 4, Week 2 | Usable on production |
| Real-time collaborative viewing | Month 4, Week 4 | <100ms presence updates |
| Review workflow v1 | Month 5, Week 2 | End-to-end flow working |
| Slack integration v1 | Month 5, Week 4 | Published to Slack App Directory |
| 50 team workspaces created | Month 6 | Analytics dashboard |
| Team tier conversion rate | Month 6 | >3% of free users who invite teammates convert |
| Average annotations per workspace | Month 6 | >10 annotations per active workspace |
Product vision: Third-party developers can build plugins that extend BetterCodeWiki's capabilities -- custom analyzers, documentation templates, visualization types, export formats, and integrations.
// @bettercodewiki/plugin-sdk
interface Plugin {
id: string;
name: string;
version: string;
description: string;
author: string;
// Lifecycle hooks
onInstall?: (context: PluginContext) => Promise<void>;
onUninstall?: (context: PluginContext) => Promise<void>;
// Extension points
analyzers?: Analyzer[]; // Custom code analysis
templates?: Template[]; // Documentation templates
visualizations?: Visualization[]; // Custom diagram types
exportFormats?: ExportFormat[]; // Custom export formats
integrations?: Integration[]; // Third-party service connectors
}
interface Analyzer {
id: string;
name: string;
filePatterns: string[]; // Glob patterns for files this analyzer handles
analyze: (files: SourceFile[], context: AnalysisContext) => Promise<AnalysisResult>;
}
interface Template {
id: string;
name: string;
description: string;
category: string; // "api-docs", "architecture", "onboarding", "runbook"
generate: (wikiStructure: WikiStructure, context: TemplateContext) => Promise<string>;
}
// Example: A Terraform analyzer plugin
const terraformPlugin: Plugin = {
id: "terraform-analyzer",
name: "Terraform Infrastructure Docs",
version: "1.0.0",
description: "Generates infrastructure documentation from Terraform files",
author: "community",
analyzers: [{
id: "terraform",
name: "Terraform Analyzer",
filePatterns: ["**/*.tf", "**/*.tfvars"],
analyze: async (files, context) => {
// Parse Terraform HCL, extract resources, modules, variables
// Generate infrastructure documentation with resource dependency graph
return {
pages: [
{ title: "Infrastructure Overview", content: "..." },
{ title: "Resource Dependencies", content: "...", diagram: "..." }
]
};
}
}]
};| Plugin | Category | Description |
|---|---|---|
| OpenAPI/Swagger Docs | Analyzer | Parses OpenAPI specs and generates rich API documentation with request/response examples |
| Database Schema Docs | Analyzer | Reads migration files, Prisma/TypeORM schemas, and generates ERD + table documentation |
| Terraform/Pulumi Infra Docs | Analyzer | Generates infrastructure documentation from IaC files |
| Storybook Component Docs | Analyzer | Reads Storybook stories and generates component documentation with visual examples |
| Confluence Export | Export | Exports wiki to Confluence pages (already partially built in ExportMenu.tsx, make it a plugin) |
| Docusaurus Export | Export | Generates a full Docusaurus site from wiki content |
| Architecture Decision Records | Template | Generates ADR documents from git history and code changes |
| Runbook Generator | Template | Generates operational runbooks from CI/CD configs and deployment scripts |
Plugin Marketplace Architecture:
[Plugin Developer] ──> [Plugin Registry (npm-like)]
│
▼
[Review Queue]
(automated security scan +
manual review for featured)
│
▼
[Marketplace UI]
(browse, search, install)
│
▼
[Plugin Runtime]
(sandboxed execution in
BetterCodeWiki backend)
Security model: Plugins run in a sandboxed environment (WebAssembly or containerized) with limited permissions. Plugins cannot access the network, filesystem, or other plugins unless explicitly granted.
Revenue model: Free plugins listed for free. Premium plugins can charge, with BetterCodeWiki taking a 15-20% platform fee (similar to Shopify App Store, Figma Community).
Product vision: A fully documented REST + WebSocket API that allows any application to generate, query, and manage BetterCodeWiki documentation programmatically.
# OpenAPI 3.0 specification (summary)
paths:
# Wiki Generation
/api/v1/wikis:
post:
summary: Generate a wiki for a repository
requestBody:
content:
application/json:
schema:
type: object
properties:
repo_url: { type: string }
provider: { type: string }
model: { type: string }
options:
type: object
properties:
wiki_type: { type: string, enum: [comprehensive, quick, api-only] }
language: { type: string }
excluded_dirs: { type: array, items: { type: string } }
responses:
202:
description: Wiki generation queued
content:
application/json:
schema:
type: object
properties:
wiki_id: { type: string }
status_url: { type: string }
/api/v1/wikis/{wiki_id}:
get:
summary: Get wiki status and content
/api/v1/wikis/{wiki_id}/pages:
get:
summary: List all pages in a wiki
/api/v1/wikis/{wiki_id}/pages/{page_id}:
get:
summary: Get a specific wiki page
# Search
/api/v1/search:
get:
summary: Search across all wikis in a workspace
parameters:
- name: q
in: query
type: string
- name: workspace_id
in: query
type: string
# Ask AI
/api/v1/ask:
post:
summary: Ask a question about a repository
requestBody:
content:
application/json:
schema:
type: object
properties:
repo_url: { type: string }
question: { type: string }
context: { type: object } # optional: file path, page ID
# Webhooks (outgoing)
/api/v1/webhooks:
post:
summary: Register a webhook for wiki events
get:
summary: List registered webhooks
# Workspace Management
/api/v1/workspaces:
get:
summary: List workspaces
post:
summary: Create a workspace- CI/CD integration: Generate documentation as part of the build pipeline. If docs fail to generate, fail the build.
- Custom dashboards: Build internal dashboards that show documentation coverage and freshness across all repos.
- Chatbot integration: Build a Slack/Discord bot that answers code questions using BetterCodeWiki's API.
- Content pipeline: Export wiki content to feed into internal knowledge bases, training materials, or customer-facing docs.
- Automated testing: Write tests that verify documentation accuracy against code (e.g., "does the API docs page list all endpoints in routes.ts?").
| Tier | Rate Limit | Price |
|---|---|---|
| Free | 100 requests/day | $0 |
| Pro | 5,000 requests/day | Included in Pro ($19/mo) |
| Team | 25,000 requests/day | Included in Team ($29/seat/mo) |
| Enterprise | Unlimited | Included in Enterprise ($59/seat/mo) |
| API-only (no UI) | Custom | $49/mo base + $0.01/request over 10K |
Product vision: Enterprise customers can use their own fine-tuned models for documentation generation, or choose from a curated list of specialized code models.
Extend the existing multi-provider system (OpenAI, Google, OpenRouter, Ollama, Bedrock, etc.) with:
- Custom endpoint support. Any OpenAI-compatible API endpoint (vLLM, TGI, llama.cpp server). This is a simple extension of the existing
api/openai_client.py:
# In api/config.py, add:
CUSTOM_ENDPOINTS = os.getenv("CUSTOM_MODEL_ENDPOINTS", "").split(",")
# Format: "name:url:api_key" e.g., "internal-llama:http://internal:8080/v1:sk-xxx"-
Model recommendation engine. Based on repository characteristics (language, size, framework), recommend the best model:
- Small Python repos (<100 files): GPT-4o-mini (fast, cheap, good enough)
- Large Java enterprise repos (>1000 files): Claude 3.5 Sonnet (best at long context)
- Repos with heavy math/algorithms: Gemini 1.5 Pro (strong at technical reasoning)
- Air-gapped environments: Ollama with CodeLlama or DeepSeek Coder
-
Fine-tuning pipeline (Enterprise). Provide a service that fine-tunes an open model (Llama 3, Mistral, DeepSeek) on a customer's specific codebase and documentation style. The fine-tuned model generates docs that match the organization's voice and terminology.
Implementation: Use an SSO middleware library (e.g., passport-saml for Node.js, or a managed service like WorkOS/Auth0 Enterprise).
| Provider | Protocol | Priority |
|---|---|---|
| Okta | SAML 2.0 + OIDC | P0 |
| Azure AD | OIDC + SAML | P0 |
| Google Workspace | OIDC | P1 |
| OneLogin | SAML | P2 |
| PingIdentity | SAML | P2 |
Enterprise SSO is the single most important gating feature. Without SSO, enterprise procurement cannot approve the tool. Budget: 3-4 weeks of engineering, or $5K-15K/year for a managed SSO service (WorkOS, Clerk Enterprise).
Every significant action is logged to an append-only audit trail:
{
"timestamp": "2026-06-15T14:23:00Z",
"actor": { "id": "user_123", "email": "sarah@acme.com", "ip": "10.0.1.42" },
"action": "wiki.page.regenerated",
"target": { "type": "wiki_page", "id": "page_auth_overview", "repo": "acme/backend" },
"metadata": {
"provider": "openai",
"model": "gpt-4o",
"trigger": "auto_sync",
"commit_sha": "abc123"
},
"workspace_id": "ws_acme"
}Audit logs are:
- Stored for 1 year minimum (configurable up to 7 years for compliance)
- Exportable as CSV/JSON
- Searchable by actor, action, target, and time range
- Available via API for integration with SIEM tools (Splunk, Datadog, etc.)
| Certification | Timeline | Effort | Cost |
|---|---|---|---|
| SOC 2 Type I | Month 8 | 4-6 weeks prep | $20-50K (auditor) |
| SOC 2 Type II | Month 12 | 3-month observation period | $30-80K |
| GDPR compliance | Month 7 | 2-3 weeks | Internal |
| HIPAA (healthcare) | Month 14 | 6-8 weeks | $50-100K |
| FedRAMP (government) | Month 18+ | 12-18 months | $250K+ |
| Milestone | Target Date | Success Metric |
|---|---|---|
| Plugin SDK v1.0 published | Month 7 | npm package available |
| 5 first-party plugins | Month 8 | Published and documented |
| API v1 with documentation | Month 7 | OpenAPI spec published |
| SSO integration (Okta + Azure AD) | Month 8 | First enterprise pilot |
| Audit logs system | Month 8 | Queryable via API |
| SOC 2 Type I certification | Month 9 | Audit report available |
| Plugin marketplace UI | Month 10 | Browsable, installable |
| 10 enterprise customers | Month 12 | Signed contracts |
| $100K MRR | Month 12 | Billing system |
| API: 100 integrations built | Month 12 | API key registrations |
The "Wikipedia of Code" vision from PRODUCT_STRATEGY.md is the single most powerful network effect opportunity. Here is the concrete implementation plan:
-
Auto-generate wikis for the top 10,000 GitHub repositories (by stars). This creates a massive, searchable index of code documentation.
-
SEO optimization. Each wiki page is a static, indexable page with proper meta tags, schema.org markup, and canonical URLs. When a developer searches "how does React fiber reconciler work," BetterCodeWiki should appear on page 1.
-
Community curation layer. Allow anyone with a GitHub account to:
- Upvote/downvote AI-generated explanations
- Add corrections and context
- Submit alternative explanations
- Flag inaccurate content
-
Maintainer verification. Repository maintainers can claim their wiki and add a "Verified by Maintainer" badge. Verified wikis rank higher in search.
Network effect loop:
More public wikis -> More Google traffic -> More developers discover BCW
-> More developers use BCW for their own repos -> More content generated
-> More wikis become community-curated -> Higher quality -> More traffic
As BetterCodeWiki indexes more repositories, it can provide unique insights:
- "Used by" links. "This library is used by 347 indexed repositories. Here is how they typically configure it."
- Pattern detection. "87% of Express.js applications in our index use this middleware pattern for authentication."
- Migration guides. "23 repositories recently migrated from Webpack to Vite. Here is the common migration pattern."
- Dependency documentation. "Your project uses
jsonwebtoken@9.0.0. Here is how it works based on our wiki of the jsonwebtoken repo."
This cross-repository intelligence is a data moat that no competitor can replicate without indexing the same breadth of repositories.
Current state: BetterCodeWiki already supports multi-language wiki generation (via next-intl and the LanguageContext). The foundation is in place.
Expansion plan:
| Priority | Language | Market Size | Strategy |
|---|---|---|---|
| P0 | Japanese | 1.8M developers | Japan has the highest per-seat willingness to pay outside the US. Partner with Japanese developer communities (Qiita, Zenn). |
| P0 | Chinese (Simplified) | 10M+ developers | Largest developer population outside the US. Must navigate regulatory requirements. Partner with Gitee (Chinese GitHub). |
| P1 | Korean | 800K developers | Strong developer community, high adoption of paid dev tools. |
| P1 | German | 1.5M developers | Strong enterprise market. GDPR compliance is a selling point. |
| P1 | Portuguese (Brazilian) | 1.5M developers | Fast-growing developer market in Latin America. |
| P2 | Spanish | 2M+ developers | Large market across Spain and Latin America. |
| P2 | French | 1M developers | Enterprise market, especially in francophone Africa. |
Localization scope:
- UI language (already supported via
next-intl) - Wiki generation in target language (already supported via model prompts)
- Landing page and marketing copy
- Documentation and support
- Local payment methods (especially important for Japan and China)
The insight: Different industries have radically different documentation needs. Vertical solutions command 2-3x premium pricing.
- Regulatory documentation. Auto-generate compliance documentation mapping code to regulatory requirements (PCI-DSS, SOX, MiFID II).
- Audit trail integration. Every code change and documentation update is linked to compliance requirements.
- Risk scoring. Identify code sections that handle financial data and ensure they are thoroughly documented.
- Pricing: $99/seat/month (3x base enterprise)
- HIPAA compliance mapping. Identify code that handles PHI (Protected Health Information) and generate compliance documentation.
- Data flow documentation. Auto-generate data flow diagrams showing how patient data moves through the system.
- HL7/FHIR documentation. Specialized analyzers for healthcare interoperability standards.
- Pricing: $99/seat/month
- FedRAMP-authorized deployment. Self-hosted in GovCloud or on-premises.
- Classification-aware documentation. Support for CUI (Controlled Unclassified Information) marking.
- Air-gapped operation. Full functionality with local models (Ollama), no internet required.
- SBOM (Software Bill of Materials) generation. Auto-generate from dependency analysis.
- Pricing: $149/seat/month (contract-based)
Potential acquisition targets (if funded):
| Target | What They Have | Why Acquire | Estimated Range |
|---|---|---|---|
| Small OSS doc tool (e.g., Mintlify competitor) | User base, templates, content pipeline | Acqui-hire + user base + technology | $1-5M |
| Code visualization startup | Advanced graph visualization, possibly patented algorithms | Technology that would take 12+ months to build | $2-10M |
| Developer analytics tool | Usage data, developer behavior insights | Data moat + analytics features for enterprise | $5-15M |
| Browser extension for GitHub | Distribution (Chrome Web Store presence), user base | Instant distribution channel for BetterCodeWiki's GitHub overlay | $500K-3M |
Build vs. Buy criteria:
- Build if: Core to the product vision, unique to our architecture, <3 months to build
- Buy if: Would take >6 months to build, target has significant user base, technology is non-trivial to replicate
| Milestone | Target Date | Success Metric |
|---|---|---|
| Public wiki directory: 10,000 repos | Month 13 | Auto-generated and indexed |
| SEO traffic: 100K monthly visitors | Month 15 | Google Analytics |
| Cross-repo intelligence v1 | Month 14 | "Used by" and pattern features live |
| Japanese localization + launch | Month 13 | First Japanese enterprise customer |
| Chinese market entry | Month 15 | Gitee integration + localization |
| FinTech vertical v1 | Month 15 | First 3 FinTech customers |
| Community contributions: 1,000 edits/month | Month 18 | Platform analytics |
| $1M+ MRR | Month 18 | Billing system |
| 50+ enterprise customers | Month 18 | CRM |
LOAD BALANCER (Cloudflare / AWS ALB)
│
┌─────────────────────────┼────────────────────────────┐
│ │ │
▼ ▼ ▼
[Next.js Frontend] [FastAPI Backend] [Hocuspocus Server]
(Vercel / K8s) (K8s / ECS) (Collaboration)
- Landing page - Wiki generation - Real-time sync
- Wiki viewer - Chat/Ask AI - Presence
- Workspace UI - Search - Annotations
- Team management - Webhook processing - CRDT merge
│ │ │
│ ┌────┴─────┐ │
│ │ │ │
│ ▼ ▼ │
│ [Task Queue] [AI Providers] │
│ (Celery/Bull) - OpenAI │
│ - Wiki gen - Google Gemini │
│ - Incremental - Anthropic Claude │
│ - Webhooks - OpenRouter │
│ - Notifications - Ollama (local) │
│ │ - Azure OpenAI │
│ │ - AWS Bedrock │
│ │ │
└────────┬───────────┴──────────────────────┬──────────┘
│ │
▼ ▼
[PostgreSQL] [Redis]
- Users, workspaces - Cache
- Wiki structure + pages - Sessions
- Annotations - Rate limiting
- Audit logs - Pub/sub
- Plugin registry - Task queue
│
▼
[Object Storage (S3/R2)]
- Wiki content (large pages)
- Generated diagrams
- Plugin packages
- Export files
[Vector Database (FAISS -> Pinecone/Qdrant)]
- Code embeddings for RAG
- Cross-repo search index
Current state (from codebase analysis):
- Frontend: Next.js 15 + React 19 (deployed separately)
- Backend: FastAPI (Python) with WebSocket support (
api/api.py,api/websocket_wiki.py) - AI: Multi-provider via separate client files (
openai_client.py,bedrock_client.py, etc.) - RAG: FAISS + adalflow (
api/rag.py,api/data_pipeline.py) - Storage: File-based wiki cache (JSON files on disk)
- No database, no user accounts, no authentication
Migration steps:
| Step | Current | Target | Effort | Phase |
|---|---|---|---|---|
| 1 | File-based wiki cache | PostgreSQL + S3 | 2 weeks | 3 |
| 2 | No auth | JWT + OAuth (GitHub, Google) | 3 weeks | 3 |
| 3 | No user accounts | User table + workspace model | 2 weeks | 3 |
| 4 | Single-process backend | Task queue (Celery) for generation | 2 weeks | 3 |
| 5 | FAISS (local) | FAISS + Qdrant (for cross-repo) | 3 weeks | 5 |
| 6 | No real-time collab | Hocuspocus + Yjs | 4 weeks | 4 |
| 7 | Single-server deployment | Kubernetes + Terraform | 3 weeks | 5 |
VS Code Extension
├── src/
│ ├── extension.ts
│ │ ├── activate()
│ │ │ ├── Register SidebarProvider (WebView)
│ │ │ ├── Register HoverProvider
│ │ │ ├── Register CodeLensProvider
│ │ │ ├── Register TreeDataProvider
│ │ │ ├── Register Commands (generate, ask, refresh)
│ │ │ └── Start ContextTracker
│ │ └── deactivate()
│ │
│ ├── providers/
│ │ ├── SidebarProvider.ts
│ │ │ └── Creates a WebView panel
│ │ │ └── Loads React app from webview/dist/
│ │ │ └── Communicates via postMessage API
│ │ │
│ │ ├── HoverProvider.ts
│ │ │ └── Implements vscode.HoverProvider
│ │ │ └── On hover: look up symbol in wiki cache
│ │ │ └── Return MarkdownString with summary
│ │ │
│ │ ├── CodeLensProvider.ts
│ │ │ └── Implements vscode.CodeLensProvider
│ │ │ └── Adds "View Docs" lens above functions/classes
│ │ │ └── Click -> opens sidebar to relevant wiki section
│ │ │
│ │ └── TreeDataProvider.ts
│ │ └── Implements vscode.TreeDataProvider
│ │ └── Shows wiki structure in Explorer panel
│ │ └── Click -> opens sidebar to selected page
│ │
│ ├── services/
│ │ ├── ApiClient.ts
│ │ │ └── HTTP client for BetterCodeWiki API
│ │ │ └── WebSocket client for real-time updates
│ │ │ └── Handles auth (API key from settings)
│ │ │
│ │ ├── ContextTracker.ts
│ │ │ └── Listens to editor events
│ │ │ └── Determines current file, function, symbol
│ │ │ └── Debounced (300ms) to avoid excessive API calls
│ │ │ └── Emits events for sidebar to consume
│ │ │
│ │ └── CacheManager.ts
│ │ └── SQLite database in extension storage
│ │ └── Caches wiki pages, structure, embeddings
│ │ └── TTL-based invalidation (configurable, default 1hr)
│ │ └── Offline mode: serve from cache when disconnected
│ │
│ └── webview/
│ ├── src/
│ │ ├── App.tsx # Main WebView app
│ │ ├── WikiPanel.tsx # Wiki content renderer
│ │ ├── AskPanel.tsx # Ask AI interface
│ │ ├── DiagramPanel.tsx # Mermaid diagram viewer
│ │ └── components/ # Shared components
│ │ ├── Markdown.tsx # Ported from web app
│ │ └── Mermaid.tsx # Ported from web app
│ └── vite.config.ts # Build config for WebView
│
├── package.json # Extension manifest
│ └── contributes:
│ ├── viewsContainers # Sidebar panel registration
│ ├── views # Tree views
│ ├── commands # bettercodewiki.generate, .ask, .refresh
│ ├── configuration # Settings (API key, server URL, etc.)
│ ├── menus # Right-click context menu items
│ └── keybindings # Ctrl+Shift+W: toggle sidebar
│
└── .vscodeignore # Files to exclude from .vsix package
-- Users and Authentication
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(255),
avatar_url VARCHAR(500),
auth_provider VARCHAR(50), -- github, google, email
auth_provider_id VARCHAR(255),
api_key VARCHAR(255) UNIQUE, -- For API/extension access
plan VARCHAR(50) DEFAULT 'free',
created_at TIMESTAMP DEFAULT NOW(),
last_login_at TIMESTAMP
);
-- (workspace tables from Section 4.1.1)
-- Wiki Storage
CREATE TABLE wikis (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID REFERENCES workspaces(id),
repo_url VARCHAR(500) NOT NULL,
repo_type VARCHAR(50),
structure JSONB NOT NULL, -- WikiStructure
provider VARCHAR(50),
model VARCHAR(100),
language VARCHAR(10) DEFAULT 'en',
status VARCHAR(50) DEFAULT 'generating', -- generating, ready, error, stale
last_synced_commit VARCHAR(40),
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(workspace_id, repo_url)
);
CREATE TABLE wiki_pages (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
wiki_id UUID REFERENCES wikis(id) ON DELETE CASCADE,
page_id VARCHAR(255) NOT NULL, -- Original page ID from generation
title VARCHAR(500),
content TEXT,
file_paths TEXT[], -- Source files this page covers
importance VARCHAR(20),
related_pages TEXT[],
version INTEGER DEFAULT 1,
last_verified_commit VARCHAR(40),
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(wiki_id, page_id)
);
-- File-to-Page Index (for incremental regeneration)
CREATE TABLE file_page_index (
wiki_id UUID REFERENCES wikis(id) ON DELETE CASCADE,
file_path VARCHAR(500) NOT NULL,
page_id VARCHAR(255) NOT NULL,
PRIMARY KEY (wiki_id, file_path, page_id)
);
CREATE INDEX idx_file_page_index_file ON file_page_index(wiki_id, file_path);
-- Generation Queue
CREATE TABLE generation_jobs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
wiki_id UUID REFERENCES wikis(id),
job_type VARCHAR(50), -- full, incremental, single_file
status VARCHAR(50) DEFAULT 'queued', -- queued, processing, completed, failed
page_ids TEXT[], -- Pages to regenerate (null for full)
trigger VARCHAR(50), -- manual, push_webhook, pr_merge, schedule
commit_sha VARCHAR(40),
started_at TIMESTAMP,
completed_at TIMESTAMP,
error_message TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
-- Plugins
CREATE TABLE plugins (
id VARCHAR(255) PRIMARY KEY, -- npm-style: @author/plugin-name
name VARCHAR(255) NOT NULL,
description TEXT,
version VARCHAR(50),
author_id UUID REFERENCES users(id),
category VARCHAR(100),
install_count INTEGER DEFAULT 0,
package_url VARCHAR(500), -- S3 URL for plugin package
manifest JSONB,
published_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE workspace_plugins (
workspace_id UUID REFERENCES workspaces(id),
plugin_id VARCHAR(255) REFERENCES plugins(id),
installed_at TIMESTAMP DEFAULT NOW(),
config JSONB DEFAULT '{}',
PRIMARY KEY (workspace_id, plugin_id)
);
-- API Keys and Webhooks
CREATE TABLE api_keys (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id),
workspace_id UUID REFERENCES workspaces(id),
key_hash VARCHAR(255) NOT NULL, -- SHA256 hash of the API key
name VARCHAR(255),
scopes TEXT[], -- read, write, admin
last_used_at TIMESTAMP,
expires_at TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE webhooks_outgoing (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID REFERENCES workspaces(id),
url VARCHAR(500) NOT NULL,
events TEXT[] NOT NULL, -- wiki.generated, page.updated, annotation.created
secret VARCHAR(255), -- HMAC secret for signature verification
active BOOLEAN DEFAULT true,
created_at TIMESTAMP DEFAULT NOW()
);Launch with a simple two-tier model to start generating revenue:
| Feature | Free | Pro ($19/month) |
|---|---|---|
| Public repos | Unlimited | Unlimited |
| Private repos | 3 | Unlimited |
| AI providers | All | All |
| Ask AI | 20/day | Unlimited |
| VS Code extension | View-only | Full |
| GitHub App | -- | Yes |
| Auto-sync on push | -- | Yes |
| Export | Markdown only | All formats |
| Support | Community | Email (48h) |
Why start simple: Two tiers reduce decision paralysis for early adopters. The free tier must demonstrate clear value; the Pro tier must feel like an obvious upgrade for anyone using BetterCodeWiki daily.
Add Team tier when collaboration features ship:
| Feature | Free | Pro ($19/month) | Team ($29/seat/month) |
|---|---|---|---|
| Everything in Free/Pro | Yes | Yes | Yes |
| Team workspace | -- | -- | Yes |
| Collaborative annotations | -- | -- | Yes |
| Real-time presence | -- | -- | Yes |
| Review workflows | -- | -- | Yes |
| Slack/Discord integration | -- | -- | Yes |
| Priority support | -- | -- | Email (24h) |
| Max team size | -- | -- | 50 |
Add Enterprise tier when enterprise features ship:
| Feature | Team ($29/seat/month) | Enterprise ($59/seat/month) |
|---|---|---|
| Everything in Team | Yes | Yes |
| SSO (SAML/OIDC) | -- | Yes |
| Audit logs | -- | Yes |
| RBAC | -- | Yes |
| Custom AI model support | -- | Yes |
| On-premises deployment | -- | Yes |
| SLA (99.9% uptime) | -- | Yes |
| Dedicated support (4h) | -- | Yes |
| SOC 2 compliance | -- | Yes |
| API (unlimited) | Standard | Unlimited + webhooks |
| Plugin marketplace access | Community | Community + premium |
| Max team size | 50 | Unlimited |
Add vertical-specific pricing:
| Vertical | Base | Premium Features |
|---|---|---|
| Standard Enterprise | $59/seat/month | All enterprise features |
| FinTech | $99/seat/month | + Compliance mapping, risk scoring |
| Healthcare | $99/seat/month | + HIPAA compliance, PHI detection |
| Government | $149/seat/month | + FedRAMP, air-gapped, CUI marking |
All tiers: 20% discount for annual billing.
| Tier | Monthly | Annual (per month) | Annual Total |
|---|---|---|---|
| Pro | $19 | $15.20 | $182.40 |
| Team | $29/seat | $23.20/seat | $278.40/seat |
| Enterprise | $59/seat | $47.20/seat | $566.40/seat |
| FinTech/Healthcare | $99/seat | $79.20/seat | $950.40/seat |
| Government | $149/seat | $119.20/seat | $1,430.40/seat |
| Week | Milestone | Channel | Expected Impact |
|---|---|---|---|
| W1-2 | VS Code extension alpha (internal) | -- | Foundation |
| W3 | GitHub App beta (selected repos) | GitHub | 50 beta testers |
| W4 | VS Code extension beta (public) | VS Code Marketplace | 200 installs |
| W5-6 | "Understanding [Popular Repo] with BetterCodeWiki" blog series | Dev.to, Hashnode, HN | 5K visitors |
| W7-8 | VS Code extension v1.0 launch | HN, Reddit, Twitter/X, VS Code Marketplace | 1,000 installs |
| W9 | GitHub App v1.0 launch | Product Hunt, GitHub Marketplace | 100 installations |
| W10 | Stripe integration live, Pro tier available | In-product | First $1K MRR |
| W11-12 | Auto-doc-on-push shipping to all Pro users | Email to Pro users | 50% Pro retention |
Q1 targets: 1,000 VS Code installs, 100 GitHub App installs, 50 Pro subscribers ($950 MRR)
| Week | Milestone | Channel | Expected Impact |
|---|---|---|---|
| W13-14 | Team workspaces beta | Invite-only | 20 beta teams |
| W15-16 | Annotations system launch | In-product | 10x engagement for team users |
| W17-18 | Slack integration launch | Slack App Directory | 50 workspace installs |
| W19-20 | Team tier launch ($29/seat) | Product Hunt (again), email | 30 Team subscriptions |
| W21-22 | Public wiki directory (top 1,000 repos) | SEO, social | 10K monthly visitors |
| W23-24 | Developer conference talks (2-3 conferences) | In-person | Brand awareness |
Q2 targets: 5,000 VS Code installs, 50 team workspaces, 200 paying customers ($15K MRR)
| Week | Milestone | Channel | Expected Impact |
|---|---|---|---|
| W25-28 | SSO + Audit logs + RBAC shipped | Direct outreach | Enterprise readiness |
| W29-30 | SOC 2 Type I audit completed | Security page | Enterprise gate cleared |
| W31-32 | Plugin SDK v1.0 + 5 first-party plugins | Developer blog, npm | Ecosystem kickoff |
| W33-34 | API v1 with documentation | Developer portal | 50 API integrations |
| W35-36 | First enterprise pilot program (3-5 companies) | Direct sales | $50-100K pipeline |
Q3 targets: 10K VS Code installs, 10 enterprise pilots, $50K MRR
| Week | Milestone | Channel | Expected Impact |
|---|---|---|---|
| W37-40 | Enterprise tier launch ($59/seat) | Direct sales | First 5 enterprise deals |
| W41-42 | Plugin marketplace launch | Developer community | 10 third-party plugins |
| W43-44 | Public wiki directory (10,000 repos) | SEO | 50K monthly visitors |
| W45-48 | SOC 2 Type II observation period complete | Compliance | Enterprise gate cleared |
Q4 targets: 25K VS Code installs, 10 enterprise customers, 500 total paying customers ($100K MRR)
| Quarter | Key Milestones | Revenue Target |
|---|---|---|
| Q5 | Japanese market launch, FinTech vertical, cross-repo intelligence | $200K MRR |
| Q6 | Chinese market entry, Healthcare vertical, FedRAMP prep, acquisitions | $500K-1M MRR |
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| VS Code extension low adoption | Medium | High | Launch with compelling demo video; target Continue.dev and Copilot communities; ensure extension is genuinely useful in free tier |
| GitHub webhook reliability issues | Low | Medium | Implement webhook retry logic with exponential backoff; add manual "sync now" button as fallback |
| Incremental regeneration produces inconsistent docs | Medium | Medium | Fall back to full regeneration if incremental quality score is below threshold; A/B test incremental vs. full |
| API latency too high for IDE sidebar | Medium | High | Aggressive caching (SQLite in extension); pre-fetch wiki for all open files; lazy load diagrams |
| WebView performance in VS Code | Low | Medium | Optimize React bundle size; use virtual scrolling for long pages; profile and optimize render pipeline |
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Low team adoption (single-player tool mentality) | High | Very High | Build viral invite mechanics; show clear ROI of team features; offer extended trial for teams |
| Real-time sync conflicts | Low | High | Use Yjs CRDT (conflict-free by design); extensive testing with simultaneous editors |
| Slack/Discord integration maintenance burden | Medium | Medium | Use official SDKs; build generic notification system that adapts to any platform |
| CODEOWNERS integration breaks on edge cases | Medium | Low | Make CODEOWNERS integration optional; fall back to manual reviewer assignment |
| Annotation anchoring breaks when AI regenerates content | High | Medium | Use fuzzy text matching for anchor repositioning; notify users when anchors move; allow manual re-anchoring |
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Plugin ecosystem does not attract developers | Medium | High | Build compelling first-party plugins to demonstrate value; offer bounties for community plugins; make SDK extremely easy to use |
| SSO implementation delays enterprise deals | Medium | Very High | Use managed SSO service (WorkOS) to ship faster; do not build from scratch |
| SOC 2 certification takes longer than expected | Medium | High | Start the process in Month 6 (before Phase 5 officially begins); hire a compliance consultant |
| API abuse (scraping, excessive generation) | Medium | Medium | Implement rate limiting (already have Redis planned); require API keys; monitor usage patterns |
| Custom model fine-tuning quality is poor | High | Medium | Set clear expectations; offer "model assessment" service; provide quality benchmarks; make fine-tuning optional, not required |
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Google CodeWiki ships IDE extension and team features | High | Very High | Maintain speed advantage; compete on openness, privacy, and multi-provider; focus on segments Google neglects (GitLab/Bitbucket, self-hosted, regulated industries) |
| GitHub Copilot adds documentation features | High | High | Differentiate on depth (full wikis vs. inline docs); offer cross-platform support; emphasize living documentation |
| International expansion slows due to localization complexity | Medium | Medium | Start with Japanese (high value) and Chinese (high volume); use AI for translation; hire local community managers |
| Vertical solutions require deep domain expertise | High | Medium | Partner with domain experts rather than building in-house; hire advisors from FinTech/Healthcare |
| Public wiki directory attracts legal/copyright concerns | Low | High | Only process public repositories; respect robots.txt; provide opt-out mechanism for repository owners; consult with IP lawyer |
| Acquisition targets are overpriced or poorly integrated | Medium | Medium | Acqui-hire over technology acquisition; ensure cultural fit; start with small deals (<$3M) to build M&A muscle |
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| AI model costs exceed revenue | Medium | Very High | Implement tiered generation (cheaper models for free tier, premium for paid); cache aggressively; negotiate volume discounts with AI providers |
| Team burnout (small team, ambitious roadmap) | High | Very High | Hire aggressively in Q2; phase delivery realistically; cut scope if needed (better to ship 80% on time than 100% late) |
| Funding gap before enterprise revenue materializes | Medium | Very High | Bootstrap with Pro/Team revenue; seek seed funding ($1-3M) during Q2 to fund enterprise push; keep burn rate low |
| Security breach exposes customer code | Low | Critical | Security-first architecture (encryption at rest + in transit); regular penetration testing; bug bounty program; SOC 2 certification |
Why first: Every subsequent feature (payments, workspaces, API keys, extension auth) depends on user accounts. This is the foundation.
Specification:
- OAuth 2.0 via GitHub (primary) and Google (secondary)
- JWT-based session management
- API key generation for programmatic access (VS Code extension, API)
- User profile storage (PostgreSQL)
- Session management (Redis)
Technical implementation:
// New file: src/app/api/auth/github/route.ts
// GitHub OAuth flow:
// 1. Redirect to GitHub authorization URL
// 2. Handle callback with authorization code
// 3. Exchange code for access token
// 4. Fetch user profile from GitHub API
// 5. Create/update user in database
// 6. Issue JWT session token
// 7. Redirect to app with session cookie
// New middleware: src/middleware.ts
// Protect routes that require authentication
// Check JWT validity on every request to /api/* (except /api/auth/*)Database migration: Create users table (see Section 7.4)
Estimated effort: 1 engineer, 2 weeks
Why second: The file-based wiki cache cannot support multi-user, multi-workspace, or incremental updates. PostgreSQL is required for all Phase 3+ features.
Specification:
- Migrate from JSON files on disk to PostgreSQL
- Create tables:
users,wikis,wiki_pages,file_page_index,generation_jobs - Implement data migration script for existing cached wikis
- Set up connection pooling (pg-pool)
- Add database health check endpoint
Migration strategy:
- Deploy PostgreSQL alongside existing file cache
- Write migration script that reads existing JSON caches and inserts into PostgreSQL
- Update
api/api.pyto read/write from PostgreSQL instead of files - Keep file cache as fallback for 2 weeks, then remove
Estimated effort: 1 backend engineer, 3 weeks
Why third: The extension is the highest-impact differentiator. Starting the skeleton early allows parallel development while auth and database are being built.
Specification (MVP):
- Extension activates when a workspace contains a Git repository
- Sidebar panel shows "Connect to BetterCodeWiki" (authentication flow)
- Once connected, shows wiki tree structure for the current repo
- Clicking a tree item opens the wiki page in the sidebar (WebView with Markdown rendering)
- Basic context tracking: when the user switches files, highlight the relevant wiki section
What is NOT in the MVP:
- Hover documentation (Phase 3, Sprint 3)
- CodeLens (Phase 3, Sprint 3)
- Ask AI (Phase 3, Sprint 3)
- Offline cache (Phase 3, Sprint 4)
Estimated effort: 1 frontend engineer, 3 weeks
Why fourth: The VS Code extension needs to generate documentation for individual files on-demand, not just serve pre-generated full wikis. This is a new backend capability.
Specification:
- New endpoint:
POST /api/wiki/file(see Section 3.1.4) - Takes a repo URL + file path + provider/model
- Generates documentation for just that file (and its immediate dependencies)
- Returns structured wiki page content
- Caches the result in PostgreSQL for subsequent requests
- 5-15 second response time target
Implementation approach:
- Extract the file content and its imports/dependencies
- Build a focused prompt: "Generate documentation for this file in the context of these related files"
- Use the existing AI client infrastructure (
openai_client.py, etc.) - Cache the result with the file's git hash for invalidation
Estimated effort: 1 backend engineer, 2 weeks
Why fifth: Revenue must start flowing as soon as possible. The Pro tier ($19/month) should be available when the VS Code extension launches.
Specification:
- Stripe Checkout for subscription creation
- Stripe Billing Portal for subscription management (upgrade, downgrade, cancel)
- Webhook handler for subscription events (created, updated, cancelled, payment failed)
- Plan enforcement: check user's plan before allowing premium features
- Usage tracking: count private repos, Ask AI questions, API calls per day
Implementation:
// New file: src/app/api/billing/checkout/route.ts
// Creates a Stripe Checkout session for the selected plan
// New file: src/app/api/billing/webhook/route.ts
// Handles Stripe webhook events
// New file: src/app/api/billing/portal/route.ts
// Creates a Stripe Billing Portal session
// New component: src/components/PricingPage.tsx
// Displays pricing tiers with "Subscribe" buttons
// New middleware addition:
// Check user's plan and enforce limits (private repo count, daily question limit)Stripe products to create:
bcw_pro_monthly-- $19/monthbcw_pro_annual-- $182.40/year ($15.20/month)
Estimated effort: 1 full-stack engineer, 2 weeks
| Decision | Choice | Rationale |
|---|---|---|
| Database | PostgreSQL | Best fit for relational data (users, workspaces, RBAC) + JSONB for wiki content; mature, well-supported |
| Cache | Redis | Session management, rate limiting, pub/sub for real-time; industry standard |
| Task Queue | Celery (Python) | Natural fit with existing FastAPI backend; supports scheduled tasks, retries, priorities |
| Real-time Collaboration | Yjs + Hocuspocus | CRDT-based (conflict-free), MIT licensed, used by Notion/Figma, excellent React bindings |
| Vector Database | Keep FAISS (Phase 3-4), migrate to Qdrant (Phase 5) | FAISS works for single-repo; Qdrant needed for cross-repo search at scale |
| Object Storage | AWS S3 or Cloudflare R2 | R2 preferred (no egress fees, S3-compatible API) |
| Hosting | Vercel (frontend) + AWS/Railway (backend) initially; Kubernetes at scale | Minimize infrastructure complexity early; K8s when >10 enterprise customers |
| SSO | WorkOS or Clerk Enterprise | Build vs. buy: SSO is complex and error-prone; managed service ships faster |
| Payments | Stripe | Industry standard for SaaS billing; excellent developer experience |
| VS Code Extension Framework | Standard VS Code Extension API + React (WebView) | Maximum control + code reuse from web app |
| Analytics | PostHog (self-hosted) | Open-source, self-hostable (important for privacy positioning), feature flags |
| Role | Phase | Priority | Rationale |
|---|---|---|---|
| Senior Backend Engineer (Python/FastAPI) | 3 | P0 | Build auth, database migration, webhook system, API |
| Senior Frontend Engineer (React/TypeScript) | 3 | P0 | Build VS Code extension, workspace UI |
| DevOps/Infrastructure Engineer | 3-4 | P1 | Set up PostgreSQL, Redis, CI/CD, staging environments |
| Product Designer | 4 | P1 | Design collaboration UI, workspace experience, extension UX |
| Developer Advocate | 4 | P1 | Content marketing, conference talks, community management |
| Enterprise Sales Lead | 5 | P0 | Drive enterprise pipeline, manage pilots, close deals |
| Security Engineer | 5 | P1 | SOC 2 preparation, security audits, pen testing |
| Additional Backend Engineers (2) | 5-6 | P1 | Plugin system, API platform, vertical features |
| Additional Frontend Engineers (2) | 5-6 | P1 | Marketplace UI, enterprise admin dashboard |
Team size trajectory:
- Phase 3: 3-4 people (founders + 1-2 hires)
- Phase 4: 5-7 people
- Phase 5: 8-12 people
- Phase 6: 15-20 people
- AI model costs will continue to decrease. Current trend: 2-5x cost reduction per year for equivalent capability. This assumption underpins the free tier economics.
- Developer willingness to pay for AI tools will increase. The success of Cursor ($100M+ ARR in <2 years) suggests this market is real and growing.
- GitHub will not build a comprehensive documentation tool. GitHub's focus is on Copilot (code generation) and Actions (automation). Documentation remains a peripheral feature for them.
- Google CodeWiki will remain closed-source and Gemini-only. Google's organizational incentives prevent them from supporting competing AI providers.
- Enterprise sales cycles are 3-6 months. SOC 2 and SSO are hard gates. Pipeline must start 6 months before target revenue dates.
- The open-source community will contribute. The existing contributor base and MIT license create favorable conditions for community growth, especially if the product gains traction.
This plan should be reviewed monthly and adjusted based on actual metrics, market feedback, and resource availability. Phases are sequential in theme but may overlap in execution. The first 5 priorities in Section 11 are the critical path -- nothing else matters until these are shipping.