aydiler/chrome-mcp

Chrome MCP Server

MCP (Model Context Protocol) server for Chrome browser automation with accessibility tree support. Enables AI assistants to interact with web applications using semantic element selection instead of screen coordinates.

Features

  • Accessibility Tree Navigation - Uses Chrome's accessibility tree (like Playwright MCP)
  • Element References - Interact with elements via refs like [ref=e1] (same as egui/Tauri MCPs)
  • Virtual Display Support - Works with Xvfb for headless CI testing
  • Full Browser Control - Navigate, click, type, select, evaluate JavaScript
  • Consistent API - Matches egui-mcp and tauri-mcp interfaces

Architecture

Claude Code ──stdio──▶ MCP Server ──Puppeteer──▶ Chrome Browser
                       (TypeScript)      CDP         (Headless/GUI)

Key Design:

  • Uses Puppeteer Core with Chrome DevTools Protocol (CDP)
  • Accessibility tree extraction via Accessibility.getFullAXTree
  • Element references (e1, e2, etc.) stored as data-mcp-ref attributes
  • Singleton browser instance pattern
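The singleton browser pattern listed above could look roughly like the following. This is a minimal sketch, not the project's actual code: `BrowserLike`, `createBrowserSingleton`, and the injected `launch` function are hypothetical names, and a real server would presumably pass a Puppeteer launch or connect call where the stub is.

```typescript
// Hypothetical minimal shape of a browser handle; the real server would
// hold a Puppeteer Browser object here.
interface BrowserLike {
  close(): Promise<void>;
}

// Wraps a launch function so that all callers share one browser instance.
function createBrowserSingleton<T extends BrowserLike>(launch: () => Promise<T>) {
  let instance: Promise<T> | null = null;

  return {
    // Launches on the first call; later calls reuse the same promise,
    // even while the launch is still pending.
    get(): Promise<T> {
      if (instance === null) {
        instance = launch().catch((err) => {
          instance = null; // allow a retry after a failed launch
          throw err;
        });
      }
      return instance;
    },

    // Closes the browser if one exists and resets the singleton.
    async dispose(): Promise<void> {
      const current = instance;
      instance = null;
      if (current !== null) {
        const browser = await current;
        await browser.close();
      }
    },
  };
}
```

Caching the promise rather than the resolved browser means two tool calls arriving at the same time do not start two Chrome processes.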

Installation

npm install
npm run build

Prerequisites:

  • Node.js >= 18.0.0
  • Chrome, Chromium, or Brave browser installed

Supported platforms:

  • Linux (tested on Arch/KDE Plasma)
  • macOS
  • Windows

Available Tools

Session Management

  • chrome_connect - Connect to Chrome browser
  • chrome_launch - Launch Chrome with options
  • chrome_disconnect - Disconnect and cleanup
  • chrome_status - Get connection status

Navigation

  • chrome_navigate - Navigate to URL
  • chrome_reload - Reload page
  • chrome_back - Go back
  • chrome_forward - Go forward

Inspection

  • chrome_snapshot - Get accessibility tree with element refs
  • chrome_get_text - Get element text content
  • chrome_get_value - Get input element value
  • chrome_get_attribute - Get element attribute

Interaction

  • chrome_click - Click element by ref
  • chrome_type - Type text into input
  • chrome_fill - Set value directly (sliders, etc.)
  • chrome_select - Select dropdown option
  • chrome_hover - Hover over element

Evaluation

  • chrome_evaluate - Execute JavaScript in page context

Usage Examples

Basic Workflow

// 1. Launch browser
chrome_launch({
  url: 'https://example.com',
  headless: true
})

// 2. Get accessibility snapshot
chrome_snapshot()
// Returns:
// - button "Sign In" [ref=e1]
// - textbox "Email" [ref=e2]
// - textbox "Password" [ref=e3]

// 3. Interact by ref
chrome_type({ ref: 'e2', text: 'user@example.com' })
chrome_type({ ref: 'e3', text: 'password123' })
chrome_click({ ref: 'e1' })

// 4. Cleanup
chrome_disconnect()
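Because the snapshot is plain text, a client that wants to pick a ref by role and accessible name has to parse it. The sketch below assumes every element line has the form shown in the example above (`- role "name" [ref=eN]`); the real output may contain other line shapes, and `parseSnapshot` and `findRef` are hypothetical helper names.

```typescript
// One parsed line of snapshot output, e.g. `- button "Sign In" [ref=e1]`.
interface SnapshotEntry {
  role: string;
  name: string;
  ref: string;
}

// Parses snapshot text, assuming each element line looks like:
//   - role "name" [ref=eN]
// Lines that do not match this shape are skipped.
function parseSnapshot(text: string): SnapshotEntry[] {
  const linePattern = /^\s*-\s+(\S+)\s+"([^"]*)"\s+\[ref=(e\d+)\]\s*$/;
  const entries: SnapshotEntry[] = [];
  for (const line of text.split("\n")) {
    const match = linePattern.exec(line);
    if (match) {
      entries.push({ role: match[1], name: match[2], ref: match[3] });
    }
  }
  return entries;
}

// Returns the ref of the first element with the given role and name, if any.
function findRef(entries: SnapshotEntry[], role: string, name: string): string | undefined {
  return entries.find((e) => e.role === role && e.name === name)?.ref;
}
```

The returned ref would then be passed to a tool such as chrome_type or chrome_click.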

Virtual Display Testing (Xvfb)

For isolated E2E testing without interfering with your desktop:

# 1. Start Xvfb (once per session)
Xvfb :99 -screen 0 1920x1080x24 &

// 2. Launch browser on virtual display
chrome_launch({
  url: 'https://example.com',
  headless: false,  // Run with GUI on virtual display
  display: ':99'     // Use Xvfb display
})

// 3. Interact normally
chrome_snapshot()
chrome_click({ ref: 'e1' })

// 4. Take screenshot from virtual display (if you have screenshot-mcp)
screenshot_window({ pattern: 'Chrome', display: ':99' })

// 5. Cleanup
chrome_disconnect()

Benefits of virtual display:

  • Tests don't steal focus or interfere with desktop
  • Consistent window positioning and size
  • Can run in headless CI environments
  • Screenshots are isolated from real screen content

Testing Web Applications

// Launch your local dev server
chrome_launch({
  url: 'http://localhost:3000',
  headless: false,
  width: 1920,
  height: 1080
})

// Interact with UI
chrome_snapshot()
chrome_click({ ref: 'e5' })  // Click button
chrome_fill({ ref: 'e7', value: '50' })  // Set slider to 50

// Evaluate JavaScript
chrome_evaluate({ script: 'document.title' })
chrome_evaluate({ script: 'localStorage.getItem("user")' })

// Navigate
chrome_navigate({ url: 'http://localhost:3000/dashboard' })
chrome_back()

chrome_disconnect()

Form Automation

chrome_launch({ url: 'https://example.com/form' })

chrome_snapshot()
// - textbox "Name" [ref=e1]
// - combobox "Country" [ref=e2]
// - checkbox "Subscribe" [ref=e3]
// - button "Submit" [ref=e4]

chrome_type({ ref: 'e1', text: 'John Doe' })
chrome_select({ ref: 'e2', value: 'United States' })
chrome_click({ ref: 'e3' })  // Check checkbox
chrome_click({ ref: 'e4' })  // Submit

chrome_disconnect()

MCP Configuration

Add to your .mcp.json or Claude Desktop config:

Development (with tsx):

{
  "mcpServers": {
    "chrome-webdriver": {
      "command": "npx",
      "args": ["tsx", "src/index.ts"],
      "cwd": "/absolute/path/to/chrome-mcp"
    }
  }
}

Production (compiled):

{
  "mcpServers": {
    "chrome-webdriver": {
      "command": "node",
      "args": ["dist/index.js"],
      "cwd": "/absolute/path/to/chrome-mcp"
    }
  }
}

Global installation:

npm install -g .

{
  "mcpServers": {
    "chrome-webdriver": {
      "command": "mcp-chrome-webdriver"
    }
  }
}

Comparison with Other MCP Servers

Feature chrome-mcp Playwright MCP Puppeteer MCP Tauri MCP
Accessibility Tree
Element Refs [ref=e1] [ref=e1]
Virtual Display ✅ Xvfb ⚠️ Limited ✅ Xvfb
Browser Support Chrome only Multi-browser Chrome only Tauri only
Consistent API ✅ Matches egui/Tauri

Why use chrome-mcp?

  • Unified interface with egui-mcp and tauri-mcp
  • Lightweight (Puppeteer Core)
  • Virtual display integration out of the box
  • Chrome-specific optimizations

Development

# Install dependencies
npm install

# Run in development mode
npm run dev

# Build
npm run build

# Type check
npm run typecheck

How It Works

Accessibility Tree Extraction

  1. MCP server connects to Chrome via Puppeteer
  2. Uses Chrome DevTools Protocol Accessibility.getFullAXTree API
  3. Injects data-mcp-ref attributes into interactive elements
  4. Returns text-based tree with element references
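The last two steps can be illustrated with a small tree walk. This is a sketch under stated assumptions rather than the server's implementation: `AxNode` is a simplified stand-in for the much richer CDP node type, the set of interactive roles is a guess, and the sketch keeps refs in a map instead of injecting attributes into a live page.

```typescript
// Simplified accessibility node; the real CDP AXNode carries many more fields.
interface AxNode {
  role: string;
  name: string;
  children?: AxNode[];
}

// Roles treated as interactive in this sketch (an assumption, not the
// server's actual list).
const INTERACTIVE_ROLES = new Set([
  "button", "textbox", "combobox", "checkbox", "link", "slider",
]);

// Walks the tree depth-first and renders one indented line per node,
// assigning e1, e2, ... to interactive nodes in document order.
function renderTree(root: AxNode): { text: string; refs: Map<string, AxNode> } {
  const lines: string[] = [];
  const refs = new Map<string, AxNode>();

  const visit = (node: AxNode, depth: number): void => {
    let line = `${"  ".repeat(depth)}- ${node.role} "${node.name}"`;
    if (INTERACTIVE_ROLES.has(node.role)) {
      const ref = `e${refs.size + 1}`;
      refs.set(ref, node);
      line += ` [ref=${ref}]`;
    }
    lines.push(line);
    for (const child of node.children ?? []) {
      visit(child, depth + 1);
    }
  };

  visit(root, 0);
  return { text: lines.join("\n"), refs };
}
```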

Element Selection

  1. chrome_snapshot assigns refs (e1, e2, ...) to elements
  2. Refs stored as data-mcp-ref attribute on DOM elements
  3. Interaction tools find elements using CSS selector [data-mcp-ref="e1"]
  4. Fallback to __mcpRef property search if attribute not found
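The lookup order in steps 3 and 4 can be sketched without a browser. `ElementLike`, `refSelector`, and `findByRef` are hypothetical names; in a real page the first lookup would presumably be a `document.querySelector` call with the selector built here, and the array stands in for the DOM.

```typescript
// Minimal stand-in for a DOM element; only the pieces this sketch needs.
interface ElementLike {
  attributes: Record<string, string>;
  __mcpRef?: string;
}

// Builds the attribute selector for a ref, rejecting anything that is not
// "e" followed by digits so that untrusted input cannot alter the selector.
function refSelector(ref: string): string {
  if (!/^e\d+$/.test(ref)) {
    throw new Error(`Invalid element ref: ${ref}`);
  }
  return `[data-mcp-ref="${ref}"]`;
}

// Looks up by data-mcp-ref attribute first, then falls back to the
// __mcpRef property when no element carries the attribute.
function findByRef<T extends ElementLike>(elements: T[], ref: string): T | undefined {
  refSelector(ref); // validates the ref format before searching
  return (
    elements.find((el) => el.attributes["data-mcp-ref"] === ref) ??
    elements.find((el) => el.__mcpRef === ref)
  );
}
```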

Virtual Display Detection

When the display parameter is provided (e.g., :99):

  • Sets DISPLAY environment variable for Chrome process
  • Chrome automatically uses the specified X11 display
  • Works on Wayland systems (unlike native apps that need WINIT_UNIX_BACKEND=x11)
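Setting DISPLAY for the Chrome process amounts to building a modified copy of the environment. The sketch below shows one way to do that; `buildChromeEnv` is a hypothetical helper, and the format check is an added assumption that the source does not describe.

```typescript
// Returns a copy of the environment with DISPLAY set when a display is
// requested; the input object is never mutated.
function buildChromeEnv(
  baseEnv: Record<string, string | undefined>,
  display?: string,
): Record<string, string | undefined> {
  if (display === undefined) {
    return { ...baseEnv };
  }
  // X11 display names look like ":99" or "host:0.1"; this check is
  // deliberately loose and only rejects obviously malformed values.
  if (!/^[\w.-]*:\d+(\.\d+)?$/.test(display)) {
    throw new Error(`Invalid X11 display: ${display}`);
  }
  return { ...baseEnv, DISPLAY: display };
}
```

The result could be handed to whatever spawns Chrome, for example as the environment option of a Puppeteer launch call, so the parent process keeps its own DISPLAY.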

Testing

Test the server manually:

# List available tools
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | npx tsx src/index.ts

# Call a tool (each command starts its own server process and talks to it over stdio)
echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"chrome_status","arguments":{}}}' | npx tsx src/index.ts
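For scripted tests it can be easier to build these JSON-RPC messages in code than to write them by hand. The helpers below are a sketch that reproduces the two requests shown above; the function names are hypothetical, and the sketch only builds the messages and does not spawn the server.

```typescript
// Shape of a JSON-RPC 2.0 request as used in the commands above.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Builds a tools/list request.
function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Builds a tools/call request for the named tool.
function callToolRequest(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Serializes a request as one newline-terminated line for the server's stdin.
function toLine(request: JsonRpcRequest): string {
  return JSON.stringify(request) + "\n";
}
```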

Troubleshooting

Chrome not found:

  • Install Chrome, Chromium, or Brave browser
  • The server checks common installation paths automatically

Virtual display not working:

  • Ensure Xvfb is running: ps aux | grep Xvfb
  • Verify DISPLAY: DISPLAY=:99 xdpyinfo | head
  • Use headless: false so the browser window is rendered on the virtual display

Element refs not found:

  • Always call chrome_snapshot before interacting with elements
  • Refs are regenerated on each snapshot
  • Page navigation clears refs (call snapshot again)
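A client can guard against stale refs by tracking which refs belong to the latest snapshot. `RefTracker` below is a hypothetical client-side helper, not part of the server. It assumes reload and history navigation invalidate refs in the same way as chrome_navigate.

```typescript
// Tracks which refs are valid for the current snapshot.
class RefTracker {
  private validRefs = new Set<string>();

  // Call with the refs from the latest chrome_snapshot; older refs are dropped.
  onSnapshot(refs: Iterable<string>): void {
    this.validRefs = new Set(refs);
  }

  // Call after chrome_navigate, and (by assumption) after chrome_reload,
  // chrome_back, or chrome_forward.
  onNavigation(): void {
    this.validRefs.clear();
  }

  // Throws a descriptive error instead of letting a stale ref reach the server.
  assertUsable(ref: string): void {
    if (!this.validRefs.has(ref)) {
      throw new Error(`Ref ${ref} is stale or unknown; call chrome_snapshot again`);
    }
  }
}
```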

License

MIT
