BrowserBird

A self-hosted AI assistant in Slack, with a real browser and a scheduler.

BrowserBird gives an AI agent a place to work: a seat in your Slack channels, a schedule, keys and instructions scoped per channel, and a real Chromium browser you can watch live over VNC. Chat with it in Slack threads or from the web dashboard, trigger it with slash commands or the CLI, and set up automations that run on a cron and post results where your team already works. The browser keeps logins and cookies across runs, so it handles sites that require authentication.

BrowserBird is the harness, not the agent. The agent CLI supplies reasoning, memory, tools, and sub-agents. BrowserBird decides where the agent runs, when it runs, what it can access, and where the results land.

Use Cases

Most of these come from live deployments.

| Use Case | What it does |
| --- | --- |
| Social media | Drafts X and LinkedIn posts in your style (text, images, video, carousels) from the browser, then publishes or schedules them. |
| Database analytics | Runs in a private network with read-only database access. Non-technical teammates ask questions in Slack and get answers with tables and trends. |
| Cloud and DevOps | A weekly automation compares AWS spend to the previous week through a read-only IAM user and flags deviations. The same pattern covers infra, logs, and metrics troubleshooting. |
| Code review | Bind a GitHub token and your style guides to a channel, then ask for pull request reviews on demand. |
| News digest | Browses your sources every morning and posts a summary to the channel. |
| Job and listing tracking | Watches job boards or marketplaces for listings matching your criteria and tracks application progress. Job hunting for individuals, head hunting for teams. |

These are starting points. Every automation has a full AI agent behind it that can browse the web, run shell commands, write and analyze code, call APIs, use MCP servers, and work with any CLI tool installed in the environment.

Installation

On first run, open the web UI and complete the onboarding wizard. It walks through agent config, API keys, and optional integrations (Slack, browser).

Docker

curl -fsSL https://raw.githubusercontent.com/Owloops/browserbird/main/compose.yml -o compose.yml
docker compose up -d

The image bundles the agent CLI, Chromium, VNC, and Playwright MCP. Open http://<host>:18800 to begin onboarding. The browser keeps logins across sessions and runs one agent at a time by default. Set BROWSER_MODE=isolated in .env for parallel sessions with fresh contexts (restart required).

Railway

Deploy on Railway

Two services are deployed: browserbird-app (web dashboard, API, Slack) and browserbird-vm (Chromium browser, VNC). Open the app service URL for the dashboard. Deploy both services in the same region, closest to you, since a distant region makes VNC latency too high for interactive browser tasks. The app service volume at /app/.browserbird persists your database and config across redeployments, and enabling automatic deployments in Railway service settings keeps the version current.

AWS

Launch Stack

Deploys both containers as a sidecar pair on ECS Fargate with an ALB and EFS for persistent storage. Select your VPC and subnets, create the stack (CloudFormation asks you to acknowledge CAPABILITY_NAMED_IAM), then open the dashboard URL to complete onboarding.

HTTPS, database access, and DNS
  • HTTPS: provide an ACM certificate ARN at stack creation to enable TLS on the ALB. After the stack is created, add a DNS record (CNAME or alias) pointing to the ALB DNS name from the stack outputs, so the domain matches your certificate.
  • Database access: provide an RDS security group ID and the template adds an ingress rule so BrowserBird can reach your database. Store database credentials in BrowserBird's vault keys after deployment.

Slack

Slack is the primary interface: conversational threads, slash commands, and the channel-scoped bindings that give each project its own keys, skills, and automations. Create the app from the pre-filled manifest:

Create Slack App

The manifest pre-configures all scopes, events, and slash commands. After creating the app, install it to your workspace and grab two tokens: the Bot User OAuth Token (xoxb-...) from OAuth & Permissions, and an app-level token (xapp-...) with connections:write scope from Basic Information.

Slash Commands

Once the app is installed, /automation is available in any channel:

/automation list              Show all configured automations
/automation run <name> [args] Trigger an automation (replaces $ARGUMENTS in the prompt)
/automation stop <name>       Stop a running automation
/automation logs <name>       Show recent runs
/automation enable <name>     Enable an automation
/automation disable <name>    Disable an automation
/automation create            Create a new automation (opens modal form)
/automation status            Show daemon status
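The argument substitution performed by /automation run can be pictured as a plain string replacement. The following is a minimal sketch, assuming every literal $ARGUMENTS in the prompt is replaced with the raw argument text; renderPrompt is a hypothetical helper, not BrowserBird's actual implementation:

```typescript
// Hypothetical helper: fills the $ARGUMENTS placeholder in an automation prompt.
// Assumption: every literal occurrence is replaced, and missing args become "".
function renderPrompt(template: string, args: string = ""): string {
  // split/join sidesteps the special meaning of "$" in String.replace patterns.
  return template.split("$ARGUMENTS").join(args.trim());
}

const template = "Summarize open pull requests for $ARGUMENTS and post the result.";
console.log(renderPrompt(template, "owloops/browserbird"));
```

Under that assumption, /automation run pr-summary owloops/browserbird would hand the agent the prompt with the repository name filled in (pr-summary is an illustrative automation name).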

To stop an active response in a thread, send stop as a message (or @BrowserBird stop in channels). The agent process is killed and a reaction is added to confirm.

Tip

If /automation fails or routes to the wrong app, you may have another Slack app in the workspace with the same slash command. Remove or rename the duplicate from api.slack.com/apps.

Skills

Store markdown documents in .browserbird/docs/ that get injected into the agent's system prompt at spawn time. Use them for tone guides, project context, channel-specific instructions, or any reusable prompt content.

  • File-backed. Each skill is a .md file you can edit with any text editor. Drop a file in the directory and it gets auto-discovered.
  • Scoped with bindings. Bind a skill to specific channels via the web UI or CLI. Use * as the channel to apply everywhere. Unbound skills are not injected (same semantics as vault keys).
  • Managed from the web UI or CLI. Create, edit, and manage bindings from the Skills page, or use browserbird skills from the terminal.
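The binding rules above (a specific channel, * for everywhere, nothing when unbound) could be modeled roughly as follows. The Binding shape and resolveBindings are hypothetical names chosen for illustration; BrowserBird's real storage format is not described in this README:

```typescript
// Hypothetical shape; the real storage format is not shown in this README.
interface Binding {
  name: string;       // skill file name (the same idea applies to vault keys)
  channels: string[]; // channel IDs, or "*" for every channel
}

// Returns the names that apply to a channel. Items with no bindings never match.
function resolveBindings(bindings: Binding[], channel: string): string[] {
  return bindings
    .filter((b) => b.channels.some((c) => c === "*" || c === channel))
    .map((b) => b.name);
}

const bindings: Binding[] = [
  { name: "tone-guide.md", channels: ["*"] },
  { name: "release-notes.md", channels: ["C0RELEASE"] },
  { name: "draft.md", channels: [] }, // unbound: not injected anywhere
];
```

Here resolveBindings(bindings, "C0RELEASE") yields both bound skills, while any other channel receives only tone-guide.md.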

Vault Keys

Store API keys and secrets in the web UI (Resources, Keys tab) and bind them to specific channels. At spawn time, bound keys are injected as environment variables into the agent subprocess.

  • Encrypted at rest with AES-256-GCM. The encryption key is auto-generated on first start and stored in .env as BROWSERBIRD_VAULT_KEY.
  • Redacted from output. If the agent prints a vault key value, it appears as [redacted] in Slack and logs.
  • Bound to channels. A key bound to channel * applies to all channels. A key bound to a specific channel applies only there.
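Output redaction can be thought of as a literal search-and-replace over everything the agent prints. This is a sketch only, assuming exact-match replacement of each stored value; redactSecrets is a hypothetical helper and the real redaction logic may differ:

```typescript
// Hypothetical helper: masks vault key values before text reaches Slack or logs.
function redactSecrets(text: string, secrets: string[]): string {
  // Longest values first, so a secret that contains another is masked whole.
  const ordered = secrets.slice().sort((a, b) => b.length - a.length);
  let out = text;
  for (const secret of ordered) {
    if (secret.length === 0) continue; // an empty value would match everywhere
    out = out.split(secret).join("[redacted]");
  }
  return out;
}
```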

Example: GitHub integration. Store a GitHub personal access token as GITHUB_TOKEN in the vault and bind it to a channel (or * for all channels). The agent can then create issues, open PRs, push code, review changes, and manage repositories using the GitHub API or CLI.

Example: authenticated browsing. For automations that need to browse logged-in sites (X, LinkedIn, etc.), export cookies from your local browser using Cookie-Editor, store the JSON as a vault key (e.g. X_COOKIES), and bind it to the channel where the automation runs. In the automation's prompt, instruct it to read the cookies from the env var and inject them via addCookies before browsing. Pre-injecting cookies ensures the agent starts in a logged-in state, making scheduled tasks more reliable.
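A cookie export usually needs light reshaping before Playwright's addCookies accepts it. The sketch below assumes the field names commonly seen in a Cookie-Editor JSON export (expirationDate, and sameSite values such as no_restriction); those names are assumptions to verify against your own export, and toPlaywrightCookies is a hypothetical helper:

```typescript
// Fields as commonly seen in a Cookie-Editor JSON export (assumed, verify locally).
interface ExportedCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  secure: boolean;
  httpOnly: boolean;
  sameSite?: string;       // e.g. "lax", "strict", "no_restriction", "unspecified"
  expirationDate?: number; // Unix seconds; absent for session cookies
}

// Subset of the cookie shape accepted by Playwright's browserContext.addCookies().
interface PlaywrightCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  secure: boolean;
  httpOnly: boolean;
  sameSite?: "Strict" | "Lax" | "None";
  expires?: number;
}

function toPlaywrightCookies(json: string): PlaywrightCookie[] {
  const exported = JSON.parse(json) as ExportedCookie[];
  return exported.map((c) => {
    const cookie: PlaywrightCookie = {
      name: c.name, value: c.value, domain: c.domain, path: c.path,
      secure: c.secure, httpOnly: c.httpOnly,
    };
    const sameSite = (c.sameSite || "").toLowerCase();
    if (sameSite === "strict") cookie.sameSite = "Strict";
    else if (sameSite === "lax") cookie.sameSite = "Lax";
    else if (sameSite === "no_restriction" || sameSite === "none") cookie.sameSite = "None";
    // Unknown or unspecified values are left out so the browser default applies.
    if (typeof c.expirationDate === "number") cookie.expires = c.expirationDate;
    return cookie;
  });
}
```

In an automation, the input string would come from the bound env var (X_COOKIES in the example above) and the result would be passed to addCookies before the first navigation.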

CLI

Available on npm: npx @owloops/browserbird

$ browserbird --help

   .__.
   ( ^>
   / )\
  <_/_/
   " "
usage: browserbird [command] [options]

commands:

  sessions    manage sessions and chat
  automations manage scheduled automations
  skills      manage skills (agent instructions)
  keys        manage vault keys
  config      view configuration
  logs        show recent log entries
  jobs        inspect and manage the job queue
  backups     manage database backups
  doctor      check system dependencies
  login       authenticate to the daemon
  logout      clear saved daemon credentials
  whoami      show the authenticated user
  reset-password  reset a password locally (run on the host)

options:

  -h, --help     show this help
  -v, --version  show version
  --verbose      enable debug logging
  --config       config file path (env: BROWSERBIRD_CONFIG)
  --db           database file path (env: BROWSERBIRD_DB)

run 'browserbird <command> --help' for command-specific options.

The CLI talks to a running daemon. Log in once with browserbird login using your dashboard email and password, then manage automations, skills, keys, and sessions from the terminal. Every command explains itself with --help. If you lose the dashboard password, run browserbird reset-password on the host where BrowserBird runs and it prints a new one.

Development

Setup, local run, Docker build, and checks

git clone https://github.com/Owloops/browserbird.git
cd browserbird
npm ci
cd web && npm ci && cd ..

Run locally

npm run dev:all

Starts the backend with restart-on-save and the Vite dev server with hot reload at http://localhost:3000. For a production-style run, build the backend and the web UI once, then start the daemon, which serves everything at http://localhost:18800:

npm run build
cd web && npm run build && cd ..
./bin/browserbird

Docker (build locally)

cp .env.example .env
docker compose -f oci/compose.yml up -d --build

Checks

npm run typecheck          # tsc --noEmit
npm run lint               # eslint
npm run format:check       # prettier
npm test                   # node --test

Web UI (from web/):

npm run check              # svelte-check
npm run format:check       # prettier

Publish the AWS CloudFormation template

The Launch Stack button in the README points to s3://browserbird-releases/cloudformation/latest.yaml. After changing aws/cloudformation.yaml, validate and upload:

cfn-lint aws/cloudformation.yaml
aws cloudformation validate-template \
  --template-body file://aws/cloudformation.yaml

aws s3 cp aws/cloudformation.yaml \
  s3://browserbird-releases/cloudformation/latest.yaml \
  --content-type "application/x-yaml" \
  --region us-east-1

The bucket (browserbird-releases, account 267013046707, us-east-1) has a public read policy scoped to the cloudformation/ prefix so CloudFormation can fetch the template during stack creation.

Configuration

The onboarding wizard handles initial setup. For manual configuration, fetch the example config and edit it:

curl -o browserbird.json https://raw.githubusercontent.com/Owloops/browserbird/main/browserbird.example.json

Any string value can reference an environment variable with "env:VAR_NAME" syntax (e.g. "env:SLACK_BOT_TOKEN"). The top-level timezone field (IANA format, default "UTC") applies to cron schedules and automation active hours. Quiet hours use their own slack.quietHours.timezone field.
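To illustrate the "env:VAR_NAME" syntax, here is a minimal sketch of how such references might be resolved across a parsed config. resolveEnvRefs is a hypothetical function, and treating an unset variable as an error is an assumption; the README does not say how BrowserBird handles that case:

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical resolver: replaces "env:VAR_NAME" strings anywhere in a config value.
// Assumption: an unset variable is an error rather than an empty string.
function resolveEnvRefs(value: unknown, env: Env): unknown {
  if (typeof value === "string") {
    if (value.slice(0, 4) !== "env:") return value;
    const name = value.slice(4);
    const resolved = env[name];
    if (resolved === undefined) throw new Error(`Missing environment variable ${name}`);
    return resolved;
  }
  if (Array.isArray(value)) return value.map((v) => resolveEnvRefs(v, env));
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) out[key] = resolveEnvRefs(source[key], env);
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}
```

The environment is passed in explicitly so the sketch stays testable; in a Node.js process the second argument would typically be process.env.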

slack - Slack connection and behavior
"slack": {
  "botToken": "env:SLACK_BOT_TOKEN",
  "appToken": "env:SLACK_APP_TOKEN",
  "requireMention": true,
  "coalesce": { "debounceMs": 3000, "bypassDms": true },
  "channels": ["*"],
  "quietHours": { "enabled": false, "start": "23:00", "end": "08:00", "timezone": "UTC" }
}
  • botToken, appToken: Optional. Bot user OAuth token and app-level token for Socket Mode. Required only for Slack integration
  • requireMention: Only respond in channels when @mentioned. The bot always responds in DMs
  • coalesce.debounceMs: Wait N ms after last message before dispatching (groups rapid messages)
  • coalesce.bypassDms: Skip debouncing for DMs
  • channels: Channel names or IDs to listen in, or "*" for all
  • quietHours: Silence the bot in channels during specified hours (the bot still responds in DMs). Start/end in HH:MM format, can wrap midnight
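The effect of coalesce.debounceMs can be modeled without timers by grouping messages according to the gaps between their arrival times. This is a simplified sketch for a single channel, assuming messages are sorted by time; coalesce and the Incoming shape are hypothetical and ignore the bypassDms shortcut:

```typescript
interface Incoming {
  text: string;
  atMs: number; // arrival time in milliseconds
}

// Hypothetical model of debouncing: a batch is dispatched once no new message
// has arrived for debounceMs, so shorter gaps keep extending the same batch.
function coalesce(messages: Incoming[], debounceMs: number): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  let lastAt = 0;
  for (const m of messages) {
    if (current.length > 0 && m.atMs - lastAt >= debounceMs) {
      batches.push(current); // the quiet period elapsed before this message
      current = [];
    }
    current.push(m.text);
    lastAt = m.atMs;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

With debounceMs set to 3000, three messages sent within a few seconds of each other reach the agent as one request, while a message sent well after the quiet period starts a new one.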
agents - Agent routing and model config
"agents": [
  {
    "id": "default",
    "name": "BrowserBird",
    "model": "sonnet",
    "fallbackModel": "haiku",
    "maxBudgetUsd": 5,
    "maxTurns": 50,
    "systemPrompt": "You are responding in a Slack workspace. Be concise, helpful, and natural.",
    "channels": ["*"]
  }
]

Each agent is scoped to specific channels. Multiple agents are matched in order, first match wins.

  • id, name: Required. Unique identifier and display name
  • model: Required. Short names (sonnet, haiku) or full model IDs
  • fallbackModel: Fallback when primary model is unavailable
  • maxBudgetUsd: Cap API spend per invocation in USD (agent exits when reached)
  • maxTurns: Max conversation turns per session
  • systemPrompt: Instructions prepended to every session
  • channels: Required. Channel names or IDs this agent handles, or "*" for all
  • processTimeoutMs: Per-agent subprocess timeout override (inherits from sessions if not set)
  • permissionMode: Agent CLI permission mode. One of auto (default), default, acceptEdits, bypassPermissions
  • suggestedPrompts: Suggested prompts shown in new Slack DM threads. Array of { "title", "message" } objects
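The "first match wins" rule can be sketched as a simple ordered scan. AgentConfig here is a trimmed, hypothetical version of the config entry, and pickAgent is an illustrative name:

```typescript
interface AgentConfig {
  id: string;
  channels: string[]; // channel names or IDs, or "*" for all
}

// Returns the first agent whose channel list matches, or undefined if none does.
function pickAgent(agents: AgentConfig[], channel: string): AgentConfig | undefined {
  for (const agent of agents) {
    if (agent.channels.some((c) => c === "*" || c === channel)) return agent;
  }
  return undefined;
}

const agents: AgentConfig[] = [
  { id: "reviewer", channels: ["code-review"] },
  { id: "default", channels: ["*"] },
];
```

Because matching stops at the first hit, a catch-all "*" agent listed first would shadow every entry after it, so it belongs at the end of the array.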
sessions - Session lifecycle
"sessions": {
  "ttlHours": 72,
  "maxConcurrent": 5,
  "processTimeoutMs": 300000
}
  • ttlHours: Hours of inactivity before a session expires. The timer resets on each message. When a session expires, the agent starts fresh with no memory of the previous conversation. Messages are still stored in BrowserBird's database, but the agent itself begins a new context. Default is 72 (3 days)
  • maxConcurrent: Max simultaneous agent processes
  • processTimeoutMs: Per-request timeout in milliseconds
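The ttlHours rule reduces to a comparison against the time of the last message. A minimal sketch, assuming expiry is checked against millisecond timestamps; isSessionExpired is a hypothetical name:

```typescript
// Hypothetical check: a session expires after ttlHours without a new message.
function isSessionExpired(lastMessageAtMs: number, nowMs: number, ttlHours: number): boolean {
  const ttlMs = ttlHours * 60 * 60 * 1000; // 72 hours is 259,200,000 ms
  return nowMs - lastMessageAtMs >= ttlMs;
}
```

Each incoming message moves lastMessageAtMs forward, which is what resets the timer.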
browser - Playwright MCP and VNC
"browser": {
  "enabled": false,
  "mcpConfigPath": null,
  "vncPort": 5900,
  "novncPort": 6080,
  "novncHost": "localhost"
}
  • enabled: Enable Playwright MCP for the agent
  • mcpConfigPath: Path to your MCP config (relative or absolute)
  • vncPort: VNC server port
  • novncPort: Upstream noVNC WebSocket port
  • novncHost: Upstream noVNC host (e.g. "vm" in Docker)

Browser mode (persistent or isolated) is controlled by the BROWSER_MODE environment variable, not the config file.

automations - Scheduled task settings
"automations": {
  "maxAttempts": 3,
  "maxConsecutiveFailures": 2
}
  • maxAttempts: Max job attempts before an automation stops retrying
  • maxConsecutiveFailures: Disable an automation after this many consecutive failed runs (0 to never auto-disable)
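The auto-disable rule can be sketched as a counter that resets on success. This assumes a "failed run" means a run that still failed after its retry attempts were used up; AutomationState and recordRun are hypothetical names:

```typescript
interface AutomationState {
  enabled: boolean;
  consecutiveFailures: number;
}

// Hypothetical bookkeeping after each finished run; a limit of 0 never disables.
function recordRun(
  state: AutomationState,
  succeeded: boolean,
  maxConsecutiveFailures: number
): AutomationState {
  const consecutiveFailures = succeeded ? 0 : state.consecutiveFailures + 1;
  const limitHit = maxConsecutiveFailures > 0 && consecutiveFailures >= maxConsecutiveFailures;
  return { enabled: state.enabled && !limitHit, consecutiveFailures };
}
```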

Each automation supports per-automation active hours set via CLI --active-hours 09:00-17:00 or the API. Wrap-around windows (e.g. 22:00-06:00) are supported.
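A wrap-around window is easiest to reason about in minutes since midnight. The sketch below shows one way to test whether a time falls inside a START-END range; inWindow is a hypothetical helper, and treating the end time as exclusive is an assumption:

```typescript
// Parses "HH:MM" into minutes since midnight.
function toMinutes(hhmm: string): number {
  const colon = hhmm.indexOf(":");
  return Number(hhmm.slice(0, colon)) * 60 + Number(hhmm.slice(colon + 1));
}

// True when `time` falls inside "START-END"; END is exclusive (an assumption).
function inWindow(range: string, time: string): boolean {
  const dash = range.indexOf("-");
  const start = toMinutes(range.slice(0, dash));
  const end = toMinutes(range.slice(dash + 1));
  const t = toMinutes(time);
  if (start <= end) return t >= start && t < end; // same-day window
  return t >= start || t < end;                   // window wraps past midnight
}
```

For 22:00-06:00 the start is later than the end, so a time matches when it is after the start or before the end. The same check applies to Slack quiet hours that wrap midnight.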

database - Retention policy
"database": {
  "retentionDays": 30,
  "backups": { "maxCount": 10, "auto": true }
}
  • retentionDays: How long to keep sessions, messages, run logs, jobs, and log entries
  • backups.auto: Enable the daily automatic database backup
  • backups.maxCount: Number of backups to keep before the oldest are pruned
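Pruning by backups.maxCount amounts to keeping the newest entries and discarding the rest. A minimal sketch with hypothetical names (Backup, backupsToPrune), assuming backups are ordered by creation time:

```typescript
interface Backup {
  file: string;
  createdAtMs: number;
}

// Hypothetical pruning: keep the newest maxCount backups, return the rest for deletion.
function backupsToPrune(backups: Backup[], maxCount: number): Backup[] {
  const newestFirst = backups.slice().sort((a, b) => b.createdAtMs - a.createdAtMs);
  return newestFirst.slice(Math.max(0, maxCount));
}
```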
web - Dashboard and API server
"web": {
  "enabled": true,
  "host": "127.0.0.1",
  "port": 18800,
  "corsOrigin": ""
}
  • enabled: Enable the web dashboard and API
  • host: Bind address (0.0.0.0 for Docker/remote)
  • port: Web UI and REST API port
  • corsOrigin: Allowed origin for CORS headers (for cross-origin SPA hosting)

Authentication is handled via the web UI. On first visit, you create an account. All subsequent visits require login.

Environment Variables

All environment variables
| Variable | Description |
| --- | --- |
| SLACK_BOT_TOKEN | Bot user OAuth token (optional, for Slack integration) |
| SLACK_APP_TOKEN | App-level token for Socket Mode (optional, for Slack integration) |
| ANTHROPIC_API_KEY | Anthropic API key (pay-per-token) |
| CLAUDE_CODE_OAUTH_TOKEN | OAuth token (uses your Claude Pro/Max subscription). Takes priority when both are set |
| BROWSER_MODE | persistent (default) or isolated. Requires container restart |
| BROWSERBIRD_NOVNC_HOST | Browser VM hostname for container deployments (default vm in the shipped compose files). Sets browser.novncHost for the generated Playwright MCP config, the VNC proxy, and health checks |
| BROWSERBIRD_CONFIG | Path to browserbird.json. Overridden by --config flag |
| BROWSERBIRD_DB | Path to SQLite database file. Overridden by --db flag |
| BROWSERBIRD_VAULT_KEY | Vault encryption key (auto-generated on first start, stored in .env) |
| BROWSERBIRD_VERBOSE | Set to 1 to enable debug logging. Same as --verbose flag |
| BROWSERBIRD_AUTOMATION_DATA | Persistent data directory for the current automation. Set automatically per automation run |
| BROWSERBIRD_TOKEN | CLI auth token. Takes priority over saved credentials files |
| BROWSERBIRD_CREDENTIALS | Path to a credentials JSON file. Used when BROWSERBIRD_TOKEN is unset |
| BROWSERBIRD_API_URL | Daemon URL the CLI talks to (default http://127.0.0.1:18800). Overridden by --url flag |
| NO_COLOR | Disable colored output |
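Several of these variables describe a precedence order. The sketch below spells out two of them under stated assumptions: the daemon URL follows flag, then environment, then default, and CLI auth prefers BROWSERBIRD_TOKEN over a credentials file. The final fallback to credentials saved by browserbird login is an assumption, and all function and type names are hypothetical:

```typescript
type Env = Record<string, string | undefined>;

// Daemon URL: --url flag first, then BROWSERBIRD_API_URL, then the documented default.
function resolveApiUrl(flagUrl: string | undefined, env: Env): string {
  return flagUrl || env["BROWSERBIRD_API_URL"] || "http://127.0.0.1:18800";
}

// CLI auth: BROWSERBIRD_TOKEN first, then the file named by BROWSERBIRD_CREDENTIALS.
// Falling back to credentials saved by `browserbird login` is an assumption.
type AuthSource =
  | { kind: "token"; token: string }
  | { kind: "credentialsFile"; path: string }
  | { kind: "savedLogin" };

function resolveAuth(env: Env): AuthSource {
  const token = env["BROWSERBIRD_TOKEN"];
  if (token) return { kind: "token", token };
  const path = env["BROWSERBIRD_CREDENTIALS"];
  if (path) return { kind: "credentialsFile", path };
  return { kind: "savedLogin" };
}
```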

License

This project is licensed under the MIT License.

Note

This project was built with assistance from LLMs. Human review and guidance provided throughout.
