Open-source infrastructure for AI-assisted development
We build the missing pieces — the tools you reach for when your AI coding agent hits a wall it shouldn't have hit.
Browse Projects · Who This Is For · Get Started · Contribute · Roadmap
AI coding tools are fast, but they have blind spots. They write frontend code they can't see. They claim things work without checking. They lose context when you move a folder. They review their own work and call it done.
Every project here came from a real workflow problem — something that broke, something that was missing, something that should have existed already. If you work with AI coding agents daily, you'll recognize a few of them.
- Developers using AI coding agents (Claude Code, Cursor, Windsurf, Copilot) who want their tools to actually see and verify what they build
- Teams building AI agent pipelines who need quality gates between "code generated" and "code shipped"
- Anyone tired of re-explaining context to their AI tools after moving a folder, switching machines, or losing a session
If you've ever watched an AI agent confidently tell you something works when it clearly doesn't — these tools are for you.
Give your AI agent a real browser.
Most AI coding tools are blind — they can write frontend code but never see what it actually looks like. Playwright Pool is an MCP server that gives any AI coding tool full browser control with persistent authentication.
- 75+ tools — navigate, click, fill forms, screenshot, run JavaScript
- 28 built-in audits — accessibility, color contrast, overflow detection, dark mode, Core Web Vitals, broken links, tap targets
- 143 device presets — test on any phone, tablet, or desktop viewport without leaving your editor
- Golden profile auth — log in once, stay logged in across sessions
- 4 modes — MCP server or standalone CLI, headed or headless
Works with Claude Code, Cursor, Windsurf, and anything that speaks MCP.
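Registering it as an MCP server might look like the following. This is an illustrative sketch based on the standard `mcpServers` config shape — the exact server name and arguments are assumptions; check the repo's README for the canonical entry:

```json
{
  "mcpServers": {
    "playwright-pool": {
      "command": "playwright-pool",
      "args": ["serve"]
    }
  }
}
```

Drop this into your `claude_desktop_config.json` or project-level `.mcp.json` and restart the client so it picks up the new server.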
Why not just use Playwright directly? You could — but you'd need to wire up MCP bindings, build audit logic, manage browser lifecycle, handle auth persistence, and maintain device presets yourself. Playwright Pool packages all of that into one install.
Quickstart
```bash
npm install -g playwright-pool

# Run as MCP server (add to your claude_desktop_config.json or .mcp.json)
playwright-pool serve

# Or use the CLI directly
playwright-pool open https://example.com
playwright-pool audit accessibility https://example.com
playwright-pool screenshot https://example.com --device "iPhone 15"
```

A quality layer between "code that runs" and "code that ships."
AI agents are fast but sloppy. They'll write code that passes tests but looks broken on mobile. They'll fix one bug by introducing two more. They'll tell you something works without actually verifying it.
The Human Engine is a 21-category cognitive review system modeled on how experienced developers actually evaluate work — from visual correctness to security to "did you even do what was asked?" It's structured as phases that agents can run before claiming they're done, so you don't have to be the one catching everything.
Why not just use a linter? Linters catch syntax and style. The Human Engine catches the stuff linters miss — visual regressions, logic that doesn't match the spec, missing edge cases, security holes that pass CI, and the classic "it works on my machine" problems. It's the review a senior dev would do, not a style check.
The 21 Categories
The framework evaluates work across 10 phases, covering:
- Request comprehension
- Scope accuracy
- Visual correctness
- Responsive behavior
- Accessibility compliance
- Security review
- Performance impact
- Error handling
- Edge case coverage
- Documentation accuracy
- Test coverage
- Code quality
- API contract compliance
- State management
- Data integrity
- Cross-browser compatibility
- Deployment readiness
- Monitoring and observability
- Backwards compatibility
- User experience flow
- Adversarial robustness
Full documentation in the repo.
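The gate pattern behind those phases can be approximated in a few lines: record a pass/fail result per category, and refuse to call the work done while any category has an open failure. This is a hypothetical sketch of the idea, not the framework's actual API — the class names and result shape here are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewResult:
    category: str
    passed: bool
    notes: str = ""

@dataclass
class ReviewGate:
    """Collects per-category results; work 'ships' only if every category passes."""
    results: list = field(default_factory=list)

    def record(self, category: str, passed: bool, notes: str = "") -> None:
        self.results.append(ReviewResult(category, passed, notes))

    def failures(self) -> list:
        return [r for r in self.results if not r.passed]

    def done(self) -> bool:
        # No results at all also blocks completion: unreviewed work is not done.
        return bool(self.results) and not self.failures()

gate = ReviewGate()
gate.record("Request comprehension", True)
gate.record("Visual correctness", False, "button overflows at 375px viewport")
assert not gate.done()  # an agent may not claim completion with open failures
```

The point is the inversion: instead of the agent asserting "done" and you auditing it, the agent has to earn "done" by clearing every category.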
What's the minimum an AI agent actually needs to be useful?
Not a theoretical exercise — a working Python implementation with 508 tests, backed by the research that informed every design decision.
- Memory with teeth — retrieval that fires when it matters, not just storage
- Stateful daemon — persistence across sessions without duct tape
- Browser, vision, audio — multimodal capabilities without bolting on five libraries
- Multi-agent orchestration — agents that coordinate, not just run in parallel
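To make "memory with teeth" concrete: the idea is retrieval gated on relevance, so stored items surface only when they clear a threshold and stay silent otherwise. A minimal sketch of that pattern — the token-overlap scoring here is a stand-in for illustration; the repo's actual implementation will differ:

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str

class RelevanceGatedStore:
    """Stores memories; retrieval only fires when relevance beats a threshold."""

    def __init__(self, threshold: float = 0.3):
        self.items: list[Memory] = []
        self.threshold = threshold

    def add(self, text: str) -> None:
        self.items.append(Memory(text))

    @staticmethod
    def _score(query: str, text: str) -> float:
        # Fraction of query tokens present in the memory (toy relevance metric).
        q, t = set(query.lower().split()), set(text.lower().split())
        return len(q & t) / len(q) if q else 0.0

    def recall(self, query: str) -> list[str]:
        # Only return items whose relevance clears the gate -- silence otherwise.
        return [m.text for m in self.items
                if self._score(query, m.text) >= self.threshold]

store = RelevanceGatedStore()
store.add("user prefers tabs over spaces")
store.add("deploy target is us-east-1")
print(store.recall("user prefers tabs or spaces"))
```

Storage is easy; the hard part this gestures at is the gate — memory that injects itself at the right moment instead of dumping everything into context.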
Why not use LangChain / CrewAI / AutoGen? Those frameworks optimize for flexibility and plugin ecosystems. This one optimizes for the smallest set of components that actually work in production. Ten components, no plugin system, no YAML configs, no "framework within a framework." The research and the code live in the same repo so you can see exactly why each decision was made.
Quickstart
```bash
git clone https://github.com/zbrooklyn-claude-labs/Less-Is-More-for-AI-Autonomous-Agents.git
cd Less-Is-More-for-AI-Autonomous-Agents
pip install -r requirements.txt
python -m pytest  # 508 tests
```

Moved your project folder? This fixes everything that breaks.
Claude Code, Codex CLI, and Gemini CLI all store absolute paths — to your project, your MCP configs, your conversation history. Move your folder or sync to a new machine and all of it silently breaks. No error messages, just tools that stop working and no obvious reason why.
This toolkit detects broken paths, fixes symlinks, updates configs, and gets you back to working in one command. Windows, macOS, Linux, and Termux.
Why not just fix paths manually? You could, if you knew where they all were. Claude Code alone scatters state across ~/.claude/, project-level .claude/ directories, MCP configs, and conversation metadata. Multiply that by three tools and two machines and you're spelunking through hidden directories for an hour. This does it in seconds.
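The detection half of that is easy to picture: read each config file, pull out anything that looks like an absolute path, and flag entries that no longer exist on disk. A simplified sketch of the idea — the real toolkit covers more file formats, locations, and the symlink repair step:

```python
import json
import re
from pathlib import Path

# Matches Unix absolute paths and Windows drive paths inside a config file.
ABS_PATH = re.compile(r"(?:/|[A-Za-z]:\\)[^\s\"']+")

def broken_paths(config_file: Path) -> list[str]:
    """Return absolute paths referenced in a config that no longer exist."""
    text = config_file.read_text()
    return [p for p in ABS_PATH.findall(text) if not Path(p).exists()]

# Example: scan a hypothetical MCP config for stale entries
cfg = Path("demo.json")
cfg.write_text(json.dumps({"cwd": "/home/old-machine/projects/app"}))
print(broken_paths(cfg))
cfg.unlink()
```

The fix step is the same walk in reverse: rewrite each stale path to its new location and write the config back out.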
Quickstart
```bash
git clone https://github.com/zbrooklyn-claude-labs/ai-workspace-move-sync.git
cd ai-workspace-move-sync

# Scan for broken paths
python sync.py scan

# Fix everything
python sync.py fix
```

Pick the project that matches your problem:
| Problem | Solution | Install |
|---|---|---|
| My AI agent can't see the UI it builds | Playwright Pool | npm i -g playwright-pool |
| AI-generated code ships with obvious bugs | Human Engine | Clone repo, add to agent workflow |
| I want to build autonomous agents from scratch | Less Is More | pip install -r requirements.txt |
| I moved my project and everything broke | Workspace Sync | python sync.py fix |
This org is actively growing. Here's what's in the pipeline:
- More MCP tools — extending the Playwright Pool model to other domains where AI agents need real-world access
- Agent evaluation benchmarks — standardized ways to measure whether an AI coding agent actually did what you asked
- Workflow templates — pre-built configurations for common Claude Code / Cursor / Windsurf setups
- Documentation and guides — deeper writeups on the problems these tools solve and the design decisions behind them
Want to influence what gets built next? Open a discussion or file an issue on the relevant repo.
Contributions are welcome across all repos. Here's how to get started:
- Pick a repo — each one has its own issues and contribution needs
- Check open issues — look for `good first issue` or `help wanted` labels
- Fork and branch — standard GitHub flow. One feature per PR.
- Test your changes — each project has its own test suite. Make sure it passes before opening a PR.
- Open a PR — describe what you changed and why. Link to the issue if there is one.
No CLA, no contributor agreement, no hoops. Just good code and clear communication.
All projects are licensed under the MIT License — use them however you want.
I'm Edward, a developer in Brooklyn, NY. I spend most of my time inside AI coding tools, building for the way this workflow actually works in practice — not how it looks in a demo.
If something here is useful, star it. If something is broken, open an issue. If you want to talk about any of it, start a discussion.