The most comprehensive, structured guide to AI agent frameworks, tools, and resources. Updated weekly. Compared side-by-side. Built for developers who ship.
- Orchestration Frameworks
- Coding Agents
- Memory and Context
- Multi-Agent Systems
- Agent Communication Protocols
- Browser and Computer Use Agents
- Agent Tooling and Infrastructure
- Low and No-Code Builders
- Voice and Multimodal Agents
- Safety Guardrails and Observability
- Agent Deployment and Hosting
- Agent Evaluation and Benchmarks
- Learning Resources
- Modern AI System
- Changelog
- Star History
The core frameworks for building, orchestrating, and running AI agents.
How to choose: Need enterprise compliance? Semantic Kernel, LangGraph. TypeScript shop? Mastra, VoltAgent, Vercel AI SDK. Just getting started? CrewAI, PydanticAI, OpenAI Agents SDK.
| Framework | Language | Multi-Agent | Memory | MCP | Stars |
|---|---|---|---|---|---|
| LangGraph | Python | Yes | Yes | Yes | ~12k |
| CrewAI | Python | Yes | No | Yes | ~41k |
| AutoGen | Python | Yes | Yes | Yes | ~52k |
| PydanticAI | Python | No | No | Yes | ~8k |
| Mastra | TypeScript | Yes | Yes | Yes | ~8k |
| Semantic Kernel | Python/C#/Java | Yes | Yes | Yes | ~22k |
- Agno - Multi-agent framework with a runtime and control plane for managing agent deployments at scale.
- AutoGen - Event-driven multi-agent framework merged with Semantic Kernel for production workflows.
- CrewAI - Role-playing agent orchestration for collaborative agent teams.
- Google ADK - Modular agent dev kit integrating Gemini and Vertex AI natively.
- Haystack - Production-ready AI orchestration framework focused on building customizable LLM applications and RAG pipelines.
- LangGraph - Enterprise framework for stateful, graph-based agent workflows.
- Letta - Formerly MemGPT. Stateful agents with built-in long-term memory and a REST API server.
- LlamaIndex - The leading framework for connecting LLMs to your data, with powerful indexing and retrieval capabilities.
- Mastra - Opinionated TypeScript framework with RAG, observability, and MCP support built in.
- Modus - Serverless framework for high-throughput agent workloads with minimal cold starts.
- OpenAI Agents SDK - Lightweight multi-agent SDK with tracing and guardrails from OpenAI.
- Open-AutoGLM - Open-source phone agent model and framework for building mobile device automation agents.
- PraisonAI - Production multi-agent framework with self-reflection, MCP integration, and workflow automation.
- PydanticAI - Type-safe agent framework from the Pydantic team with a FastAPI-style developer experience.
- Semantic Kernel - Microsoft enterprise SDK for Python, C#, and Java with modular plugins, memory, and goal planning.
- Smolagents - Hugging Face code-first framework where agents write and execute Python instead of JSON tool calls.
- Strands Agents SDK - AWS model-driven agent SDK with native Bedrock integration.
- Vercel AI SDK - Streaming-first primitives for AI UIs with React Server Components and edge runtime support.
- VoltAgent - TypeScript agent framework with built-in observability and a self-improving context engine.
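
The selection guidance above amounts to filtering the comparison table on a few attributes. Here is a minimal sketch of that idea, using only the rows and values from the table (star counts omitted). The `Framework` dataclass and the `shortlist` helper are hypothetical names introduced for illustration and do not belong to any listed project.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Framework:
    name: str
    languages: Tuple[str, ...]
    multi_agent: bool
    memory: bool
    mcp: bool


# Rows copied from the comparison table above.
FRAMEWORKS = [
    Framework("LangGraph", ("Python",), True, True, True),
    Framework("CrewAI", ("Python",), True, False, True),
    Framework("AutoGen", ("Python",), True, True, True),
    Framework("PydanticAI", ("Python",), False, False, True),
    Framework("Mastra", ("TypeScript",), True, True, True),
    Framework("Semantic Kernel", ("Python", "C#", "Java"), True, True, True),
]


def shortlist(language: Optional[str] = None,
              multi_agent: Optional[bool] = None,
              memory: Optional[bool] = None) -> list:
    """Return names of frameworks matching every constraint that is not None."""
    result = []
    for fw in FRAMEWORKS:
        if language is not None and language not in fw.languages:
            continue
        if multi_agent is not None and fw.multi_agent != multi_agent:
            continue
        if memory is not None and fw.memory != memory:
            continue
        result.append(fw.name)
    return result


print(shortlist(language="TypeScript"))
print(shortlist(language="Python", memory=True))
```

The table is a snapshot, so treat a shortlist like this as a starting point and check each project's current documentation before committing.
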
AI-powered tools that write, edit, debug, and ship code, from terminal pair programmers to fully autonomous software engineers.
How to choose: Want terminal-first? Aider, Claude Code, gemini-cli. IDE-integrated? Cline, Continue, Cursor. Full autonomy? OpenHands, SWE-agent, Devin.
| Agent | Type | Open Source | Interface | Best For |
|---|---|---|---|---|
| Aider | Terminal | Yes | CLI | Git-aware pair programming |
| Claude Code | Terminal | No | CLI | Multi-file edits + tests |
| gemini-cli | Terminal | Yes | CLI | Google ecosystem |
| Codex CLI | Terminal | Yes | CLI | Fast autonomous tasks |
| Cline | IDE | Yes | VS Code | Permission-gated editing |
| Continue | IDE | Yes | VS Code | CI-enforceable checks |
| Cursor | IDE | No | Desktop | Deep codebase refactoring |
| Windsurf | IDE | No | Desktop | Team collaboration |
| Devin | Autonomous | No | Cloud | End-to-end engineering |
| OpenHands | Autonomous | Yes | Web | Full dev lifecycle |
- Aider - Terminal-first pair programmer that edits code in local repos, preserves Git history, and supports multi-file changes.
- AutoGPT - Mature autonomous agent platform with Forge framework and public benchmarks for evaluating agent capabilities.
- Claude Code - Terminal-first agentic coding tool with multi-file edits, test running, and Git operations baked in.
- Cline - Autonomous coding agent in your IDE that creates/edits files, runs commands, and uses the browser with permission-gated steps.
- Codex CLI - OpenAI's lightweight, open-source terminal coding agent with fast execution and strong benchmark scores.
- Codex-CLI - CLI tool that turns natural language commands into Bash, ZShell, and PowerShell equivalents.
- Continue - Source-controlled AI checks enforceable in CI, powered by the open-source Continue CLI.
- Cursor - AI-native IDE (VS Code fork) with deep codebase awareness, multi-file refactoring, and agentic workflows.
- Devin - Fully autonomous AI software engineer that plans, codes, tests, and deploys in a cloud sandbox.
- gemini-cli - Open-source AI agent that brings the power of Gemini directly into your terminal.
- Goose - Open-source extensible AI agent that goes beyond code suggestions: it installs, executes, edits, and tests with any LLM.
- Open Interpreter - Execute code locally via natural-language model instructions with a ChatGPT-like interface.
- opencode - Open-source coding agent available as a desktop application with a visual interface.
- OpenHands - AI-driven development platform that writes, tests, and deploys code autonomously.
- SWE-agent - Takes a GitHub issue and tries to automatically fix it. Also used for cybersecurity and competitive coding.
- Windsurf - AI-native IDE with Cascade agent for multi-step autonomous tasks and team workflows.
Persistent memory, knowledge graphs, and context management for agents that need to remember, learn, and adapt.
How to choose: Need plug-and-play memory? Mem0. Knowledge graphs? graphiti, cognee. Video/document retrieval? Memvid. Full-stack solution? Cortex Memory.
| Solution | Approach | Graph Support | Multi-Modal | Stars |
|---|---|---|---|---|
| Mem0 | Hybrid | No | No | ~30k |
| graphiti | Knowledge Graph | Yes | No | ~4k |
| cognee | Graph + Vector | Yes | No | ~3k |
| Supermemory | Vector | No | No | ~7k |
| Memvid | Video-based | No | Yes | ~4k |
- Acontext - Manages agent skills and long-term memory as a layered data structure for persistent context.
- cognee - Knowledge engine for AI agent memory, set up in 6 lines of code with graph-based knowledge extraction.
- Cortex Memory - Full-stack solution for agent memory covering extraction, vector search, and optimization.
- graphiti - Build real-time knowledge graphs for AI agents with automatic entity extraction and linking.
- Langmem - Helps agents learn and adapt from their interactions over time with persistent memory.
- Mem0 - Memory layer for AI applications with long-term, short-term, and semantic memory extraction.
- Memvid - Replace complex RAG pipelines with a serverless, single-file memory layer for instant retrieval.
- SimpleMem - Efficient lifelong memory for LLM agents supporting both text and multimodal inputs.
- Supermemory - Extremely fast and scalable memory engine and API designed for the AI era.
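
Most of the vector-based memory layers above share one core loop: embed what the agent learns, store it, and retrieve the closest entries for a new query. The sketch below shows that loop with a toy bag-of-words embedding and cosine similarity. It is an illustration only: `embed`, `cosine`, and `MemoryStore` are hypothetical names, and real memory layers use learned embeddings and a proper vector index.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a learned vector)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class MemoryStore:
    """In-memory store that ranks saved texts by similarity to a query."""

    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = MemoryStore()
store.add("user prefers dark mode")
store.add("user lives in Berlin")
print(store.search("which mode does the user prefer"))
```
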
Frameworks specifically designed for orchestrating multiple agents working together on shared objectives.
How to choose: Need a quick prototype? Swarm. Full software team simulation? MetaGPT. Production-scale orchestration? Swarms Framework. Research playground? AgentVerse.
- AgentVerse - Framework for building custom multi-agent environments to accomplish collaborative tasks.
- EvoAgentX - Evaluates and evolves agentic workflows over time using automatic optimization.
- Hivemoot - Autonomous agent teams that collaboratively build software on GitHub.
- MetaGPT - Simulates a full software company workflow from requirements to PRs using role-playing agents.
- Swarm - Lightweight framework for agent handoffs, context variables, and function calling patterns from OpenAI.
- Swarms Framework - Multi-agent orchestration for production use cases with scalability and reliability at its core.
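
The handoff pattern mentioned for Swarm can be illustrated without any framework: an agent either answers or returns another agent to take over, while shared context variables travel with the conversation. The sketch below is plain Python and is not Swarm's API. `Agent`, `run`, and the keyword routing rule are hypothetical, with the rule standing in for a model's decision.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Agent:
    name: str
    # A handler returns either a final reply (str) or another Agent to hand off to.
    handle: Callable[[str, dict], Union[str, "Agent"]]


def billing_handle(message: str, context: dict) -> str:
    return f"[billing] Refund started for {context['user']}."


billing = Agent("billing", billing_handle)


def triage_handle(message: str, context: dict) -> Union[str, Agent]:
    # Toy routing rule standing in for an LLM decision.
    if "refund" in message.lower():
        return billing
    return "[triage] How can I help?"


triage = Agent("triage", triage_handle)


def run(agent: Agent, message: str, context: dict, max_handoffs: int = 5):
    """Follow handoffs until an agent returns a string reply."""
    trail = [agent.name]
    for _ in range(max_handoffs + 1):
        result = agent.handle(message, context)
        if isinstance(result, str):
            return result, trail
        agent = result  # hand off: the returned agent takes over
        trail.append(agent.name)
    raise RuntimeError("too many handoffs")


reply, trail = run(triage, "I want a refund", {"user": "ada"})
print(trail, reply)
```

Capping the number of handoffs keeps two agents from passing a request back and forth indefinitely.
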
The protocol layer that enables agents to discover tools, communicate with each other, and interoperate across ecosystems.
How to choose: Connecting agents to tools? MCP. Agent-to-agent communication? A2A. Both? Use MCP for tools and A2A for coordination.
| Protocol | Purpose | Creator | Status |
|---|---|---|---|
| MCP | Agent-to-tool | Anthropic | Standard |
| A2A | Agent-to-agent | Google | Growing |
| ACP | Agent communication | IBM/BeeAI | Early |
- Arcade AI - Tool-use platform with authentication, authorization, and logging for agent-tool interactions.
- Composio - Integration platform with 250+ pre-built tool connectors for AI agents and LLMs.
- Docker MCP - Docker's MCP gateway CLI plugin for running MCP servers in isolated containers.
- MCP Registry - Official Model Context Protocol specification and server implementations for standardized tool access.
- Toolhouse - Cloud-hosted tool infrastructure for agents with optimized execution and low-latency access.
- Zapier MCP Server - Connect agents to 7,000+ app integrations via MCP, powered by Zapier's automation platform.
- A2A Protocol - Google's open protocol enabling AI agents to communicate, collaborate, and delegate tasks across frameworks.
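
To make the agent-to-tool side concrete: MCP messages are JSON-RPC 2.0, and a client invokes a tool with the `tools/call` method. The sketch below only builds a request and parses a result, with no transport or session handshake. The helper names, the `get_weather` tool, and the sample response are made up for illustration. Consult the MCP specification for the full message schema.

```python
import json
from itertools import count

_ids = count(1)  # simple source of unique request ids


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


def text_from_result(raw_response: str) -> str:
    """Join the text items of a `tools/call` result; raise on errors."""
    response = json.loads(raw_response)
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("tool reported an error")
    return "\n".join(
        item["text"] for item in result["content"] if item.get("type") == "text"
    )


# `get_weather` is a made-up tool; a real server advertises its tools via `tools/list`.
request = tool_call_request("get_weather", {"city": "Lisbon"})
print(request)

# Hand-written response for illustration, not output from a real server.
sample = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "18 C, clear"}], "isError": False},
})
print(text_from_result(sample))
```
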
Agents that navigate the web, interact with UIs, and automate computer tasks.
How to choose: Need Playwright-based reliability? Stagehand. Vision-based generalization? Skyvern. Full browser control? Browser Use. Web scraping? AgentQL.
| Agent | Approach | Base | Stars |
|---|---|---|---|
| Browser Use | LLM navigation | Playwright | ~30k |
| Stagehand | NL selectors | Playwright | ~11k |
| Skyvern | Computer vision | Custom | ~11k |
| LaVague | Action model | Selenium | ~6k |
| AgentQL | Semantic query | Custom | ~2k |
- AgentQL - AI-powered web scraping and automation with a semantic query language for page elements.
- Browser Use - Open-source framework to let LLMs navigate and interact with any website programmatically.
- LaVague - Large Action Model framework to turn natural language instructions into browser automation.
- Skyvern - Automate browser-based workflows with computer vision and LLMs, no brittle selectors needed.
- Stagehand - AI web browsing framework built on Playwright with natural-language selectors and actions.
Sandboxes, web scrapers, browser automation, and networking layers that agents depend on.
How to choose: Need code sandboxing? E2B. Web scraping for LLMs? Firecrawl. Full agent deployment? AgentDock.
- AgentDock - Framework for building and deploying production-ready AI agents with composable node architecture.
- E2B - Cloud sandboxes for AI agents to run code securely in isolated environments.
- Engram - Universal bridge for multi-protocol AI agent systems with automated semantic mapping.
- Firecrawl - Web scraping API built for LLMs that converts websites to clean, structured markdown.
- Notte - Browser automation engine optimized for production AI pipelines.
- Pilot Protocol - Networking stack for distributed agent systems with encrypted tunnels.
Visual and browser-based tools for building agents without writing code.
How to choose: Need full RAG orchestration? Dify. Drag-and-drop pipelines? Langflow. Zero-setup browser agent? AgentGPT.
- AgentGPT - Deploy AI agents in the browser with zero local setup required.
- Dify - Open-source LLM app development platform with visual workflow builder and RAG orchestration.
- Langflow - Visual drag-and-drop builder for LLM workflows, RAG agents, and multi-step pipelines.
Frameworks for building agents that can hear, speak, see, and interact across modalities.
How to choose: Need real-time voice? LiveKit Agents, Vapi. Streaming pipelines? Pipecat. Multimodal RAG? Agentset.
- Agentset - Production RAG platform with reasoning, hybrid search, and full multimodal support.
- LiveKit Agents - Framework for building real-time, multimodal AI agents with voice, video, and data channels.
- Pipecat - Open-source framework for voice and multimodal conversational AI with streaming pipelines.
- Vapi - Platform for building voice AI agents with low-latency speech-to-speech capabilities.
Tools for governing, monitoring, and securing autonomous AI agents in production.
How to choose: Need full observability? Langfuse, Arize Phoenix. Runtime guardrails? AgentGuard, AgentDoG. Python-native monitoring? Logfire.
- Agent OS - Kernel architecture for governing autonomous AI agents with policy enforcement.
- AgentDoG - Diagnostic guardrails that analyze full agent execution trajectories to detect instruction hijacking and tool misuse.
- AgentGuard - Runtime observability and guardrails for AI agents with loop detection and anomaly alerts.
- APort Agent Guardrails - Pre-action authorization plugin for agent frameworks with policy-based access control.
- Arize Phoenix - Open-source observability platform built on OpenTelemetry for tracing, evaluating, and debugging AI agents.
- DriftGuard - Semantic memory guardrails using causal graphs to prevent agents from repeating past failures.
- Laminar - Open-source observability and analytics platform purpose-built for the full lifecycle of AI agents.
- Langfuse - Open-source LLM observability platform for tracing, prompt versioning, and LLM-as-a-judge evaluations.
- Logfire - Python-native observability from the Pydantic team with deep integration for high-performance agent monitoring.
- Orchard Kit - Modules for agent runtime security, self-audit trails, and collective cognition patterns.
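
Loop detection, one of the runtime guardrails mentioned above, can be approximated by watching for the same tool call repeating within a short window. This is a minimal sketch of that idea and not how any listed tool implements it. `LoopDetector` and its thresholds are hypothetical.

```python
import json
from collections import deque


class LoopDetector:
    """Flag an agent that repeats the same tool call too often in a recent window."""

    def __init__(self, window: int = 6, max_repeats: int = 3):
        self.recent = deque(maxlen=window)  # fingerprints of the latest calls
        self.max_repeats = max_repeats

    def record(self, tool: str, arguments: dict) -> bool:
        """Record one call; return True when it looks like a loop."""
        # Sort keys so argument order does not change the fingerprint.
        key = (tool, json.dumps(arguments, sort_keys=True))
        self.recent.append(key)
        return self.recent.count(key) >= self.max_repeats


detector = LoopDetector()
calls = [("search", {"q": "pricing"})] * 3
print([detector.record(tool, args) for tool, args in calls])  # [False, False, True]
```

A production guardrail would likely also compare near-identical arguments and alert or halt the run instead of only returning a flag.
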
Platforms and tools for deploying, scaling, and hosting AI agents in production.
How to choose: Need serverless GPU? Modal. AWS-native? Bedrock AgentCore. Git-push deploy? Railway, Northflank. Background jobs? Trigger.dev.
- AWS Bedrock AgentCore - Managed AWS infrastructure for Bedrock-based agents with compliance, scaling, and monitoring built in.
- Modal - Serverless GPU compute purpose-built for AI workloads with fast cold starts and Python-native deployment.
- Northflank - Full-stack platform with GPU orchestration, Git-based CI/CD, and bring-your-own-cloud support.
- Railway - One-click deploy from GitHub with persistent volumes and databases for stateful agent deployments.
- Trigger.dev - Background job platform with cron, webhook, and event triggers purpose-built for long-running agent tasks.
If you are building agents, you need to measure them. These tools help you evaluate, score, and compare agent performance.
How to choose: Evaluating coding agents? SWE-bench. General reasoning? GAIA. Multi-environment testing? AgentBench. Custom evals? Inspect AI.
- AgentBench - Comprehensive benchmark for evaluating LLMs as agents across 8 distinct environments.
- GAIA Benchmark - Benchmark for General AI Assistants measuring real-world reasoning and tool use.
- Inspect AI - Framework for evaluating large language models with composable tasks and scoring.
- SWE-bench - Benchmark for evaluating LLMs on real-world software engineering tasks from GitHub issues.
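
At their core, these benchmarks run an agent over a fixed set of tasks and report the fraction it solves. The sketch below shows that loop in miniature. The `evaluate` function, the two toy cases, and the stand-in agents are all hypothetical and unrelated to any listed benchmark's harness.

```python
def evaluate(agent, cases):
    """Return (pass_rate, failed_prompts) for an agent over (prompt, check) pairs."""
    failures = []
    total = 0
    for prompt, check in cases:
        total += 1
        try:
            ok = check(agent(prompt))
        except Exception:
            ok = False  # a crash counts as a failed task
        if not ok:
            failures.append(prompt)
    rate = (total - len(failures)) / total if total else 0.0
    return rate, failures


# Each case pairs a prompt with a function that judges the agent's output.
CASES = [
    ("2+2", lambda out: out.strip() == "4"),
    ("capital of France", lambda out: "paris" in out.lower()),
]


def agent_a(prompt: str) -> str:
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "")


def agent_b(prompt: str) -> str:
    return {"2+2": "4"}[prompt]  # raises KeyError on unknown prompts


for name, agent in [("agent_a", agent_a), ("agent_b", agent_b)]:
    print(name, evaluate(agent, CASES))
```
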
Courses, papers, and guides for understanding AI agents.
- AI Agents in LangGraph - Short course from Andrew Ng's DeepLearning.AI on building production agents with LangGraph.
- AgentBench: Evaluating LLMs as Agents - The benchmark paper for evaluating LLMs as agents across diverse environments.
- Building Effective Agents - Anthropic's guide on agent design patterns, evaluation strategies, and production best practices.
- LLM Powered Autonomous Agents - Deep breakdown of LLM-powered agent components: planning, memory, and tool use.
- Prompt Engineering Guide - Community-maintained guide covering prompt engineering techniques and agent strategies.
- ReAct: Synergizing Reasoning and Acting in Language Models - The foundational paper behind the ReAct prompting pattern used in most agent frameworks.
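
The ReAct pattern interleaves model reasoning with tool calls: the model emits a thought and an action, the runtime executes the action and appends an observation, and the loop repeats until the model gives a final answer. The sketch below shows the control flow with a scripted stand-in for the model. The text format, `scripted_model`, and the `add` tool are assumptions made for illustration and do not reproduce the paper's prompts.

```python
import re

# Hypothetical tool registry with a single integer-addition tool.
TOOLS = {"add": lambda a, b: a + b}


def scripted_model(transcript: str) -> str:
    """Stand-in for an LLM: acts once, then answers from the observation."""
    match = re.search(r"Observation: (.+)", transcript)
    if match:
        return f"Thought: I have the result.\nFinal Answer: {match.group(1)}"
    return "Thought: I should add the numbers.\nAction: add[2, 3]"


def react(question, model, tools, max_steps=5):
    """Run the thought/action/observation loop until a final answer appears."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += "\n" + step
        final = re.search(r"Final Answer: (.+)", step)
        if final:
            return final.group(1), transcript
        action = re.search(r"Action: (\w+)\[(.*)\]", step)
        if action is None:
            raise ValueError("model produced neither an action nor an answer")
        name, raw_args = action.groups()
        args = [int(a) for a in raw_args.split(",")]  # this toy tool takes integers
        observation = tools[name](*args)
        transcript += f"\nObservation: {observation}"
    raise RuntimeError("no answer within the step budget")


answer, transcript = react("What is 2 + 3?", scripted_model, TOOLS)
print(transcript)
```

Swapping `scripted_model` for a real model call leaves the loop unchanged, which is why the step budget and the parse-failure check matter in practice.
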
A high-level architecture view of how modern AI agent systems are structured, from foundation models through orchestration layers, memory, and tools to deployment.
See CHANGELOG.md for the full update history.
Your contributions are what keep this list useful. Read Contributing.md for the entry format, inclusion criteria, and style guide.
