Releases: phuetz/code-buddy

v1.1.0 — Goal Ralph loop (Hermes Agent parity)

11 Jun 08:33

Features

Goal Ralph loop — full Hermes Agent parity on four surfaces (feat(goals), 6ac4d15):

  • /goal <text> + /subgoal (interactive): after each turn, an auxiliary judge model returns {"done": bool, "reason": str}. On a continue verdict, a plain user-role continuation prompt is auto-submitted until the goal is done, paused, or the turn budget (default 20) is spent. Real user messages preempt the loop; Esc auto-pauses. Per-session persistence under ~/.codebuddy/goals/ survives --resume/--continue.
  • buddy goal "<text>" (headless): drives the full agent (tools included) in-process until done (exit 0) or paused (exit 1). Options: --max-turns, --judge-model, -m.
  • Colab board goal-mode (buddy fleet tasks add --goal-mode [--goal-max-turns N]): the autonomous worker's successful attempt must pass the judge (acceptanceCriteria become strict numbered criteria); "continue" re-opens the task with a continuation nudge (persisted budget, default 5); an exhausted budget blocks the task for human review instead of spinning.
  • peer.chat-session.goal (fleet gateway parity): set/status/pause/resume/clear/subgoal-*; server-side judge after continue/continue-stream with caller-driven continuation; GOAL_ACTIVE rejection of a new goal mid-run; metadata-only fleet:chat-session:goal events.

The judge is fail-open: transport errors count as continue, and 3 consecutive parse failures auto-pause the goal with a config hint. It records its token usage in the session cost ledger. Config: goals.{maxTurns,judgeModel,judgeMaxTokens,judgeTimeoutMs}, plus the CODEBUDDY_GOAL_MAX_TURNS and CODEBUDDY_GOAL_JUDGE_MODEL environment variables; the judge model setting can route the judge to a free local Ollama model.
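
The loop and its fail-open rules can be pictured roughly as follows. This is a minimal sketch, not the project's implementation: runTurn, askJudge, parseVerdict, runGoalLoop and the continuation wording are hypothetical names chosen for illustration. Only the {"done", "reason"} verdict shape, the default budget of 20, and the two failure rules come from the notes above. Whether a transport error resets the parse-failure counter is not stated, so the sketch leaves the counter untouched in that case.

```typescript
// Hypothetical sketch of the goal loop; names are illustrative only.
type Verdict = { done: boolean; reason: string };
type LoopResult = { status: "done" | "paused"; turns: number; reason: string };

const MAX_PARSE_FAILURES = 3; // "3 consecutive parse failures auto-pause"

// Returns null when the judge reply is not a well-formed {done, reason} object.
function parseVerdict(raw: string): Verdict | null {
  try {
    const v = JSON.parse(raw);
    if (v && typeof v.done === "boolean" && typeof v.reason === "string") {
      return { done: v.done, reason: v.reason };
    }
  } catch {
    // Not valid JSON: treated as a parse failure below.
  }
  return null;
}

async function runGoalLoop(
  runTurn: (prompt: string) => Promise<string>,
  askJudge: (turnOutput: string) => Promise<string>,
  goal: string,
  maxTurns = 20,
): Promise<LoopResult> {
  let prompt = goal;
  let parseFailures = 0;
  for (let turn = 1; turn <= maxTurns; turn++) {
    const output = await runTurn(prompt);
    let raw: string | null = null;
    try {
      raw = await askJudge(output);
    } catch {
      // Transport error: fail open, i.e. behave as if the judge said "continue".
    }
    if (raw !== null) {
      const verdict = parseVerdict(raw);
      if (verdict === null) {
        parseFailures++;
        if (parseFailures >= MAX_PARSE_FAILURES) {
          return { status: "paused", turns: turn, reason: "judge output unparseable" };
        }
      } else {
        parseFailures = 0;
        if (verdict.done) return { status: "done", turns: turn, reason: verdict.reason };
      }
    }
    // Plain user-role continuation prompt (wording is an assumption).
    prompt = `Continue working toward the goal: ${goal}`;
  }
  return { status: "paused", turns: maxTurns, reason: "turn budget spent" };
}
```

In this sketch a judge that is unreachable never stops the run early; only the turn budget does, which matches the fail-open behaviour described above.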

Also in this release (WS3, "Memory & run continuity"): session-end flush (HANDOFF.md + review-gated lesson candidates), periodic context snapshots for very long sessions, and the session-duration pause suggestion middleware.

Fixes

  • fix(repo): dropped two phantom Windows gitlink entries that broke actions/checkout (d963a64)
  • fix(mcp): capture stdio MCP server stderr instead of inheriting it (9a57b3d)

Installation

npm install -g @phuetz/code-buddy@1.1.0

Full details: CHANGELOG.md · Hermes/OpenClaw parity

Code Buddy 1.0.0 — General Availability

10 Jun 10:22

The first GA release. Code Buddy is a multi-provider AI coding agent for the terminal, with a desktop cockpit and a multi-AI fleet mesh — built to run as well on a free local Ollama model as on frontier APIs.

Headline features

🛰️ Multi-AI Fleet Hub

A stateful WebSocket mesh where Code Buddy peers observe each other live and delegate work:

  • peer.chat — one-shot LLM calls to any peer
  • peer.chat-session.* — multi-turn sessions, FIFO-serialized, persisted across restarts
  • peer.tool.invoke — remote read-only tool execution behind three fail-closed security gates (allowlist → fleetSafe registry flag → mandatory workspace root)
  • route_peer — task classification + privacy/cost/latency-aware routing, with PII lint before anything leaves the machine
  • Live fleet load: heartbeat-carried utilization + daemon saturation backpressure
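
To illustrate what "three fail-closed security gates" could look like for peer.tool.invoke, here is a minimal sketch. The types, the checkGates function and the path-containment check are assumptions made for illustration; only the gate order (allowlist, then fleetSafe registry flag, then mandatory workspace root) is taken from the list above.

```typescript
// Hypothetical sketch; shapes and names are not the project's API.
import * as path from "node:path";

interface ToolEntry { name: string; fleetSafe: boolean }
interface InvokeRequest { tool: string; targetPath: string }
interface GateConfig {
  allowlist: ReadonlySet<string>;
  registry: ReadonlyMap<string, ToolEntry>;
  workspaceRoot?: string;
}
type GateResult = { ok: true } | { ok: false; gate: string };

function checkGates(req: InvokeRequest, cfg: GateConfig): GateResult {
  // Gate 1: the tool must be explicitly allowlisted.
  if (!cfg.allowlist.has(req.tool)) return { ok: false, gate: "allowlist" };

  // Gate 2: the registry entry must exist and carry the fleetSafe flag.
  const entry = cfg.registry.get(req.tool);
  if (!entry || entry.fleetSafe !== true) return { ok: false, gate: "fleetSafe" };

  // Gate 3: a workspace root is mandatory; a missing root rejects the call.
  if (!cfg.workspaceRoot) return { ok: false, gate: "workspaceRoot" };

  // Assumption: the target must also resolve inside the workspace root.
  const root = path.resolve(cfg.workspaceRoot);
  const rel = path.relative(root, path.resolve(root, req.targetPath));
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    return { ok: false, gate: "workspaceRoot" };
  }
  return { ok: true };
}
```

Every gate defaults to rejection, so a missing allowlist entry, a missing registry flag, or an unset workspace root each blocks the call rather than letting it through.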

🖥️ Cowork — desktop cockpit (Electron)

Pilot everything from a GUI: autonomy daemon, task board, saga cancel/replay, fleet spend dashboard, route preview, peer chat sessions, service logs, Slack/Telegram/Feishu remote channels, MCP OAuth, bundled Office/PDF skills.

🤖 The agent itself

  • 15 providers via OpenAI-compatible routing (Grok, Claude, GPT, Gemini, Ollama, LM Studio, Bedrock, Azure, Groq, Together, Fireworks, OpenRouter, vLLM, Copilot, Mistral) + Gemini native
  • Autonomous coding cell: edit-proposal producer → review gate → verification loop (red → green, bounded) → checkpoint/resume; guardrails are enforced in code, not merely documented
  • ToT + MCTS reasoning, extended thinking, RAG tool selection over ~110 tools
  • Context engine: compression, transcript repair, JIT context, persistent memory with auto-writeback
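
The "verification loop (red → green, bounded)" idea can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: verifyUntilGreen, runChecks, proposeFix and the default bound of 3 attempts are all hypothetical.

```typescript
// Hypothetical sketch of a bounded red-to-green verification loop.
type CheckResult = { green: boolean; log: string };

function verifyUntilGreen(
  runChecks: () => CheckResult,
  proposeFix: (failureLog: string) => void,
  maxAttempts = 3, // assumed bound; the real value is not stated in the notes
): { green: boolean; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = runChecks();
    if (result.green) return { green: true, attempts: attempt };
    // Only ask for a fix if another verification pass is still allowed.
    if (attempt < maxAttempts) proposeFix(result.log);
  }
  // Still red after the last allowed attempt: stop instead of spinning.
  return { green: false, attempts: maxAttempts };
}
```

The bound is what keeps the loop from retrying forever on a change that cannot be made to pass.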

Hardening that landed for GA

  • DB migrations proven end-to-end: old installs (v1/v2 schema or legacy JSON) upgrade cleanly — covered by a real-SQLite e2e suite
  • Feishu webhook security fix: signature verification is now spec-conformant (SHA-256 of timestamp+nonce+encrypt_key+body), with a decrypt-first flow, timing-safe token checks, and fail-closed behavior when secrets are missing
  • Production deployment guide: systemd, Docker, Kubernetes, nginx/WS, monitoring, upgrade procedure → docs/deployment.md
  • Gateway device pairing (OpenClaw-style approval flow), protocol negotiation, origin hardening
  • ~30,000 Vitest tests
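
The Feishu signature check described above can be sketched with Node's built-in crypto module. The function names and parameter layout are assumptions for illustration, as is the lowercase hex encoding of the digest; the concatenation order (timestamp, nonce, encrypt key, body) and the fail-closed and timing-safe requirements come from the bullet itself.

```typescript
// Hypothetical sketch; not the project's actual handler.
import { createHash, timingSafeEqual } from "node:crypto";

// SHA-256 over timestamp + nonce + encrypt_key + body, as lowercase hex.
function computeSignature(timestamp: string, nonce: string, encryptKey: string, body: string): string {
  return createHash("sha256").update(timestamp + nonce + encryptKey + body, "utf8").digest("hex");
}

function verifySignature(
  received: string | undefined,
  timestamp: string,
  nonce: string,
  encryptKey: string | undefined,
  body: string,
): boolean {
  // Fail closed: no secret or no signature means the request is rejected.
  if (!encryptKey || !received) return false;
  const expected = Buffer.from(computeSignature(timestamp, nonce, encryptKey, body), "utf8");
  const actual = Buffer.from(received, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== actual.length) return false;
  return timingSafeEqual(expected, actual);
}
```

The signature must be computed over the raw request body exactly as received; re-serializing parsed JSON would generally change the bytes and break the comparison.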

Getting started

npm install -g @phuetz/code-buddy
buddy onboard        # setup wizard
buddy                # chat
buddy server         # HTTP + fleet gateway

Docs: getting-started · fleet guide · deployment · CHANGELOG

Release 0.4.0

22 Feb 02:14

Changes in this release

  • feat: v2.9.0 — i18n de/es/ja/zh, PersonaManager hot-reload + /persona, ComputerSkills LLM step (d48532b)
  • fix(ci): replace deprecated create-release/upload-release-asset actions (8ce02ad)

Installation

npm install -g @phuetz/code-buddy@0.4.0

or

npx @phuetz/code-buddy@0.4.0

Release 0.3.0

22 Feb 00:32

Changes in this release

  • fix(test): increase timeout for security deepScan test in CI (6279212)
  • fix(build): escape backticks in system-base.ts template literal and fix CacheBreakpointMessage type (2add8d4)
  • ci(release): also exclude extra-handlers and parsers.heavy from release gate (f52746a)
  • ci(release): exclude known pre-existing test failures from release gate (62c6ef0)
  • fix(ci): resolve test failures from tool registry TS errors and test logic bugs (284be76)
  • chore: bump version to 0.3.0 (208b8a2)
  • fix: continue provider chain when a provider returns 0 results (0e601b6)
  • fix: detect DuckDuckGo CAPTCHA and surface API key hint (8c97098)
  • feat: v2.8.0 — LessonsTracker analytics, slash command, RunStore lesson_added (f86f3e0)
  • test: add coverage for v2.6.0 self-improvement loop (92 tests, 0 failures) (37bf4d1)
  • chore: bump version to 0.2.0 (14f463a)
  • feat: OpenClaw parity — slash commands, MMR memory, cron backoff, sub-agent depth (v2.7.0) (3923d9a)
  • chore: bump version to 0.1.26 (edb8e9b)
  • feat: workflow orchestration integration — self-improvement loop (v2.6.0) (aca13d9)
  • feat: OpenClaw/Manus AI/Codex full parity — Phase 14 (v2.5.0) (fc90c73)
  • docs: restructure README by functional areas with research references (be631e8)
  • docs: add Telegram capabilities section to README (018d1f8)
  • docs: add Telegram setup guide and channel documentation to README (9353113)
  • docs: add comprehensive YOLO mode / autonomy section to README and CLAUDE.md (ddf4b35)
  • docs: fix RTK descriptions to reflect command proxy pattern (a7aaf20)
  • chore: bump version to 0.1.25 (4a7fabc)
  • fix: RTK integration uses command proxy pattern instead of stdin compression (90d84e6)
  • chore: bump version to 0.1.24 (5e4625c)
  • feat: integrate RTK output compressor and ICM persistent memory bridge (08ad0fc)
  • chore: bump version to 0.1.23 (d8b4dc4)
  • feat: implement 6 high-priority gap fixes with full test coverage (81c2fbe)
  • chore: bump version to 0.1.22 and update README with Phase 8 features (2a3bb5e)
  • feat: add skills auto-discovery, device node connectors, and canvas bidirectional events (02c8702)
  • chore: bump version to 0.1.21 (836e76e)
  • docs: add DeepWiki documentation badge to README (376dc92)
  • feat(mcp): add agent intelligence layer with 8 tools, 4 resources, 5 prompts (95d670e)
  • docs: update README test stats to 23,500+ tests / 549+ suites (02fbcce)
  • chore: bump version to 0.1.20 (4487147)
  • fix: complete rebrand from Grok CLI to Code Buddy across codebase (0e760d2)
  • fix: remove unused @ts-expect-error in WhatsApp adapter (dbbe7dc)
  • chore: bump version to 0.1.19 (24e6de9)
  • chore(deps): clean deprecated dependencies (46cf3c0)
  • chore: bump version to 0.1.18 (14032f6)
  • fix: rebrand Grok references to Code Buddy in /init and context system (3947ddc)
  • fix(tests): fix flaky bash-parser and pdf-agent tests in full suite runs (a4b2a3b)
  • feat: Claude Code full parity — 32 features, 9,542 lines, 483 tests (a3cdfc5)
  • chore: bump version to 0.1.17 (6a96f72)
  • feat: Claude Code + OpenClaw parity — 71 features, 12,490 lines, 545 tests (f1584c0)
  • chore: bump version to 0.1.16 (7948ea4)
  • feat: Claude Code parity — 11 high-priority features implemented (f37c802)
  • chore: bump version to 0.1.15 (cbfa55d)
  • feat: Codex CLI parity — 11 missing features implemented (54f9097)
  • fix(tests): fix 18 failing test suites — all 22,197 tests now pass (2b069e5)
  • feat: Phase 6-7 — OpenClaw parity, code generation security, full README update (53596cf)
  • chore: bump version to 0.1.13 (42fc45d)
  • fix(security): Gemini API key in URL + WS tool handler init error (4f1f035)
  • fix: 4 bugs — git push recursion, notebook null crash, cellIndex validation, chunker overlap (b59b257)
  • fix: stream timeout leak + snapshot memory cap (802972e)
  • fix: 3 bugs — health monitor double-count, cron step=0 loop, logger circular ref crash (ba4a27a)
  • fix: 6 bugs — Gemini key in URL, Slack ping leak, TTS queue, AudioReader timeout, spawn leak (6619d1f)
  • fix(security): 10 bugs — DM block key mismatch, XSS, path traversal, DoS, pagination (a0ce8be)
  • fix: path traversal in apply-patch, import limits, metrics cap, response drain (e03cfd6)
  • fix: token count accumulation, timer leak, output caps, env leak, API timeout (c2551b3)
  • fix: unbounded message queue, ffmpeg timeout, similarity scores cap, tmpdir leak (8c434bd)
  • chore: bump version to 0.1.12 (6e20530)
  • fix: apply-patch trailing newline loss, history-repair synthetic ID collisions (27305bb)
  • fix: HTTP stream error handler, unbounded event/metrics queues, stdout cap (718da57)
  • fix: Gemini JSON.parse crash, routeHistory unbounded, browser JS injection (befdc1b)
  • fix: timer leaks, async cleanup, polling race, unbounded audio buffer (cdf5ae1)
  • fix: Map mutation during iteration, osascript injection, xrandr injection, splice correctness (b6b5953)
  • fix(security): command injection, JWT confusion, CORS, proto pollution, metrics injection (7e9a058)
  • fix: 5 bugs — SQL injection whitelist, embedding buffer copy, checkpoint bounds, edge-tts timeout, pruning ratio clamp (ff9e8f9)
  • fix: 3 bugs — orchestrator timer leak, WS heartbeat Map mutation, reasoning dead code (f29cc20)
  • fix: 6 bugs — parallel timeout leak, cache eviction, MCP timeout cleanup, JSON-RPC server lifecycle (35caf5a)
  • chore: bump version to 0.1.11 (1c6516f)
  • fix: 3 bugs — retry abort listener leak, rate-limit backoff overflow, git log count validation (542db39)
  • fix: 5 bugs — notebook JSON parse, Discord heartbeat leak, DM pairing cross-channel block, webhook template safety, routing format (7d531b8)
  • fix: 5 bugs — queue memory leak, vector-store save race, branch error logging, redaction dedup, chunker off-by-one (e6bdab6)
  • fix(security+providers): 8 bugs — credential decrypt validation, CSRF timing-safe, safe-eval proto pollution, provider null guards, plugin worker timer leak (4e071c4)
  • fix: web-search cache cleanup, Slack cache bounds, peer-routing history trim (12ecbcb)
  • fix: auth scope null guard, rate-limit safe iteration, router error handling, cron persistence (fa1a832)
  • chore: bump version to 0.1.10 (88702d6)
  • fix: voice process timeouts, cost validation, TTS pre-synth error handling (ebb5cc0)
  • fix: health monitor warning count, shared-context lock TTL, canvas broadcast errors (6528efe)
  • chore: bump version to 0.1.9 (91a714a)
  • fix: session tool call ID collision on restore (aaf7c3f)
  • fix: hotkey output, screenshot NaN, generator null guard, client null check (6f879f1)
  • fix: glob unclosed brackets, truncation overshoot, history null merge, scanner regex (8d3e0c9)
  • chore: bump version to 0.1.8 (392eb3c)
  • fix: apply-patch partial write, bash-parser $() regex, stream DOMException, session-lock ESM (8516e3c)
  • fix: headless output format, context injection, ACP routing, skill scoring (f76210d)
  • fix: ModelFailoverChain constructor defaults healthy/consecutiveFailures (7f786a4)
  • chore: bump version to 0.1.7 (e3f81a2)
  • fix: getBundledSkillsPath() __dirname not defined in ESM (f065e8b)
  • fix: server version now reads from package.json instead of hardcoded 1.0.0 (67e2098)
  • fix: wire applyToolFilter into tool selection pipeline (2069dfe)
  • chore: bump version to 0.1.6 (dd23f7d)
  • feat(desktop-automation): snapshot+screenshot combo, annotated screenshots, LLM normalization (2b8558a)
  • fix: CLI routing, server circular dep, daemon ESM, and dynamic version (4033f3b)
  • chore: bump version to 0.1.5 (6b863d4)
  • fix: add missing npm dependencies and correct README CLI options (8a06df9)
  • feat(desktop-automation): platform-native providers (OpenClaw-inspired) (72d2ede)
  • feat(screenshot): WSL2 PowerShell interop for real desktop capture + fix web-search test (2759180)
  • chore: bump version to 0.1.4 (81b0517)
  • fix: web search failed-query cache to prevent repeated timeouts (3ad045c)
  • fix: Gemini MALFORMED_FUNCTION_CALL recovery, search timeout, tool suggestions (ff91078)
  • docs: update README with headless mode, test count, and v0.1.3 fixes (ebea44c)
  • chore: bump version to 0.1.3 (a3e0aad)
  • fix: robust Gemini message sanitization after context compression (ae674b3)
  • chore: bump version to 0.1.2 (d25adf9)
  • fix: Gemini message sanitization and remaining stdout pollution (ac3b49b)
  • fix: stdout pollution, headless hang, and Gemini tool role mapping (bb8dd20)
  • fix: Week 4 code quality audit — type safety, error handling, test reliability (609cfdb)
  • refactor: Week 3 maintainability audit — explicit exports, file splits, circular dep detection (2050ee9)
  • refactor: split index.ts into command modules, fix type safety and resource cleanup (407fc09)
  • fix: memory leaks, O(n) hot path, flaky test, and WebSocket rate limiting (dfde137)
  • chore: bump version to 0.1.1 (3f5a075)
  • chore: remove stale docker-sandbox test file (589a474)
  • refactor: Gemini audit code quality improvements (216c376)
  • fix(security): comprehensive audit fixes across 12 critical areas (4bfc1b2)
  • fix(mcp): correct Gemini audit implementation bugs (39051c0)
  • fix(scripting): suppress unused expression lint error in parser (403c0ad)
  • refactor(scripting): complete FCS/Buddy Script unification (d4c7a08)
  • fix: update tests for view_file 500-line limit and error message changes (7dca18d)
  • fix: view_file truncation, MCP package names, and disabled server filtering (6ef6514)
  • feat: add MCP predefined servers, web search enhancements, and agent utilities (2a53425)
  • refactor(scripting): unify FCS and Buddy Script into single scripting system (4dd4b6c)
  • fix: repair 45+ failing test suites and prepare v0.1.0 for npm publish (51c3630)
  • feat(voice): add AudioReader/Kokoro-82M TTS provider and Porcupine wake word detection (aec977e)
  • feat(skills): add 15 bundled skills for pro software with MCP integration (2e6c74e)
  • docs: add Bundled Skills section with all 25 skills listed (33565b3)
  • feat(skills): add 11 media & communication bundled skills (2b34ac1)
  • feat: add 12 bundled skills inspired by OpenClaw (bbae2dd)
  • feat: add C# + Avalonia UI cross-pla...