feat(agent): bcli agent mode — interactive chat REPL over BC verbs (Part 4)#26
Open
igor-ctrl wants to merge 6 commits into
…ckends/memory) Incomplete checkpoint from interrupted run: no tests, extras, or CLI wiring yet. Imports clean. Continuation will complete Parts 1-4.
…adless run
Finishes the interrupted Part-1 skeleton (4a71910) into a tested, wired, shippable BYOK agent engine.
- fix(pydantic-ai): import AgentRunResultEvent from pydantic_ai, not pydantic_ai.messages (the WIP import would ImportError at turn time).
- wire 'bcli agent' (run + init) into the Typer app; bare-bcli REPL entry lands in Part 2.
- repl package __init__ (lazy launch_repl) + console setup wizard (_wizard.py): backend detection, [agent] section assembly, keychain key storage, subscription-consent hook. Reachable via 'bcli agent init' and (Part 2) first-run.
- pyproject extras: agent-local (pydantic-ai-slim[anthropic,openai] >=1.107,<2), agent-claude-code, agent-codex, agent meta-extra (+textual>=8.2); added to dev.
- tests/test_agent (51): factory dispatch + Null fallback, ToolRegistry tier/plan-mode + describe round-trip + bcli_mcp parity, write-safety gate matrix (disable_writes/caution=high/production, decline->refusal, auto-approve, fail-closed) + draft_batch, read handlers, PydanticAI backend event-stream shape via TestModel (no network), wizard logic, consent gate, headless 'agent run' + plan-mode resolution.
ruff clean; full suite green (one pre-existing env-pollution failure in test_context unrelated to this change).
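The write-safety gate matrix listed in the tests above can be pictured with a small pure function. This is a minimal sketch, not the code in `tools/_impl.py`: the `WriteContext` and `gate_write` names, the `"allowed"`/`"refused"` return values, and the assumption that auto-approve bypasses the prompt for all three triggers are illustrative only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class WriteContext:
    """Hypothetical snapshot of the settings the gate looks at."""
    disable_writes: bool = False
    caution: str = "normal"     # "high" forces an approval step
    production: bool = False
    auto_approve: bool = False  # assumed to be what auto-approve toggles


def needs_approval(ctx: WriteContext) -> bool:
    # Any one of the three triggers is enough to require approval.
    return ctx.disable_writes or ctx.caution == "high" or ctx.production


def gate_write(ctx: WriteContext, ask: Optional[Callable[[], bool]] = None) -> str:
    """Return "allowed" or "refused" for a pending write."""
    if not needs_approval(ctx):
        return "allowed"
    if ctx.auto_approve:
        return "allowed"
    if ask is None:
        # Fail closed: nobody is available to approve, so refuse.
        return "refused"
    try:
        approved = ask()
    except Exception:
        # Fail closed: a broken approval path never lets a write through.
        return "refused"
    return "allowed" if approved is True else "refused"
```

Keeping the decision in one function like this is what lets every backend share the same behaviour: a declined or unanswerable prompt always ends in a refusal.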
Bare `bcli` on a TTY now opens an interactive agent chat; non-TTY still prints help (regression-tested) so scripted callers are untouched.
- app.py: no_args_is_help=False + invoke_without_command callback; branch on ctx.invoked_subcommand is None: dual-TTY → lazy-import repl, else help. Agent stack never imported for ordinary subcommands.
- repl/_app.py: ChatApp (Textual): scrolling transcript, MarkdownStream streaming, ToolCallPanel cards, StatusBar, approval modal; turns run in an exclusive worker consuming AgentEvents; long-lived AsyncBCClient + AgentRuntime; first-run wizard via run_repl.
- _widgets.py: StatusBar / ToolCallPanel / ApprovalScreen (y/n + buttons, resolves the runtime gate future).
- _commands.py: slash parser (/model /profile /company /plan /yes /context /clear /help /exit + aliases), pure + testable.
- _plan_mode.py: drafted batch YAML → temp file → gated 'bcli batch run' (dry-run then real), same path bcli extract uses.
- tests/test_repl (22): bare-entry non-TTY/TTY regression, slash parsing, plan-mode argv + round-trip, wizard config-write, Textual run_test() pilots feeding canned AgentEvent streams (text/tool/approval).
- fix(test): pin the text-only pydantic-ai test to call_tools=[] so it no longer shells out to a real 'bcli batch' subprocess (that side effect was polluting test_context's last-error read).
ruff clean; full suite green (1020 passed, 5 skipped).
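The bare-entry branch described for app.py reduces to a three-way decision. A minimal sketch of that decision, kept free of Typer so it runs standalone; `choose_entry` and its string results are hypothetical names, and the real callback presumably reads `ctx.invoked_subcommand` and the stream `isatty()` values itself.

```python
import sys
from typing import Optional


def choose_entry(invoked_subcommand: Optional[str],
                 stdin_is_tty: bool,
                 stdout_is_tty: bool) -> str:
    """Decide what a bcli invocation should do.

    Returns "subcommand", "repl", or "help" (labels invented for this sketch).
    """
    if invoked_subcommand is not None:
        # Ordinary subcommands run unchanged and never touch the agent stack.
        return "subcommand"
    if stdin_is_tty and stdout_is_tty:
        # Dual-TTY: only here would the REPL package be imported lazily.
        return "repl"
    # Piped or redirected: keep printing help so scripted callers are unaffected.
    return "help"


def entry_for_current_process(invoked_subcommand: Optional[str] = None) -> str:
    # Thin wrapper that reads the real streams; kept separate so the
    # decision itself stays pure and easy to regression-test.
    return choose_entry(invoked_subcommand, sys.stdin.isatty(), sys.stdout.isatty())
```

Passing the TTY flags in as arguments is one way to regression-test the non-TTY path without a real terminal.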
Drives the user's installed Claude Code as a harness-owned loop, exposing
bcli's verbs through an in-process SDK MCP server built from the SAME
tools/_impl.py handlers the pydantic-ai backend uses — write safety stays
in one place.
- backends/_claude_sdk.py: ClaudeCodeBackend over ClaudeSDKClient.
system_prompt carries bcli's prompt; allowed_tools is restricted to
mcp__bcli__* (built-in coding tools never allowed); can_use_tool is a
coarse fence (allow bcli tools, deny everything else) on top of the
per-handler write gate. Handles the documented Python quirk: streaming
AsyncIterable prompt + dummy PreToolUse hook returning {continue_:True}
so can_use_tool fires. Translates AssistantMessage/TextBlock/
ToolUseBlock/ToolResultBlock/ResultMessage → uniform AgentEvents.
- consent flow (_consent.py, already present): claude-code on a
subscription login requires literal 'yes', persisted via tomlkit; API
keys never prompt — covered by test_consent.
- [agent-claude-code] extra already declared in Part 1.
- tests/test_agent/test_claude_sdk_backend.py (5): fake claude_agent_sdk
injected into sys.modules (package not installed) — factory build,
event translation, can_use_tool fence, dummy hook, bcli-only
allowed_tools.
ruff clean; agent suite 56 passed.
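The coarse fence described above (allow bcli tools, deny everything else) comes down to a prefix check. The sketch below is an illustration under assumptions: `Decision` stands in for whatever result type the SDK's `can_use_tool` callback expects, and the tool names used in the usage note are examples rather than names confirmed by this PR.

```python
from dataclasses import dataclass

# Prefix taken from the allowed_tools restriction described above.
BCLI_TOOL_PREFIX = "mcp__bcli__"


@dataclass(frozen=True)
class Decision:
    """Hypothetical stand-in for the SDK's allow/deny permission result."""
    allow: bool
    reason: str = ""


def coarse_fence(tool_name: str) -> Decision:
    """Allow only tools served by the in-process bcli MCP server."""
    if tool_name.startswith(BCLI_TOOL_PREFIX):
        # The per-handler write gate still runs after this coarse check.
        return Decision(allow=True)
    return Decision(allow=False, reason=f"{tool_name!r} is not a bcli tool")
```

For example, `coarse_fence("mcp__bcli__get")` is allowed while `coarse_fence("Bash")` is denied with a reason string, so a built-in coding tool never reaches execution even if it slips past the allow-list.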
Drives the user's installed Codex as a harness-owned loop. Codex is an MCP client, so it consumes bcli's existing bcli_mcp stdio server (zero new tool code); the write gate runs one layer down in the bcli subprocess (confirm_write_or_exit + disable_writes) reinforced by codex approval_mode.
DEVIATION (documented): the plan assumed an 'import codex' JSON-RPC thread/turn/item surface. The actually-published package is openai-codex (import openai_codex, beta 0.1.0b3): AsyncCodex().thread_start(...) -> thread.turn(input) -> AsyncTurnHandle.stream() yielding notifications + a TurnResult. This backend targets that real API; inspected the live PyPI metadata + GitHub api-reference.md to confirm signatures.
- backends/_codex.py: CodexBackend over AsyncCodex. to_mcp_config() registers bcli_mcp (bcli-mcp script or 'python -m bcli_mcp') with BCLI_PROFILE env. base_instructions carries bcli's system prompt; approval_mode escalates to on_request under production/plan mode (defensive enum probing for the beta). Notifications mapped best-effort to AgentEvents (assistant text -> text_delta, tool/mcp/command items -> tool_call_started); final answer from TurnResult.
- pyproject: [agent-codex] = openai-codex>=0.1.0b3; [tool.uv] prerelease = allow so the universal lock resolves the beta's pinned prerelease runtime (openai-codex-cli-bin); core deps still pin stable.
- tests/test_agent/test_codex_backend.py (5): fake openai_codex in sys.modules: to_mcp_config, factory build, notification mapping + final answer, thread_start config/instructions, production approval escalation.
ruff clean; full suite 1030 passed, 5 skipped.
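The server registration that to_mcp_config() performs can be sketched as follows. This is a guess at the shape, not the actual implementation: the returned dict layout (`command` / `args` / `env`) is an assumption about what an MCP stdio entry looks like here, and the injectable `which` parameter exists only to make the sketch testable.

```python
import shutil
import sys
from typing import Callable, Dict, List, Optional, Union

McpEntry = Dict[str, Union[str, List[str], Dict[str, str]]]


def to_mcp_config(profile: str,
                  which: Callable[[str], Optional[str]] = shutil.which) -> McpEntry:
    """Build a stdio MCP server entry for bcli_mcp (illustrative shape)."""
    script = which("bcli-mcp")
    if script:
        # Prefer the installed console script when it is on PATH.
        command, args = script, []
    else:
        # Fall back to running the module with the current interpreter.
        command, args = sys.executable, ["-m", "bcli_mcp"]
    return {
        "command": command,
        "args": args,
        # The subprocess picks its bcli profile from this variable.
        "env": {"BCLI_PROFILE": profile},
    }
```

Because the write gate lives in the bcli subprocess, the only per-session state this entry needs to carry is the profile name.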
- docs/agent.md: end-to-end guide (quick start, three backends + config, credentials + subscription consent gate, write safety + plan mode, chat commands, BC.md memory, headless run, engine/renderer architecture, live smoke-test checklist).
- docs/command-reference.md: 'agent' section (bare bcli, agent run, agent init).
- README: Agent Mode row in the docs table.
- IMPLEMENTATION-SUMMARY.md: rewritten for the agent-mode plan (was a stale unrelated summary): per-part build log, commit list, test counts (test_agent 61, test_repl 22, full suite 1030 passed), deviations, and manual follow-ups.
Summary
Adds `bcli` agent mode: a Claude Code / Codex-style interactive chat REPL where an LLM drives bcli's own verbs (get/query/post/batch/…) as tools. This is Part 4 of the agent-evolution roadmap; Parts 0–3 (ContextBundle, packs, `bcli ask`, site) shipped to make this a wiring exercise.
Typing bare `bcli` on a TTY launches the chat TUI; subcommands are unchanged and non-TTY invocations still print help (so scripts/agents that pipe bcli are unaffected).
Architecture
- `AgentSessionBackend` protocol emitting an `AgentEvent` stream (`text_delta` / `tool_call_started` / `tool_result` / `awaiting_approval` / `turn_complete` / `error`), consumed by one Textual TUI, so all backends feel identical. Factory mirrors `bcli.ask` (`_BUILTIN_BACKENDS` + `module:Class` escape hatch + `NullAgent` fallback).
- `pydantic-ai`: BYOK in-process loop (`provider:model` strings: Anthropic / OpenAI / local Ollama / OpenAI-compatible `base_url`).
- `claude-agent-sdk`: rides the user's installed Claude Code (in-process `@tool`s, `tools=[]` strips coding built-ins, `can_use_tool` permission callback).
- `codex`: `openai-codex` SDK, reusing the existing `bcli_mcp` server as the tool surface (no new tool code).
- Tool registry derived from `bcli describe --format json` (the same source `bcli_mcp` uses), projected three ways.
- Write-safety gate (`src/bcli/agent/tools/_impl.py`): `disable_writes`, `caution == "high"`, and production targets emit `awaiting_approval`, resolved by the Textual approval dialog or `/yes`. Plan mode drafts a `batch.yaml` for review → confirm → `batch run`.
- Setup wizard (`bcli agent init`), consent gate for subscription-auth backends (never the default; persisted with timestamp), per-profile `BC.md` memory (read-only in v1).
SDK/CLI split preserved: `bcli.agent` never imports `bcli_cli`; the agent/Textual/pydantic-ai stack is lazy-loaded so ordinary subcommands import none of it.
New optional extras
`[agent-local]` (pydantic-ai), `[agent-claude-code]`, `[agent-codex]`, and `[agent]` meta-extra (adds `textual`). Backends behind extras fall back to `NullAgent` + a one-shot warning when the SDK isn't installed.
Test plan
- `uv run pytest tests/`: 1030 passed, 5 skipped (test_agent 61, test_repl 22; no network: pydantic-ai via `TestModel`, claude-agent-sdk/openai-codex faked in `sys.modules`).
- `uv run ruff check src/`: clean.
- `bcli` piped (non-TTY) prints help, does not launch the REPL (regression-tested).
- Ordinary subcommands import no `textual` / `pydantic_ai` / `bcli.agent` / `repl` modules.
- Manual follow-up: validate the `openai-codex` `_notification_to_event` mapping against a real stream (SDK is beta `0.1.0b3`; its API differs from the original plan and the mapping is defensive). See `IMPLEMENTATION-SUMMARY.md` and `docs/agent.md`.
Notable deviation
The Codex SDK is `openai-codex` (import `openai_codex`, beta `0.1.0b3`) with a high-level `AsyncCodex().thread_start → thread.turn → stream()` API, not the `import codex` JSON-RPC surface the plan assumed. Implementation targets the real API; `[tool.uv] prerelease = "allow"` lets the lock resolve the pinned prerelease while core deps stay stable.
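For reviewers, the uniform event stream described under Architecture can be pictured as an async iterator of small tagged records that a single renderer consumes. This is a rough sketch only: the `AgentEvent` fields, the `payload` dict, `fake_backend_turn`, and `render_turn` are assumptions made for illustration; only the six event kind names come from this PR.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator, Dict, List, Tuple

EVENT_KINDS = (
    "text_delta", "tool_call_started", "tool_result",
    "awaiting_approval", "turn_complete", "error",
)


@dataclass(frozen=True)
class AgentEvent:
    """Illustrative event record; the real class may carry different fields."""
    kind: str
    payload: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {self.kind}")


async def fake_backend_turn(prompt: str) -> AsyncIterator[AgentEvent]:
    # Stand-in for any backend: each one would translate its native
    # messages into this same sequence of events.
    yield AgentEvent("text_delta", {"text": "Looking up "})
    yield AgentEvent("tool_call_started", {"tool": "get"})
    yield AgentEvent("tool_result", {"tool": "get", "ok": True})
    yield AgentEvent("text_delta", {"text": "done."})
    yield AgentEvent("turn_complete")


async def render_turn(events: AsyncIterator[AgentEvent]) -> Tuple[str, List[str]]:
    """Consume one turn; return the streamed text and the non-text event kinds."""
    chunks: List[str] = []
    kinds: List[str] = []
    async for event in events:
        if event.kind == "text_delta":
            chunks.append(event.payload["text"])
        else:
            kinds.append(event.kind)
    return "".join(chunks), kinds


if __name__ == "__main__":
    print(asyncio.run(render_turn(fake_backend_turn("show customer 10000"))))
```

Because the renderer only sees event kinds, swapping the backend changes nothing on the consuming side, which is the property the canned-stream Textual tests rely on.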