412 changes: 169 additions & 243 deletions IMPLEMENTATION-SUMMARY.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions README.md
@@ -169,6 +169,7 @@ pip install -e ".[dev,etl]"
| [Batch Operations](docs/batch-operations.md) | YAML batch files |
| [SDK Usage](docs/sdk-usage.md) | Python SDK for developers and MCP servers |
| [MCP Server](docs/mcp-server.md) | Drive bcli from Claude Desktop via the `bcli-mcp` server (preview) |
| [Agent Mode](docs/agent.md) | Bare `bcli` chat REPL — ask BC questions in plain language (BYOK / Claude Code / Codex) |
| [Command Reference](docs/command-reference.md) | Complete CLI command reference |
| [For AI Agents](AGENTS.md) | Quick discovery recipes for Claude Code, Cursor, etc. driving bcli on a user's behalf |
| [Contributing](docs/contributing.md) | Development setup, architecture, testing |
164 changes: 164 additions & 0 deletions docs/agent.md
@@ -0,0 +1,164 @@
# Agent Mode — the `bcli` chat REPL

Agent mode turns `bcli` into an interactive assistant for Business Central: an
LLM drives bcli's own verbs as tools, so you ask questions in plain language and
watch it run `get`, `endpoint search`, `post`, and friends — with the same write
safety the CLI enforces.

```
$ bcli
bcli agent — model: anthropic:claude-sonnet-4-5 · profile: finance · env: sandbox

› how many vendors does LLC have?
→ bcli_get {"endpoint": "vendors", "company": "LLC", "top": 1, "count": true}
✓ bcli_get
LLC has 312 vendors.
```

Bare `bcli` on an interactive terminal launches the chat. When piped or scripted
(`echo … | bcli`, `bcli | cat`), it still prints help, so existing automation is
unaffected.
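
The TTY rule above can be sketched roughly as follows. This is an illustration only, not bcli's actual entry point: `is_interactive` is a hypothetical name, and requiring *both* stdin and stdout to be terminals is an assumption inferred from the two piped examples.

```python
import io
import sys
from typing import TextIO


def is_interactive(stdin: TextIO, stdout: TextIO) -> bool:
    """Hypothetical check: chat only when both ends are a terminal."""
    return stdin.isatty() and stdout.isatty()


# A piped stdin (modelled here with an in-memory stream) is never a TTY,
# so this case would fall back to printing help.
piped = io.StringIO("how many vendors does LLC have?")
launch_chat = is_interactive(piped, sys.stdout)  # False
```

Checking both streams is what keeps `echo … | bcli` and `bcli | cat` on the help path: each one replaces exactly one end with a pipe.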

## Quick start

```bash
# Install the agent extra (BYOK loop + the Textual TUI):
uv tool install -e ".[agent]" --force # or: pip install "bc-cli[agent]"

# First launch runs a setup wizard (also: bcli agent init):
bcli
```

The wizard asks which LLM to use, stores any API key in your OS keychain, writes
the `[agent]` config section, and drops you into chat.

## Backends (BYOK or bring your own CLI)

| `[agent] backend` | What it is | Extra |
|---|---|---|
| `pydantic-ai` | In-process loop — any Anthropic / OpenAI / local OpenAI-compatible model (Ollama, vLLM, LM Studio) via `provider:model` strings + `base_url`. The default BYOK path. | `[agent]` / `[agent-local]` |
| `claude-code` | Drives your installed Claude Code through the Claude Agent SDK; bcli's verbs become an in-process MCP server. | `[agent-claude-code]` |
| `codex` | Drives your installed Codex CLI through the `openai-codex` SDK; Codex consumes bcli's existing `bcli-mcp` server. | `[agent-codex]` |
| `null` (default) | No backend; the REPL prints a setup hint. | — |
| `my_pkg.module:MyBackend` | Any class implementing `bcli.agent.AgentSessionBackend` with a `from_config` classmethod. | your own |

All three first-party backends emit the **same** `AgentEvent` stream, so the
chat UI, the write-safety gate, and plan mode behave identically regardless of
which one you pick. Switching is a one-line config change.
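
A `module:Class` backend string like the last table row could be resolved along these lines. This is a sketch under assumptions: `load_backend_class` is a hypothetical helper, not bcli's factory, and the only contract it checks is the `from_config` classmethod the table mentions.

```python
import importlib


def load_backend_class(spec: str) -> type:
    """Resolve a 'package.module:ClassName' string to the class object."""
    module_name, sep, class_name = spec.partition(":")
    if not sep or not module_name or not class_name:
        raise ValueError(f"expected 'module:Class', got {spec!r}")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    # The docs require a from_config classmethod; reject anything without one.
    if not callable(getattr(cls, "from_config", None)):
        raise TypeError(f"{spec} does not define from_config")
    return cls
```

Failing early on a missing `from_config` gives a clearer error than discovering the problem on the first turn of a session.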

### `[agent]` config

```toml
[agent]
backend = "pydantic-ai" # null | pydantic-ai | claude-code | codex | module:Class
model = "anthropic:claude-sonnet-4-5" # provider:model (pydantic-ai); bare name → OpenAI-compatible
api_key_env = "ANTHROPIC_API_KEY" # optional override of the key env var
base_url = "" # Ollama / OpenAI-compatible endpoint
max_steps = 20 # tool-call budget per turn
memory = true # load per-profile BC.md into the prompt
plan_mode_default = "auto" # auto (on for production) | on | off
```
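
Two of these settings carry small rules of their own: how a `model` string splits, and what `plan_mode_default = "auto"` resolves to. The sketch below shows one plausible reading. The function names and the `"openai-compatible"` label are hypothetical, and matching the environment against the literal string `"production"` is an assumption.

```python
def split_model(model: str) -> tuple[str, str]:
    """Split 'provider:model'; a bare name is treated as OpenAI-compatible."""
    provider, sep, name = model.partition(":")
    if not sep:
        return ("openai-compatible", model)
    return (provider, name)


def plan_mode_enabled(setting: str, environment: str) -> bool:
    """Resolve plan_mode_default for a given target environment."""
    if setting == "on":
        return True
    if setting == "off":
        return False
    if setting == "auto":
        # "auto" means: on for production, off elsewhere.
        return environment == "production"
    raise ValueError(f"unknown plan_mode_default: {setting!r}")
```

Because the split happens at the first colon only, a tag-style suffix such as the one in `ollama:llama3.1:8b` stays part of the model name.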

### Local models (no API key)

```toml
[agent]
backend = "pydantic-ai"
model = "ollama:llama3.1"
base_url = "http://localhost:11434/v1"
```

## Credentials

Resolution order matches the rest of bcli (`bcli.auth._credentials`): explicit
`api_key_env` → OS keychain (service `bcli`, key `llm:<provider>`) → the
provider's default env var (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`). The wizard
writes keys to the keychain; nothing sensitive lands in the config file.
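
The resolution order can be illustrated with a short sketch. It is not the code in `bcli.auth._credentials`: `resolve_llm_key` is a hypothetical name, the keychain is injected as a plain callable shaped like `keyring.get_password(service, username)` (whether bcli uses the `keyring` package is not stated here), and falling through when the explicit variable is unset is an assumption.

```python
from __future__ import annotations

import os
from collections.abc import Callable, Mapping

# Default env vars named in the docs above.
DEFAULT_ENV_VARS = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}


def resolve_llm_key(
    provider: str,
    api_key_env: str | None,
    keychain_get: Callable[[str, str], str | None],
    environ: Mapping[str, str] | None = None,
) -> str | None:
    """Return the first key found: explicit env var, keychain, default env var."""
    env = os.environ if environ is None else environ
    # 1. Explicit api_key_env override from the [agent] section.
    if api_key_env and env.get(api_key_env):
        return env[api_key_env]
    # 2. OS keychain entry: service "bcli", key "llm:<provider>".
    stored = keychain_get("bcli", f"llm:{provider}")
    if stored:
        return stored
    # 3. The provider's default environment variable.
    default_var = DEFAULT_ENV_VARS.get(provider)
    return (env.get(default_var) or None) if default_var else None
```

Injecting the keychain lookup keeps the ordering testable without touching a real OS keychain.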

### Subscription auth + the consent gate

`claude-code` and `codex` can ride your personal Claude / ChatGPT subscription
instead of an API key. That's individual-use territory at both vendors
(Anthropic's per-plan Agent SDK credit is sized for one person; Codex
subscription access uses an undocumented endpoint with rate windows). So bcli
**never defaults to it**: when a subscription login is the only credential, the
first run shows an explicit notice and requires you to type literal `yes`.
Consent is persisted as `subscription_authorized = true` + a timestamp under
`[agent]` — visible in plain text, revocable by deleting the line. **Teams
should use API keys**, which never prompt.
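
As a rough sketch of the consent gate described above: the auth kind (`"api_key"`, `"subscription"`, `"none"`) decides whether a prompt is needed, and only a literal `yes` is accepted. The function names and the `subscription_authorized_at` key are hypothetical; the docs only say a timestamp is stored next to `subscription_authorized`. Trimming surrounding whitespace before comparing is also an assumption.

```python
from __future__ import annotations

from datetime import datetime, timezone


def needs_consent(auth_kind: str, agent_config: dict) -> bool:
    """Consent applies only to subscription auth that is not yet authorized."""
    return auth_kind == "subscription" and not agent_config.get(
        "subscription_authorized", False
    )


def record_consent(
    answer: str, agent_config: dict, now: datetime | None = None
) -> bool:
    """Persist consent only if the user typed the literal word 'yes'."""
    if answer.strip() != "yes":  # "y", "Yes", "YES" are all rejected
        return False
    stamp = now or datetime.now(timezone.utc)
    agent_config["subscription_authorized"] = True
    agent_config["subscription_authorized_at"] = stamp.isoformat()  # hypothetical key
    return True
```

Storing the flag as a plain boolean is what makes revocation a matter of deleting one line from the config.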

## Write safety

Writes are gated **inside the tool implementations**, never just in the prompt:

- `disable_writes = true` profiles, `caution: high` endpoints, and **production**
targets pause the write and raise an approval dialog (in headless mode, `--yes`
approves it up front). If you decline, the model gets a typed refusal and is
told not to retry.
- Every approved write goes through `SafeContext` with an explicit
environment + company.
- **Plan mode** (default ON for production): the write tier is replaced by a
single `draft_batch` tool. The agent proposes a `bcli batch` YAML; you review
it, then it's promoted through the normal gated `bcli batch run` path
(dry-run first) — exactly like `bcli extract`. Toggle with `/plan`.
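
The gating conditions in the list above can be read as a single predicate evaluated inside the tool handler. The sketch below is illustrative only: `WriteRequest`, `requires_approval`, and `gate_write` are hypothetical names, not the types in `bcli.agent`, and the approval callback stands in for the dialog or `--yes`.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRequest:
    environment: str        # e.g. "sandbox" or "production"
    disable_writes: bool    # profile-level flag
    endpoint_caution: str   # e.g. "low" or "high"


def requires_approval(req: WriteRequest) -> bool:
    """Any one of the three conditions pauses the write."""
    return (
        req.disable_writes
        or req.endpoint_caution == "high"
        or req.environment == "production"
    )


def gate_write(req: WriteRequest, approve: Callable[[WriteRequest], bool]) -> str:
    """Run inside the tool handler, so the prompt alone cannot bypass it."""
    if requires_approval(req) and not approve(req):
        return "refused"  # the model is told not to retry
    return "allowed"
```

In this shape, headless `--yes` would amount to passing an `approve` callback that always returns `True`.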

## Chat commands

| Command | Effect |
|---|---|
| `/model [name]` | Show the current model, or note a switch to `name` (persist it via config) |
| `/profile [name]` | Switch the bcli profile (re-resolves env, company, registry) |
| `/company <alias>` | Set the default company for tool calls |
| `/plan` | Toggle plan mode |
| `/yes` | Approve a pending write |
| `/context` | Show the resolved profile / env / plan-mode context |
| `/clear` | Clear the transcript |
| `/help` | List commands |
| `/exit` (`/quit`, Ctrl+C) | Leave the chat |

## Memory (BC.md)

When `memory = true`, agent mode loads a `BC.md` file into the system prompt
after the base instructions: a project-local `./BC.md` (discovered by walking up
from the working directory) wins over the per-profile
`~/.config/bcli/profiles/<profile>/BC.md`. Use it to pin durable context — "our
vendors are keyed by `displayName`, never by number". Read-only in v1.
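
The lookup order described above (project-local file found by walking up, then the per-profile file) might look like the following sketch. `find_bc_md` is a hypothetical name and not the `load_bc_md` function the package exports; the profile directory is passed in rather than derived from `~/.config/bcli`.

```python
from __future__ import annotations

from pathlib import Path


def find_bc_md(start: Path, profile_dir: Path) -> Path | None:
    """Return the BC.md to load: nearest project-local file, else the profile's."""
    start = start.resolve()
    # Walk from the working directory up to the filesystem root.
    for directory in (start, *start.parents):
        candidate = directory / "BC.md"
        if candidate.is_file():
            return candidate
    # Nothing project-local: fall back to the per-profile file, if present.
    fallback = profile_dir / "BC.md"
    return fallback if fallback.is_file() else None
```

Returning the first match on the way up means the `BC.md` closest to the working directory wins when several ancestors contain one.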

## Headless one-shot

```bash
bcli agent run "how many open sales orders are there?"
bcli agent run "draft a vendor for Acme" --plan # force plan mode
bcli agent run "…" --yes # auto-approve writes (scripted; careful)
```

`bcli agent run` streams the answer to stdout and tool activity to stderr, so it
is testable without a TTY. It runs on the same engine as the chat REPL.

## Architecture (engine / renderer split)

The seam between bcli and the LLM is the **session**, not the model call.
`src/bcli/agent/` is the SDK engine: a backend implements
`AgentSessionBackend` and streams uniform `AgentEvent`s (`text_delta`,
`tool_call_started`, `tool_result`, `awaiting_approval`, `turn_complete`,
`error`). `src/bcli_cli/repl/` is one renderer (the Textual app); the headless
`bcli agent run` printer is another. The engine never imports `bcli_cli`.
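
To make the engine / renderer split concrete, here is a minimal sketch of a headless renderer consuming such a stream. The `Event` dataclass and its payload keys (`text`, `name`, `message`) are hypothetical stand-ins for the real `AgentEvent`; only the event kind names come from the list above, and `awaiting_approval` is deliberately left unhandled.

```python
from __future__ import annotations

import sys
from collections.abc import Iterable
from dataclasses import dataclass, field
from typing import Any, TextIO


@dataclass(frozen=True)
class Event:
    kind: str
    payload: dict[str, Any] = field(default_factory=dict)


def render_headless(
    events: Iterable[Event], out: TextIO | None = None, err: TextIO | None = None
) -> int:
    """Write answer text to `out` and tool activity to `err`; return an exit code."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    for event in events:
        if event.kind == "text_delta":
            out.write(event.payload["text"])
        elif event.kind == "tool_call_started":
            err.write(f"→ {event.payload['name']}\n")
        elif event.kind == "tool_result":
            err.write(f"✓ {event.payload['name']}\n")
        elif event.kind == "error":
            err.write(f"error: {event.payload['message']}\n")
            return 1
        elif event.kind == "turn_complete":
            out.write("\n")
    return 0
```

Because the renderer only sees events, the same function works no matter which backend produced them, which is the point of putting the seam at the session.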

Tools come from a single source — the same `bcli describe --format json` payload
`bcli_mcp` consumes — projected three ways: in-process pydantic-ai tools, an
in-process Claude SDK MCP server, and (for codex) the existing `bcli-mcp`
subprocess. All paths share the handlers in `src/bcli/agent/tools/_impl.py`, so
write safety lives in exactly one place.

## Verification smoke tests (need real keys / binaries)

These aren't in the automated suite (no network / no installed CLIs in CI):

1. `bcli` on a TTY with no `[agent]` → wizard; configure Ollama → chat opens.
2. BYOK: `[agent] backend=pydantic-ai model=anthropic:claude-sonnet-4-5` → ask
"how many vendors does LLC have?" → watch the tool panel run `get vendors`.
3. Write safety: a `disable_writes=true` sandbox profile → ask the agent to
create a vendor → approval dialog (or plan-mode draft); decline → refusal.
4. Claude Code: a machine with `claude` installed and no `ANTHROPIC_API_KEY` →
wizard offers it, consent text shown, literal `yes` required.
5. Codex: `codex` installed → backend registers `bcli-mcp`, tool calls
round-trip, approval policy surfaces writes.
39 changes: 39 additions & 0 deletions docs/command-reference.md
@@ -406,3 +406,42 @@ bcli etl sync --destination filesystem
bcli etl sync --entities customers,vendors --destination duckdb
bcli etl sync --full-refresh --destination iceberg
```

---

## agent (optional — requires `bc-cli[agent]`)

Interactive chat REPL where an LLM drives bcli's verbs as tools. Bare `bcli` on
a TTY launches the chat; non-TTY prints help. See [Agent Mode](agent.md).

### bcli (bare, on a TTY)

```bash
bcli # launch the chat REPL (or the first-run setup wizard)
bcli --profile finance
```

### agent run

Headless one-shot turn — stream the answer to stdout, tool activity to stderr.

```bash
bcli agent run "<prompt>" [options]
```

| Option | Short | Description |
|--------|-------|-------------|
| `--backend <name>` | | One-shot backend override (`pydantic-ai` / `claude-code` / `codex` / `module:Class`) |
| `--model <name>` | | One-shot model override (e.g. `anthropic:claude-sonnet-4-5`) |
| `--plan` | | Force plan mode on (writes become `draft_batch`) |
| `--no-plan` | | Force plan mode off |
| `--yes` | `-y` | Auto-approve gated writes (scripted use; be careful) |

### agent init

Re-run the setup wizard — pick a backend, store the API key in the OS keychain,
write the `[agent]` config section.

```bash
bcli agent init
```
29 changes: 29 additions & 0 deletions pyproject.toml
@@ -84,6 +84,25 @@ ask-openai = [
mcp = [
"mcp>=1.0",
]
# Agent mode (the interactive chat REPL + headless `bcli agent run`).
# Three backends, each behind its own extra; `agent` is the meta-extra
# that installs the BYOK loop + the Textual TUI. The harness-owned
# backends (claude-code / codex) are additive opt-ins.
agent-local = [
"pydantic-ai-slim[anthropic,openai]>=1.107,<2",
]
agent-claude-code = [
"claude-agent-sdk>=0.2",
]
agent-codex = [
# openai-codex is in beta (0.1.x); allow prereleases so the pin
# resolves. It pulls in its own openai-codex-cli-bin runtime.
"openai-codex>=0.1.0b3",
]
agent = [
"bc-cli[agent-local]",
"textual>=8.2",
]
polaris = [
"bc-cli[etl]",
"pyarrow>=16.0",
@@ -94,6 +113,7 @@ dev = [
"bc-cli[mcp]",
"bc-cli[extract]",
"bc-cli[ask]",
"bc-cli[agent]",
"pytest>=8.0",
"pytest-asyncio>=0.23",
"pytest-httpx>=0.30",
@@ -120,6 +140,15 @@ include = [
"pyproject.toml",
]

[tool.uv]
# The optional [agent-codex] extra depends on openai-codex, which is
# beta-only and pulls a pinned prerelease runtime
# (openai-codex-cli-bin==0.137.0a4). Enabling prereleases lets the
# universal lock resolve that extra. Every other dependency has a stable
# release satisfying its constraint, so uv still pins stable versions for
# them — the lockfile records exact versions regardless.
prerelease = "allow"

[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"
45 changes: 45 additions & 0 deletions src/bcli/agent/__init__.py
@@ -0,0 +1,45 @@
"""``bcli.agent`` — the agent-mode engine (Part 4 of the roadmap).

Engine emits events, renderer consumes them: every backend (pydantic-ai
BYOK, claude-agent-sdk, codex) streams uniform :class:`AgentEvent`
records through the :class:`AgentSessionBackend` protocol, and one
renderer — the Textual REPL in ``bcli_cli.repl`` or the headless
``bcli agent run`` printer — consumes them. Write safety lives inside
the tool implementations (:mod:`bcli.agent.tools._impl`), gated by
:class:`AgentRuntime` and resolved through ``awaiting_approval`` events.

Design rules enforced by the package boundary:

* Nothing in here imports from ``bcli_cli`` or ``bcli_mcp``
(CLI → SDK only; the MCP server stays a subprocess concern).
* Optional LLM SDKs are imported lazily inside backends; the factory
falls back to :class:`NullAgentBackend` with a one-shot warning.
"""

from __future__ import annotations

from bcli.agent._factory import get_agent_backend
from bcli.agent._prompt import build_system_prompt
from bcli.agent._protocol import (
AgentEvent,
AgentSessionBackend,
EventKind,
NullAgentBackend,
)
from bcli.agent._runtime import AgentRuntime, WriteGateDecision
from bcli.agent.memory import load_bc_md
from bcli.agent.tools import ToolRegistry, ToolSpec

__all__ = [
"AgentEvent",
"AgentRuntime",
"AgentSessionBackend",
"EventKind",
"NullAgentBackend",
"ToolRegistry",
"ToolSpec",
"WriteGateDecision",
"build_system_prompt",
"get_agent_backend",
"load_bc_md",
]
66 changes: 66 additions & 0 deletions src/bcli/agent/_auth_detect.py
@@ -0,0 +1,66 @@
"""Credential detection for the harness-owned backends.

Pure environment / filesystem checks — no optional SDK imports — so the
setup wizard and the consent gate can classify the auth path before
anything heavy loads.

Classification:

* ``"api_key"`` — a sanctioned programmatic key is present; no
consent needed.
* ``"subscription"`` — only subscription credentials are detectable
(Claude Code login / ``~/.codex/auth.json``);
the explicit consent gate applies.
* ``"none"`` — nothing usable found.
"""

from __future__ import annotations

import os
import shutil
from pathlib import Path

AuthKind = str # "api_key" | "subscription" | "none"


def detect_claude_auth(*, home: Path | None = None) -> AuthKind:
    """Classify how a claude-code backend session would authenticate."""
    if os.environ.get("ANTHROPIC_API_KEY"):
        return "api_key"
    if os.environ.get("CLAUDE_CODE_OAUTH_TOKEN"):
        return "subscription"
    base = home or Path.home()
    # Claude Code stores login credentials under ~/.claude (keychain on
    # macOS, credentials file elsewhere); the binary on PATH with a
    # config dir is the practical signal that a subscription login exists.
    if (base / ".claude" / ".credentials.json").is_file():
        return "subscription"
    if shutil.which("claude") and (base / ".claude").is_dir():
        return "subscription"
    return "none"


def detect_codex_auth(*, home: Path | None = None) -> AuthKind:
    """Classify how a codex backend session would authenticate."""
    if os.environ.get("CODEX_API_KEY") or os.environ.get("OPENAI_API_KEY"):
        return "api_key"
    base = home or Path.home()
    if (base / ".codex" / "auth.json").is_file():
        return "subscription"
    return "none"


def claude_code_available() -> bool:
    return shutil.which("claude") is not None


def codex_available() -> bool:
    return shutil.which("codex") is not None


__all__ = [
"claude_code_available",
"codex_available",
"detect_claude_auth",
"detect_codex_auth",
]