feat: phase 3 — workspace capability + pluggable backends (sub-project 4)#170

Open
blove wants to merge 18 commits into main from claude/phase3-workspace

@blove blove commented May 21, 2026

Summary

Sub-project 4 of the Dawn opinionated agent harness. The workspace tools (readFile/writeFile/listDir/runBash) move from hand-rolled per-route files into a built-in capability auto-wired by the workspace/ directory convention (cwd-relative, matching the existing agents-md capability). Filesystem and exec implementations become pluggable via a new @dawn-ai/workspace package; defaults preserve existing behavior so apps that don't touch dawn.config.ts keep working unchanged.

dawn.config.ts loader switches from a hand-rolled string-only parser to a tsx-evaluated import so callable backend values can be expressed.
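
To illustrate why a string-only parser is not enough, here is a minimal sketch of a config whose `backends.filesystem` value is a composed, callable object. Everything in it is an assumption made for illustration: the `FilesystemBackend` shape, the `compose` and `withLogging` stand-ins, the in-memory backend, and the `"src/app"` value are hypothetical and are not the actual @dawn-ai/workspace API.

```typescript
// Hypothetical shape: the real FilesystemBackend in @dawn-ai/workspace may differ.
interface FilesystemBackend {
  readFile(absPath: string): Promise<string>;
  writeFile(absPath: string, content: string): Promise<void>;
}

type FilesystemMiddleware = (next: FilesystemBackend) => FilesystemBackend;

// Stand-in for compose(): the first middleware listed ends up outermost.
function compose(
  base: FilesystemBackend,
  ...middlewares: FilesystemMiddleware[]
): FilesystemBackend {
  return middlewares.reduceRight((inner, mw) => mw(inner), base);
}

// Stand-in for a logging middleware: records each call, then delegates.
function withLogging(log: (line: string) => void): FilesystemMiddleware {
  return (next) => ({
    async readFile(absPath) {
      log(`readFile ${absPath}`);
      return next.readFile(absPath);
    },
    async writeFile(absPath, content) {
      log(`writeFile ${absPath} (${content.length} chars)`);
      await next.writeFile(absPath, content);
    },
  });
}

// In-memory backend so the sketch runs without touching disk.
function memoryFilesystem(): FilesystemBackend {
  const files = new Map<string, string>();
  return {
    async readFile(absPath) {
      const content = files.get(absPath);
      if (content === undefined) throw new Error(`not found: ${absPath}`);
      return content;
    },
    async writeFile(absPath, content) {
      files.set(absPath, content);
    },
  };
}

const logLines: string[] = [];

// `filesystem` holds functions, which a string-literal parser cannot represent.
const config = {
  appDir: "src/app", // placeholder value
  backends: {
    filesystem: compose(
      memoryFilesystem(),
      withLogging((line) => logLines.push(line)),
    ),
  },
};
```

Evaluating the config as a module is what lets `config.backends.filesystem.readFile` be called directly by the capability.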

Spec: docs/superpowers/specs/2026-05-20-phase3-workspace-backends-design.md
Plan: docs/superpowers/plans/2026-05-20-phase3-workspace-backends.md

Changes

  • New @dawn-ai/workspace package: FilesystemBackend / ExecBackend type interfaces, localFilesystem() and localExec() defaults, compose() middleware composition helper, and withFilesystemLogging / withExecLogging demonstration middlewares.
  • New createWorkspaceMarker() capability in @dawn-ai/core. It detects <process.cwd()>/workspace/, matching the existing agents-md capability's resolution (the original <routeDir>/workspace/ design broke the chat example's shared-workspace pattern and was changed during implementation). It contributes the four workspace tools, routes them through configurable backends, and enforces the path-jail before calling the backend.
  • DawnConfig and CapabilityMarkerContext gain an optional backends: { filesystem?, exec? } field. When omitted, the capability falls back to localFilesystem() + localExec().
  • The tool-name uniqueness check now supports overridable capability tools: a user-authored tools/readFile.ts (and likewise for the other three workspace tools) replaces the workspace capability's contribution, while non-overridable tools (writeTodos, readSkill, task) retain the collision error.
  • dawn.config.ts loader switches from the hand-rolled parser to a tsx import. Existing configs (just { appDir }) continue to work; richer configs (callable backends, imports) are now possible.
  • Typegen surfaces the four workspace tools on routes when <process.cwd()>/workspace/ exists.
  • Chat example's hand-rolled tool files deleted from /chat (4 files) and from /coordinator/subagents/research (2 files), plus their workspace-path helpers.
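
The overridable-tool rule in the list above can be sketched as follows. This is only an illustration of the described behavior, not the code in checkToolNameUniqueness: the `CapabilityTool` shape, the `resolveCapabilityTools` name, and the `"planning"` capability label are hypothetical.

```typescript
// Hypothetical shape for a tool contributed by a capability.
interface CapabilityTool {
  name: string;
  capability: string;
  overridable?: boolean;
}

// Returns the capability tools that remain once user-authored tools are considered.
// Throws when a user tool collides with a non-overridable capability tool.
function resolveCapabilityTools(
  userToolNames: string[],
  capabilityTools: CapabilityTool[],
): CapabilityTool[] {
  const userNames = new Set(userToolNames);
  const kept: CapabilityTool[] = [];
  for (const tool of capabilityTools) {
    if (!userNames.has(tool.name)) {
      kept.push(tool); // no user tool with this name: keep the capability's version
    } else if (!tool.overridable) {
      throw new Error(
        `Tool name "${tool.name}" collides with the ${tool.capability} capability`,
      );
    }
    // Overridable and shadowed: drop the capability's version so the user tool wins.
  }
  return kept;
}

const contributed: CapabilityTool[] = [
  { name: "readFile", capability: "workspace", overridable: true },
  { name: "writeTodos", capability: "planning" }, // capability label is a guess
];

// A user-authored tools/readFile.ts shadows the workspace capability's readFile.
const remaining = resolveCapabilityTools(["readFile"], contributed);
```

Passing `["writeTodos"]` as the user tool names would throw instead, which mirrors the retained collision error.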

Test plan

  • @dawn-ai/workspace unit tests: types + localFilesystem (5) + localExec (4) + compose (3) + with-logging (3) = 15 cases
  • createWorkspaceMarker unit tests: detect, load, tool wiring, path-jail, default backends, overridable flag — 9 cases
  • checkToolNameUniqueness overridable cases — 3 new cases
  • Config loader rewrite — 6 cases including syntax-error propagation
  • Typegen workspace tools — 2 cases (present + absent)
  • Full repo: 510 tests green, build + typecheck + lint clean
  • Manual Chrome MCP smoke: /chat and /coordinator both produce clean SSE streams via the workspace capability's tools; 0 paired duplicates; 0 errors; done event fires. Research subagent's listDir + readFile now wired through the capability.

Deferred / known limitations

  • HITL permission system (sub-project 4.5) — the capability hard-refuses jail escapes today. A future PR introduces an interrupt() flow so the user can grant per-path permissions, with persistence to a yet-to-be-decided location.
  • Per-route backend override — currently global only. Add via descriptor field if a real use case surfaces.
  • OS-level isolation — out of scope; documented as deployment guidance. The path-jail in the capability is a correctness boundary, not a security boundary against hostile agents.
  • runBash return shape changed: it was a formatted string ("<stdout>...[exit N]") and is now an object ({stdout, stderr, exitCode}). LangChain JSON-stringifies tool results for the model anyway, so the output should remain readable, but the behavior delta is worth flagging.
  • listDir no longer sorts or marks directories with /: the capability returns raw readdir output. This is a cosmetic difference; the agent can sort if needed.
  • The config-loader try/catch swallows non-ENOENT errors: a dawn.config.ts with a syntax error is silently ignored and the workspace capability falls back to defaults. Acknowledged as a follow-up: narrow the catch to ENOENT only (or warn loudly on other errors).
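
The follow-up for the config-loader catch could take roughly this form. It is a sketch under assumptions: `load` stands in for whatever actually imports dawn.config.ts, and both helper names are hypothetical.

```typescript
// True only for errors that carry the ENOENT code (file does not exist).
function isMissingFile(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { code?: unknown }).code === "ENOENT"
  );
}

// Falls back to defaults only when the config file is absent.
async function loadConfigOrDefault<T>(
  load: () => Promise<T>,
  fallback: T,
): Promise<T> {
  try {
    return await load();
  } catch (error) {
    if (isMissingFile(error)) return fallback; // no config file: defaults are fine
    throw error; // syntax errors and everything else stay visible
  }
}
```

One thing to verify before adopting this: a dynamic import of a missing module in Node typically reports ERR_MODULE_NOT_FOUND rather than ENOENT, so the predicate may need to accept both codes, or the loader may need to check for the file's existence first.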

Architectural note — workspace root resolution

The original spec said the capability would resolve workspace as <routeDir>/workspace/. During implementation, T13's chat-example migration surfaced that the existing agents-md capability and the prior hand-rolled tools both used <process.cwd()>/workspace/. The chat demo depends on a single shared workspace across routes (/chat and the research subagent both read/write the same files; AGENTS.md lives there too). Per-route workspaces would have broken that pattern.

Aligned the workspace capability with the existing convention rather than introducing a new, inconsistent one. This is documented in the marker source.
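
For readers unfamiliar with the convention, a minimal sketch of cwd-relative root resolution combined with a path-jail check is shown here. The function names and error message are hypothetical, and the check is purely lexical (it does not resolve symlinks), which fits the description of the jail as a correctness boundary rather than a security boundary.

```typescript
import * as path from "node:path";

// Assumption: the root is always <cwd>/workspace, shared by every route.
function workspaceRoot(cwd: string): string {
  return path.resolve(cwd, "workspace");
}

// Resolves a tool-supplied path against the root and refuses anything outside it.
// Backends would then receive the returned absolute path.
function resolveInJail(root: string, requested: string): string {
  const resolved = path.resolve(root, requested);
  const relative = path.relative(root, resolved);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes workspace: ${requested}`);
  }
  return resolved;
}
```

Because the root depends only on the working directory, two routes calling `workspaceRoot(process.cwd())` land on the same directory, which is the shared-workspace behavior the chat demo relies on.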

🤖 Generated with Claude Code

blove and others added 17 commits May 20, 2026 15:03
Design for sub-project 4 of the Dawn opinionated agent harness. The
workspace tools (readFile, writeFile, listDir, runBash) become a built-
in capability auto-wired by the convention of having a workspace/
directory under a route. Filesystem and exec implementations become
pluggable via a new @dawn-ai/workspace package shipping the type
interfaces, localFilesystem/localExec defaults, a compose() helper,
and one demonstration middleware (withLogging).

dawn.config.ts switches from the existing hand-rolled string-only
parser to tsx-evaluated import so callable backend values can be
expressed naturally. Default behavior is unchanged: apps that don't
touch dawn.config.ts keep working.

Path-jail enforcement lives in the capability; backends receive
already-resolved absolute paths. Human-in-the-loop permission gating
(interrupt to ask the user about jail escapes) is deferred to a
separate sub-project (4.5) with its own brainstorm + spec + plan.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bite-sized, TDD-structured plan covering: @dawn-ai/workspace package
(types, localFilesystem, localExec, compose, withLogging), the
createWorkspaceMarker capability, dawn.config.ts loader switch from
hand-rolled parser to tsx import, tool-name uniqueness check inversion
for overridable tools, runtime wiring, typegen, chat example migration,
and the smoke + PR steps. 15 tasks; each commits independently.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the package skeleton (manifest, tsconfig, vitest config) for the
upcoming pluggable workspace backends. No exports yet — types, defaults,
and helpers land in subsequent commits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…sx import

The hand-rolled parser supported only string-literal property values and
const string bindings. The upcoming workspace capability needs to express
callable backend values in dawn.config.ts, which strings can't express.
Switch to a tsx-evaluated dynamic import (same loader Dawn already uses
for route discovery and tool execution).

Existing dawn.config.ts files (just { appDir }) remain valid TS modules
and continue to load without modification.

Side-effects of the loader swap:
- Two CLI integration tests assumed the old parser's specific error
  message or its fresh-read-from-disk behavior. The verify test's
  expected error string is updated to match the runtime ReferenceError
  that the tsx import now surfaces, and the dev test that mutated
  dawn.config.ts mid-session is rewritten to start the dev process with
  the invalid config in place (Node's ESM cache prevents a re-import of
  the same module URL within one process — mid-session config edits
  will become a per-task concern as backends land).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Type-only edge: @dawn-ai/core now imports FilesystemBackend/ExecBackend
types from @dawn-ai/workspace via 'import type'. No runtime weight yet
(workspace stays in devDependencies until the marker lands).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Auto-detects a route's workspace/ directory and contributes four tools
(readFile/writeFile/listDir/runBash) routed through configurable
backends. Defaults to localFilesystem + localExec when no backends are
configured in dawn.config.ts. Path-jail enforced in the capability;
backends receive resolved absolute paths.

Tools carry an `overridable: true` flag so a future uniqueness-check
inversion can let user-authored tools/<name>.ts files supersede them.

Promotes @dawn-ai/workspace to a runtime dependency of @dawn-ai/core,
and extends the cli typegen harness to pack @dawn-ai/workspace
alongside cli/core/langchain/langgraph/sdk so externally installed
dawn bin tests resolve the new transitive dep.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tools marked overridable on a capability contribution can be shadowed
by a user-authored tool with the same name. Used by the workspace
capability so authors can override readFile/writeFile/listDir/runBash
by dropping a file in tools/. Non-overridable capability tools
(writeTodos, readSkill, task) retain the collision error.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…config

Registers createWorkspaceMarker in the capability registry. Loads
dawn.config.ts at the start of prepareRouteExecution and threads
config.backends into the CapabilityMarkerContext so the workspace
marker uses the configured backends (defaulting to localFilesystem +
localExec when none are configured).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Delete the hand-rolled readFile/writeFile/listDir/runBash tool files
(and their workspace-path helpers) from both the /chat route and the
research subagent. The workspace capability auto-contributes these
tools when the route has a workspace/ directory, so add empty
workspace/ dirs (with .gitkeep) under both routes to opt in.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…agents-md

T13's migration of the chat example surfaced a mismatch: the workspace
capability was resolving to <routeDir>/workspace/ while the agents-md
capability (and the prior hand-rolled tools) used
<process.cwd()>/workspace/. Result: post-migration, the chat agent's
memory file and its workspace tools pointed at completely different
directories.

Align the workspace capability with the existing convention:
process.cwd()/workspace/. Same trigger as agents-md; same root as the
deleted hand-rolled tools. The chat example's pre-existing
examples/chat/server/workspace/ directory (with AGENTS.md) now serves
as the workspace for both /chat and the research subagent.

Removes the empty per-route workspace/ stubs T13 created.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@vercel

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: dawnai
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): May 21, 2026 3:56am


- system-prompt: runBash signature is { command } now (no timeoutSeconds);
  returns { stdout, stderr, exitCode } instead of a formatted string
- README: status reflects shipped subagents + workspace capabilities;
  layout shows current file structure (no tools/, no workspace-path.ts);
  deferred list updated to flag HITL permission gating (sub-project 4.5)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>