Integrate the pi.dev coding harness as a coder subagent #17

Description

@mattmezza

Summary

Add the ability to invoke a coding harness (pi.dev — an open-source, provider-agnostic terminal
coding agent) for software tasks, wired as a specialised subagent rather than a new subsystem.

Why it fits

pi exposes non-interactive modes intended for embedding: one-shot print (-p / --print) and
newline-delimited JSON events (--mode json), plus RPC/SDK. It is provider-agnostic and works with
the same providers the agent already uses. This is the same product-layer-over-harness-layer shape
already seen elsewhere in the ecosystem.
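A minimal sketch of the embedding side, assuming pi writes one JSON object per line in `--mode json`. The event fields are not specified in this issue, so the parser treats each event as an opaque dict. `pi_command` and `parse_ndjson` are hypothetical helper names, and passing the prompt as a positional argument is an assumption to verify against pi's documentation.

```python
import json
from typing import Iterable, Iterator


def pi_command(prompt: str, json_events: bool = True) -> list[str]:
    """Build an argv list for a non-interactive pi run.

    The flags (--mode json, -p) are the ones named in this issue; passing the
    prompt as a trailing positional argument is an assumption.
    """
    if json_events:
        return ["pi", "--mode", "json", prompt]
    return ["pi", "-p", prompt]


def parse_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per line of newline-delimited JSON.

    Blank lines, lines that are not valid JSON, and JSON values that are not
    objects are skipped, so stray output does not abort the stream.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict):
            yield event
```

Because `parse_ndjson` accepts any iterable of strings, it can be fed a process's stdout line by line or a saved log file when replaying a run in the viewer.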

Approach

  • Add pi to the executor whitelist and an entry in the optional-tools registry (provider-key env
    injection, mirroring the gh token pattern).
  • A coding.md skill documents invocation and JSON-event parsing.
  • Wire it as the coder subagent specialisation: spawn_subagent(persona="coder", ...) shells into
    pi (print / JSON mode) instead of running the agent's own loop.
  • Coding tasks are long-running → use the async subagent path and a per-task workspace dir under
    data/.
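The workspace and env-injection points above could look roughly like the following sketch. It assumes an asyncio-based async subagent path. The `coder` subfolder under `data/`, the helper names, and the provider-key variable name are all hypothetical. The `cmd` passed to `run_in_workspace` is expected to be the sandbox-wrapped command, not a bare `pi` call in the main process.

```python
import asyncio
import os
import uuid
from pathlib import Path

DATA_ROOT = Path("data")  # data/ comes from the issue; the "coder" subfolder is hypothetical


def make_workspace(root: Path = DATA_ROOT) -> Path:
    """Create a fresh per-task directory so runs never share files."""
    ws = root / "coder" / uuid.uuid4().hex
    ws.mkdir(parents=True, exist_ok=False)
    return ws


def child_env(provider_key_name: str, provider_key: str) -> dict[str, str]:
    """Build a minimal environment: PATH plus the one injected provider key.

    The rest of the parent environment is deliberately not inherited.
    """
    return {"PATH": os.environ.get("PATH", ""), provider_key_name: provider_key}


async def run_in_workspace(cmd: list[str], ws: Path, env: dict[str, str]) -> tuple[int, str]:
    """Run cmd with ws as working directory and return (exit code, combined output)."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        cwd=ws,
        env=env,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    out, _ = await proc.communicate()
    return proc.returncode, out.decode(errors="replace")
```

Building the environment from scratch, rather than copying `os.environ`, keeps unrelated secrets in the parent process out of the coding run.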

Security (hard requirement)

pi has no built-in permission system and runs with the permissions of the launching process; its own
docs recommend containerisation. Therefore pi MUST run inside the sandbox sidecar, never in the main
process. The agent's permission engine gates invoking pi (ASK before a run); the container gates
what pi can touch.
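The two gates can be sketched as separate steps: the permission decision comes first, then the command is rewritten to execute inside the sidecar. This is an illustration under assumptions, not the agent's actual permission engine. The `Decision` enum, the helper names, the container name `sandbox`, and the use of `docker exec` are hypothetical; the issue does not name a container runtime.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


def sandboxed(cmd: list[str], container: str, workdir: str) -> list[str]:
    """Wrap cmd so it executes inside an already-running container.

    Docker is assumed here purely for illustration.
    """
    return ["docker", "exec", "--workdir", workdir, container, *cmd]


def gate(decision: Decision, ask_user: Callable[[], bool]) -> bool:
    """Return True only if the run is allowed outright or the user approves it."""
    if decision is Decision.ALLOW:
        return True
    if decision is Decision.ASK:
        return ask_user()
    return False


def plan_run(
    cmd: list[str],
    decision: Decision,
    ask_user: Callable[[], bool],
    container: str = "sandbox",
    workdir: str = "/workspace",
) -> list[str]:
    """Apply the permission gate, then return the sandbox-wrapped command."""
    if not gate(decision, ask_user):
        raise PermissionError("coding run was not approved")
    return sandboxed(cmd, container, workdir)
```

Having `plan_run` return only the wrapped command means there is no code path that hands a bare `pi` argv to the main process.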

UX & product

  • Admin UI: enable/disable toggle, a provider-key field (stored in the vault), and a coding-run
    viewer showing streamed JSON events / full logs and the produced workspace/diff; responsive and
    touch-friendly at phone width, reusing consistent toggle + masked-field + log-viewer components.
  • On the go (Telegram): dispatch a coding task in chat; receive a concise summary (what
    changed, pass/fail) rather than a wall of logs; an inline approval gates the run. Progress and
    approval messages follow one consistent convention.
  • Mobile-first: summaries-not-logs keeps coding usable on a phone; full output stays in the web
    UI.
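One way the summaries-not-logs idea could be reduced to a single chat message is sketched below. `RunResult` and its fields are hypothetical; how changed files and test status would be derived from pi's output is not specified in this issue.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunResult:
    """Hypothetical digest of a finished coding run."""
    files_changed: list[str] = field(default_factory=list)
    tests_passed: Optional[bool] = None  # None means no tests were run


def summarize(result: RunResult, max_files: int = 5) -> str:
    """Render a one-line summary short enough to read on a phone."""
    status = {True: "tests passed", False: "tests FAILED", None: "tests not run"}[
        result.tests_passed
    ]
    shown = result.files_changed[:max_files]
    extra = len(result.files_changed) - len(shown)
    files = ", ".join(shown) if shown else "no files changed"
    if extra > 0:
        files += f" (+{extra} more)"
    return f"{status}; {files}"
```

For example, `summarize(RunResult(["a.py", "b.py", "c.py"], True), max_files=2)` returns `"tests passed; a.py, b.py (+1 more)"`, while the full event stream stays available in the web UI.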

Setup & onboarding

  • Disabled by default; an optional wizard step provisions the sandbox sidecar and the provider
    key via the vault when enabled.

Acceptance criteria

  • A coding task can be dispatched to pi non-interactively and its result returned/streamed.
  • A coding task's outcome can be understood from a Telegram summary on mobile.
  • pi executes only inside the sandboxed sidecar with a scoped workspace.
  • Invoking a coding run requires approval.

Related

  • Depends on: subagents, and the sandbox sidecar from the browser-automation issue.
  • Provider key stored via: secrets vault.

Metadata

Labels

new (New addition), todo (Planned / not yet started)
