
ideation: close the research arc into a loop — tend, atlas, modeling + agent-facing hardening #34

Open: aadarwal wants to merge 6 commits into master from ideation


@aadarwal

Owner

Acting on the "what next?" strategy discussion: the diagnosis was that anu has built the harness and written the manifesto, but the arc (find → understand → do → show) is still a line of commands a human re-invokes, and the agent-facing paths fail silently. This branch closes the arc into a loop and hardens the substrate. Built in an isolated worktree; every milestone committed + pushed; full test suite green throughout (the one links failure is pre-existing and unrelated — a stale config/claude/skills symlink source).

What's here

Hardening — fail loud, not silent (core)
New config/bash/fns/core: _anu_require / _anu_die / _anu_warn, the one place anu's diagnostic vocabulary lives so an unattended cxc run hits a clear "missing required tool" line instead of a symptom three calls deep. Wired into swarm() and nc/ncn; swarm send/capture/inspect now list the available agents on a bad id, and swarm send no longer claims success when pane injection fails.

tend — the continuity layer (new plugin + config/bash/fns/tend + bin/tend)
Register checks that should keep passing, run them on a cadence, record the drift, and (armed) spawn a contained agent to self-heal. Turns the arc from a line into a loop. tend add/run/watch/heal/log/rm/dash/cron, a fixed B&W health dashboard (autonomy you can see), a cron heartbeat, and a skill that teaches heal-vs-surface judgment.

atlas — the agent arXiv (new plugin + config/bash/fns/atlas)
Indexes every /study, /investigate, /map and trail into one B&W page and collects their gaps + frontiers into a single queue the next run pulls from — so knowledge compounds. Pure render (no LLM). Records only real edges (study→present, investigate→trail); refuses to fabricate a gap→hypothesis link. Validated on the real corpus (the chemistry /study, the homi photonics /investigate, 23 frontier items) — which also surfaces the real worked example honestly, without touching the contended website redesign.

modeling — do the quantitative science (new plugin)
The deferred /modeling skill: turn data (a file, a simulation, or a digitized figure) into a fitted, model-selected, uncertainty-quantified law — composing the Axiomatic MCP tools, ncn, and trail/present. The quantitative twin of /investigate.

Quality

  • New tests: core_test.sh, atlas_test.sh, tend_test.sh (+ smoke/contract wiring). 15 bash suites + python + contracts.
  • An adversarial review (5 dimensions, every finding independently verified) found 25 issues; 14 were confirmed and fixed (incl. a HIGH: bin/swarm broke on the new guard via the agent path), 11 false alarms correctly dismissed. See the final commit.
  • All generated prose follows the house style (B&W minimal; em-dashes and "not X but Y" phrasing removed to match the deslop work).

Not done (deliberately)

  • The michelangelo/website flagship is still a mock-up; de-faking it is left to the active website redesign on another branch. The real run is now first-class in the atlas, ready to embed.

🤖 Generated with Claude Code

aadarwal and others added 6 commits June 15, 2026 00:23
…ed agents

Add config/bash/fns/core — _anu_require / _anu_die / _anu_warn / _anu_note —
the one place anu's diagnostic vocabulary lives, so a full-auto cxc run hits a
clear "missing required tool: X" line instead of a symptom three calls deep.
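
A minimal sketch of what such a diagnostic vocabulary might look like. The four function names come from this commit; the bodies, the `anu:` message prefix, the return codes, and the `demo_needs_tools` caller are assumptions for illustration, not the contents of config/bash/fns/core.

```bash
#!/usr/bin/env bash
# Sketch only. Names are from the commit message; bodies, the "anu:" prefix
# and the return codes are assumptions.

_anu_note() { printf 'anu: %s\n' "$*" >&2; }
_anu_warn() { printf 'anu: warning: %s\n' "$*" >&2; }

# Print an error and return non-zero. Returning (rather than exiting) keeps a
# sourced function from killing the caller's interactive shell.
_anu_die() {
  printf 'anu: error: %s\n' "$*" >&2
  return 1
}

# Check that every named tool is on PATH; report the first one that is missing.
_anu_require() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      _anu_die "missing required tool: $tool"
      return 1
    fi
  done
}

# Hypothetical caller: guard once at the top, then do the real work.
demo_needs_tools() {
  _anu_require "$@" || return 1
  echo "all tools present"
}
```

With a guard like this at the top of a function, a missing dependency is reported on the first line of output rather than as a confusing failure further down the call chain.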

Wire it into the genuinely unguarded paths:
- swarm() guards jq+tmux for every real subcommand (help still works bare)
- nc/ncn require ssh before opening an ssh/cluster/apple channel
- swarm send/capture/inspect list the available agents on a bad id, so a
  mistyped agent self-corrects instead of dead-ending
- swarm send no longer claims "sent" when live pane injection fails — it says
  the message is queued to the mailbox and the pane may be gone

Tests: new tests/bash/core_test.sh (18 assertions); helpers.bash loads core so
functions-under-test resolve their primitives. smoke/swarm/ncn stay green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ibes it

anu could write and animate research (science-writing, tikz, manim, marimo) but
nothing actually did it. modeling is the quantitative do-stage twin of
/investigate: take data (a file, a simulation, or a digitized figure), frame
candidate models, fit with parameter covariance, select on evidence (AIC/BIC +
cross-validation, not in-sample fit), and discover the functional form when it's
unknown (symbolic regression) — composing the Axiomatic MCP tools
(AxModelFitter / AxEquationExplorer / AxPlotToData / AxArgmin), ncn for heavy
compute, and trail/present to record and show.
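
For reference, the standard definitions of the two information criteria named above, with $k$ the number of fitted parameters, $n$ the number of data points, and $\hat{L}$ the maximized likelihood. How the skill weighs them against cross-validation is not specified in this commit.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

Lower is better for both. Each adds a complexity penalty to the fit term, which is why they are preferred over raw in-sample fit; BIC's penalty exceeds AIC's once $\ln n > 2$, that is, for $n \ge 8$.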

This is the /modeling plugin the build notes had deferred. Wired into the
marketplace; passes the plugin contract tests. Command /modeling, skill modeling.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… arXiv)

Every /study, /investigate, /map and trail render lands alone under
~/.anu/atlas. atlas indexes them into one fixed B&W page and collects their
gaps, open questions and frontiers into a single queue — so finished work
exposes what it left open and the next run pulls its question from there. It
turns the find->understand->do->show arc from a line into a loop.

- build.py scans ~/.anu/atlas + ~/.anu/trail into atlas.json (no LLM; the corpus
  IS the on-disk JSON). Only real edges are recorded (study->present,
  investigate->trail); the cross-arc gap->hypothesis link is left to the shared
  frontier rather than fabricated.
- render.py injects into the fixed template (same </ escaping contract as
  map/study/trail). B&W, structure over prose.
- `atlas` / `atlas open` / `atlas ls` shell command, guarded through core.
- skill + command teach the discipline: consult the frontier before starting,
  deposit your own gaps when you finish.

Validated on the real corpus: surfaces the chemistry /study, the homi photonics
/investigate (verdict + 12->4 funnel), the anu map and trail, with 23 frontier
items aggregated. tests/bash/atlas_test.sh (24 assertions); smoke + contracts
stay green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…loop

The research arc was a line of commands a human re-invokes. tend makes it a
loop: register checks that should keep passing, run them on a cadence, record
the drift, and — when armed — spawn a contained agent (cxc) to self-heal the
moment one breaks. This is the most-named, least-built gap from the build notes
("tend — zero infrastructure exists").

- config/bash/fns/tend: add / run / watch (live loop) / heal / log / rm / dash /
  cron, over one JSON per watch under ~/.local/share/anu/tend. Guarded through
  core; healers run contained (cxc), never with host creds.
- config/bash/bin/tend: standalone wrapper so the cron heartbeat works without an
  interactive shell (functions aren't on PATH for cron).
- plugins/tend: build.py + render.py + a fixed B&W health dashboard (the same
  template grammar as map/atlas/trail — autonomy you can see), plus a skill that
  teaches the judgment: --auto only for safe/reversible fixes, surface anything
  with judgment or irreversibility (a refuted claim is a result, not a bug). A
  watch can re-open the research frontier — tend closing the loop with atlas/trail.
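
One cadence tick for a single watch might look something like the sketch below. The function name and the status vocabulary are hypothetical, and the one-JSON-per-watch store is replaced by a plain argument so the example has no dependencies.

```bash
#!/usr/bin/env bash
# Sketch of one tick: run the check, compare with the previous status, and
# classify the transition. Hypothetical names; the real tend persists state
# as JSON, which this sketch deliberately leaves out.
#
# usage: _tend_tick_sketch <previous-status: ok|fail|new> <check command string>
# prints "<new-status> <transition>", transition is steady, broke or recovered
_tend_tick_sketch() {
  local prev=$1 check=$2 now transition=steady
  if bash -c "$check" >/dev/null 2>&1; then now=ok; else now=fail; fi
  if [[ $prev != fail && $now == fail ]]; then
    transition=broke        # drift: a check that should pass stopped passing
  elif [[ $prev == fail && $now == ok ]]; then
    transition=recovered
  fi
  printf '%s %s\n' "$now" "$transition"
}
```

Under these assumptions, only the `broke` transition would be a candidate for spawning a healer, and only when the watch is armed.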

tests/bash/tend_test.sh (25 assertions); smoke (+tend) and contracts stay green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- README research arc: add the modeling (quantitative do), atlas (compound) and
  tend (continuity) rows; show atlas+tend closing the arc into a loop; note the
  atlas index and tend state under state/data.
- Match the house style across all new files: drop em/en dashes (colon / period
  / parens per context; plain hyphens in ranges such as 2-5) and "not X, but Y"
  phrasing, as the deslop-plugins branch is doing elsewhere. Zero U+2014/U+2013 remain.

No behavior change; full suite green (the links failure is pre-existing and
unrelated: the stale config/claude/skills symlink source).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

An adversarial review (5 dimensions, every finding independently verified; 11
false alarms correctly dismissed) surfaced real defects, fixed here.

HIGH
- bin/swarm sourced fns/swarm but not fns/core, so the new _anu_require guard
  died with "_anu_require: command not found" on every non-interactive call,
  exactly the agent path the wrapper exists for. Source core first (like bin/tend).
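
The failure and the fix can be reproduced with throwaway files. The file names below mirror fns/core and fns/swarm, but their contents are stand-ins written for this sketch, not the real sources.

```bash
#!/usr/bin/env bash
# Sketch: why a standalone wrapper must source core before the functions that
# call its guards. Stand-in file contents; only the names follow the repo.
demo_dir=$(mktemp -d)

cat > "$demo_dir/core" <<'EOF'
_anu_require() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing required tool: $1" >&2; return 1; }
}
EOF

cat > "$demo_dir/swarm" <<'EOF'
swarm() { _anu_require sh || return 1; echo "swarm ok"; }
EOF

# Before the fix: a fresh non-interactive shell sources only the swarm file,
# so the guard itself is an unknown command.
wrapper_without_core() {
  bash -c 'source "$1/swarm"; swarm' _ "$demo_dir"
}

# After the fix: source core first so the guard resolves.
wrapper_with_core() {
  bash -c 'source "$1/core"; source "$1/swarm"; swarm' _ "$demo_dir"
}
```

An interactive shell hides the bug because its startup files have usually loaded core already; the wrapper runs in a fresh process and gets only what it sources itself.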

MED
- tend add looped forever when --every/--heal was the final arg (shift 2 with one
  positional never terminates); guard the value before consuming it.
- _tend_secs crashed on fractional/garbage durations (1.5h aborted with an
  arithmetic error, leaving an empty value that made jq truncate the watch file to
  0 bytes and poison the store); validate the integer stem, default to 1h.
- atlas/tend build.py crashed on any JSON parsing to a non-dict; load() now
  enforces dict-ness and tend filters non-dict history elements.
- atlas ls rendered before mkdir on first use; build.py also mkdirs its out dir.
- auto-heal had no debounce: a still-failing watch re-spawned a fresh contained
  healer every cadence tick; record healing_since and skip re-heal within a cooldown.
- watch command was space-flattened (lost quoting); store via printf %q so it
  round-trips through bash -c.
- bin/tend exports a PATH so box/jq resolve under cron (headless heal).
- tend help sed range overshot the header and dumped source; bounded to 2,23.
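
Two of the fixes above, the value-less flag guard and the duration validation, might look roughly like this. Both function names are hypothetical, and the accepted unit suffixes (s, m, h, d) are an assumption; only the 1h fallback and the `1.5h` failure case come from the commit.

```bash
#!/usr/bin/env bash
# Hypothetical parser for: tend add NAME [--every DUR] [--heal CMD]
_tend_parse_sketch() {
  local every=1h heal="" name=""
  while (( $# )); do
    case $1 in
      --every|--heal)
        # `shift 2` with one argument left fails without shifting anything,
        # so an unguarded loop would spin forever. Check for the value first.
        if (( $# < 2 )); then
          echo "tend add: $1 needs a value" >&2
          return 2
        fi
        if [[ $1 == --every ]]; then every=$2; else heal=$2; fi
        shift 2
        ;;
      *) name=$1; shift ;;
    esac
  done
  printf 'name=%s every=%s heal=%s\n' "$name" "$every" "$heal"
}

# Convert a duration to seconds. Anything whose stem is not a plain integer
# (1.5h, garbage, empty) falls back to 1h instead of reaching arithmetic.
_tend_secs_sketch() {
  local dur=$1 stem unit
  if [[ $dur =~ ^([0-9]+)([smhd]?)$ ]]; then
    stem=${BASH_REMATCH[1]}
    unit=${BASH_REMATCH[2]}
  else
    echo 3600
    return 0
  fi
  case $unit in
    m) echo $(( 10#$stem * 60 )) ;;      # 10# forces base 10 (no octal surprise on 08)
    h) echo $(( 10#$stem * 3600 )) ;;
    d) echo $(( 10#$stem * 86400 )) ;;
    *) echo $(( 10#$stem )) ;;           # bare number or "s": seconds
  esac
}
```

Validating before the arithmetic means the function always prints a number, so a later jq call never receives an empty value.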

Tests: +2 core (bin-wrapper-sources-core regression), +7 tend (malformed
durations, value-less flag, quoting round-trip, literal auto-heal + debounce
marker). Full suite green; the links failure is pre-existing and unrelated.
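
The quoting round-trip can be illustrated with bash's `printf %q`. The helper names are hypothetical; the sketch only shows the general technique, not tend's storage code.

```bash
#!/usr/bin/env bash
# Sketch: store a command's words so they survive a later `bash -c`.
# Hypothetical helpers for illustration.

# Shell-quote each word, joined by single spaces.
_tend_quote_sketch() {
  local out
  printf -v out '%q ' "$@"
  printf '%s\n' "${out% }"     # drop the one trailing separator space
}

# The old behaviour for comparison: "$*" flattens and loses word boundaries.
_tend_flatten_sketch() { printf '%s\n' "$*"; }

# Example: the second word contains a space.
stored=$(_tend_quote_sketch printf '%s,' 'a b' c)
flat=$(_tend_flatten_sketch printf '%s,' 'a b' c)
bash -c "$stored"; echo    # words preserved: 'a b' stays one argument
bash -c "$flat"; echo      # flattened: 'a b' splits into two arguments
```

The quoted form re-parses to the original three arguments, while the flattened form re-parses to four.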

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>