docs(agent-eval-adoption): dynamic driver + optimizePrompt; fix stale topology claims #6

Merged
drewstone merged 1 commit into main from feat/eval-adoption-dynamic-driver-optimizeprompt
May 31, 2026

Conversation

@tangletools
Contributor

The agent-eval-adoption skill predated agent-runtime 0.33.0 and was actively
steering adopters wrong:

  • Topologies section was stale/incorrect. It said "Refine + FanoutVote
    shipped; others deferred" and explicitly told people the Decompose
    topology was NOT shipped and to fork a custom Driver. But createDynamicDriver
    + createSandboxPlanner (agent-authored topology) ship in 0.33.0 and subsume it.
  • createFanoutVoteDriver signature was wrong ({variants, scoreFn} → actual
    {n, selector}).
  • No mention of optimizePrompt (identity-gated prompt-surface optimization).
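
To make the signature change concrete, here is a hedged sketch of the two option shapes. The types are local stand-ins written for illustration, not the agent-runtime declarations. The meaning of `n` (number of candidates) and `selector` (picks the winning candidate) is an assumption inferred from the names, and `migrateFanoutOptions` is a hypothetical helper, not part of the package.

```typescript
// Local stand-in types; NOT the real agent-runtime declarations.
// Assumption: the stale shape listed variants and scored each one.
interface StaleFanoutOptions<T> {
  variants: T[];
  scoreFn: (candidate: T) => number;
}

// Assumption: the current shape takes a candidate count and a selector
// that picks one winner from the produced candidates.
interface CurrentFanoutOptions<T> {
  n: number;
  selector: (candidates: T[]) => T;
}

// Hypothetical helper: derive new-style options from the stale ones by
// selecting the highest-scoring candidate (the first one wins ties).
function migrateFanoutOptions<T>(stale: StaleFanoutOptions<T>): CurrentFanoutOptions<T> {
  return {
    n: stale.variants.length,
    selector: (candidates: T[]): T => {
      if (candidates.length === 0) {
        throw new Error("no candidates to select from");
      }
      return candidates.reduce((best, c) =>
        stale.scoreFn(c) > stale.scoreFn(best) ? c : best,
      );
    },
  };
}
```

Call sites still passing `{variants, scoreFn}` would need a translation of roughly this kind before the options are handed to the driver factory; check the actual 0.33.0 type declarations before relying on it.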

Changes:

  • Refresh Topologies: add createDynamicDriver / createSandboxPlanner,
    move Decompose out of "deferred", correct the fanout-vote signature. Council +
    Pipeline remain the only deferred topologies.
  • New subsection "Prompt-surface optimization — optimizePrompt (identity-gated)":
    the reusable [SURFACE]-prompt optimization recipe (extract prompt → domain
    scenarios → judge dims → runWithPrompt → gepaDriver + held-out gate) and the
    footguns learned the hard way — gepaDriver is text/H2-only, scenarios must be
    domain-real, extend-don't-fork an existing runImprovementLoop, cost via ctx.cost.
  • Frontmatter + plugin.json descriptions updated for discoverability (0.33.x).

Pairs with the self-contained agent-runtime-adoption skill now shipped inside
the agent-runtime repo (tangle-network/agent-runtime#77), so external consumers
of that package need nothing from this marketplace.

…tale topology claims

The skill predated agent-runtime 0.33.0 and was steering adopters wrong:
- "Topologies — Refine + FanoutVote shipped; others deferred" told people the
  Decompose topology was NOT shipped and to fork a custom Driver. createDynamicDriver
  + createSandboxPlanner (agent-authored topology) ship in 0.33.0 and subsume it.
- createFanoutVoteDriver signature was stale ({variants, scoreFn} → actual {n, selector}).
- No mention of optimizePrompt (identity-gated prompt-surface optimization).

Changes:
- Refresh the Topologies section: add createDynamicDriver/createSandboxPlanner,
  move Decompose out of "deferred", correct the fanout-vote signature. Council +
  Pipeline remain the only deferred topologies.
- New "Prompt-surface optimization — optimizePrompt (identity-gated)" subsection:
  the reusable [SURFACE]-prompt optimization recipe + the gepaDriver footguns
  (text/H2-only, domain-real scenarios, extend-don't-fork, cost via ctx.cost).
- Update frontmatter + plugin.json descriptions for discoverability (0.33.x).
Contributor

@drewstone drewstone left a comment

✅ Auto-approved tangletools PR — 93f4cd24

This PR was opened by the trusted tangletools automation account.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: tangletools_author · 2026-05-31T01:22:32Z

@tangletools
Contributor Author

tangletools commented May 31, 2026

✅ No Blockers — 93f4cd24

Readiness 82/100 · Confidence 65/100 · 4 findings (2 medium, 2 low)

| | deepseek | kimi-code | aggregate |
|---|---|---|---|
| Readiness | 82 | 82 | 82 |
| Confidence | 65 | 65 | 65 |
| Correctness | 82 | 82 | 82 |
| Security | 82 | 82 | 82 |
| Testing | 82 | 82 | 82 |
| Architecture | 82 | 82 | 82 |

Full multi-shot audit completed 1/1 planned shots over 2 changed files. Global verifier still owns final merge decision.

🟠 MEDIUM Key docs section retains stale package version references inconsistent with updated frontmatter — plugins/agent-eval-adoption/skills/agent-eval-adoption/SKILL.md

The frontmatter description was updated to reference @tangle-network/agent-eval (0.50.x+) and @tangle-network/agent-runtime (0.33.x+), and the body documents createDynamicDriver / optimizePrompt as agent-runtime 0.33.0+ APIs. However, the Key docs section at lines 1118–1126 was left unchanged and still lists @tangle-network/agent-eval@0.36.x README and @tangle-network/agent-runtime@0.28.x/loops. An adopter following the Key docs index may look at the wrong package version documentation and fail to find the newly documented APIs. Fix: update Key docs to @tangle-network/agent-eval@0.50.x+ README and `@tangle-netw

🟠 MEDIUM Stale version references in Key docs section after description bump — plugins/agent-eval-adoption/skills/agent-eval-adoption/SKILL.md

L1118 references @tangle-network/agent-eval@0.36.x but the description frontmatter (L3) was bumped to 0.50.x+. L1122 references @tangle-network/agent-runtime@0.28.x/loops but the description now says 0.33.x+ and new features (createDynamicDriver L145-146, optimizePrompt L180) require 0.33.0+. The agent-runtime mismatch is clearly wrong and will misdirect readers to stale API docs. Fix: update L1118 to 0.50.x and L1122 to 0.33.x to match the frontmatter version target.

🟡 LOW Comment references wrong variable name in optimizePrompt example — plugins/agent-eval-adoption/skills/agent-eval-adoption/SKILL.md

L194 comment says assign result.prompt unconditionally but the destructuring at L187 binds prompt directly (not result.prompt). The comment should say prompt to match the destructured variable. Minor reader confusion risk.
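
The naming mismatch is easy to see in a self-contained sketch. `gateStub` and `optimizePromptStub` below are hypothetical stand-ins, not the real `optimizePrompt`. Only the four destructured field names and the rule that the returned prompt is the baseline unless `decision === 'ship'` come from this review; the `'hold'` value, the score inputs, and the `delta > 0` gate are assumptions.

```typescript
// 'hold' is an assumed name for the non-ship outcome.
type Decision = "ship" | "hold";

interface OptimizeResult {
  prompt: string;
  improved: boolean;
  decision: Decision;
  delta: number;
}

// Hypothetical gate: ship the candidate only if its held-out score beats the
// baseline; otherwise hand back the baseline prompt unchanged.
function gateStub(
  baseline: string,
  candidate: string,
  baselineScore: number,
  candidateScore: number,
): OptimizeResult {
  const delta = candidateScore - baselineScore;
  const decision: Decision = delta > 0 ? "ship" : "hold";
  return {
    prompt: decision === "ship" ? candidate : baseline,
    improved: delta > 0,
    decision,
    delta,
  };
}

// Hypothetical async wrapper, mirroring the awaited call in the example.
async function optimizePromptStub(
  baseline: string,
  candidate: string,
  baselineScore: number,
  candidateScore: number,
): Promise<OptimizeResult> {
  return gateStub(baseline, candidate, baselineScore, candidateScore);
}

async function pickPrompt(baselineScore: number, candidateScore: number): Promise<string> {
  const { prompt } = await optimizePromptStub(
    "baseline prompt",
    "candidate prompt",
    baselineScore,
    candidateScore,
  );
  // The bound name is `prompt`, not `result.prompt`. Assigning it
  // unconditionally is safe in this sketch because it already equals the
  // baseline unless the decision was 'ship'.
  return prompt;
}
```

With destructuring there is no `result` binding in scope, so comments and prose that say `result.prompt` point at a name the reader cannot use as written.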

🟡 LOW optimizePrompt example destructures prompt but prose refers to result.prompt — plugins/agent-eval-adoption/skills/agent-eval-adoption/SKILL.md

Lines 187–194 destructure the return value as const { prompt, improved, decision, delta } = await optimizePrompt(...), yet the comment on line 194 says // assign result.prompt unconditionally, and the prose on line 199 says `result.prompt` is the baseline UNLESS `decision === 'ship'`. A reader copying the d


tangletools · 2026-05-31T01:29:30Z · trace

@drewstone drewstone merged commit 1d0b198 into main May 31, 2026