Make task.md the native v0.6.2 authoring path #760

Merged: bingran-you merged 4 commits into main from bry/taskmd-v062-clean-break on Jun 14, 2026

bingran-you (Collaborator) commented Jun 14, 2026

Summary

  • bump BenchFlow to 0.6.2
  • make `task.md` the native task authoring path for `bench tasks init`
  • reject new split-layout scaffolds requested through `bench tasks init --format legacy` (the command now exits with an error)
  • keep export as an explicit compatibility path while the default docs point contributors at `task.md`
  • refresh CLI and task-authoring docs around `uv tool install benchflow`, `bench tasks check`, and `bench eval create`
  • avoid using `/` as the inferred rollout workspace, so Docker verifier snapshots do not copy the whole root filesystem for prebuilt images without a `WORKDIR`
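
The last bullet can be illustrated with a minimal sketch. It is not BenchFlow's actual implementation: the function body, the `fallback` parameter, and the `/workspace` default are assumptions made for illustration, and only the name `resolve_agent_cwd` is borrowed from the regression test listed under Validation. The idea is that when an image reports no `WORKDIR`, or one that normalizes to `/`, the rollout workspace falls back to a dedicated directory so a snapshot never walks the whole root filesystem.

```python
import posixpath
from typing import Optional

# Hypothetical fallback directory; the PR does not say which path BenchFlow uses.
FALLBACK_WORKSPACE = "/workspace"


def resolve_agent_cwd(image_workdir: Optional[str],
                      fallback: str = FALLBACK_WORKSPACE) -> str:
    """Return a rollout workspace that is never the filesystem root."""
    raw = (image_workdir or "").strip()
    if not raw:
        # Prebuilt image without WORKDIR: nothing to infer, use the fallback.
        return fallback
    normalized = posixpath.normpath(raw)
    # normpath preserves a leading "//", so compare after stripping slashes.
    if normalized.strip("/") == "":
        return fallback
    return normalized


print(resolve_agent_cwd(None))
print(resolve_agent_cwd("/"))
print(resolve_agent_cwd("/app/"))
```

Normalizing before the comparison means spellings such as `/.` or `/app/..` are also treated as the root rather than slipping past a plain string check.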

Validation

  • CI test passed
  • CI pip-audit passed
  • `uv run --extra dev ruff check .`
  • `uv run --extra dev pytest -q` -> 4074 passed, 49 skipped, 7 deselected
  • targeted regression: `tests/test_rollout_upload.py::test_resolve_agent_cwd_avoids_root_workspace`
  • verified via SkillsBench #929 VM Docker oracle canary on daytona-orchestrator

Merge order

  1. Merge this BenchFlow 0.6.2 PR.
  2. Publish benchflow==0.6.2 to PyPI.
  3. Merge the SkillsBench native task.md corpus/docs PR.

mintlify Bot commented Jun 14, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| benchflow-bff148e7 | 🟢 Ready | View Preview | Jun 14, 2026, 4:58 PM |


This PR bumps the version to 0.6.2 and changes user-facing CLI behavior
(`bench tasks init --format legacy` now exits with an error), but the
changelog did not mention it. Add a Changed entry under [Unreleased].

The 352-line demotion of docs/task-authoring.md dropped the only documentation
for multi-container tasks, a live, schema-backed feature. Relocate the
multi-container section (environment/docker-compose.yaml,
exec_in_service / exec(service=), verifier.service target-side verification,
and the main-only hardening policy) into docs/task-authoring-task-md.md,
translated to task.md frontmatter.

Also:
- Repoint docs/concepts.md and docs/llm-judge.md authoring cross-references at
  the native task.md guide (they pointed at the demoted split-layout page while
  advertising task.toml/tests/ authoring).
- Scope the cutover report's 'no longer scaffolds split layouts' claim to
  'bench tasks init' and note that 'bench tasks generate --task-format legacy'
  (trace import) still emits split packages for adoption compatibility.
- Update .claude/skills/benchflow/SKILL.md, which still advertised the removed
  'bench tasks init --format legacy' behavior.
bingran-you merged commit 8df039b into main on Jun 14, 2026. 3 checks passed.