Slicer/planner over-decompose and scope-creep small requests #4

@n1arash

Summary

The slicer and planner over-decompose and scope-creep small requests, multiplying per-issue cost and shipping unrequested behaviour.

Evidence (dogfood campaign)

  • F5 = "include created_at in the GET /tasks response" (a one-liner) was sliced into 5 issues, including:
    • ISS-002 "Update POST /tasks" — never requested.
    • Two speculative test issues (ISS-004 / ISS-005).
  • The planner independently added the same out-of-scope POST change.
  • The plain claude -p baseline did the whole thing correctly in one 53 s session.

Why it matters

Over-slicing multiplies the per-issue worker + evaluator + merge cost: the 5 issues are why F5 cost $3.23 / 31 min versus the baseline's $0.11 / 54 s, roughly 29x the cost and more than 30x the wall-clock time. It also ships behaviour the user never asked for.

Proposed fix

  • Scale decomposition to request size — add a "single trivial slice" path for one-liners.
  • Instruct the slicer/planner to stay strictly in request scope (no speculative POST changes, no speculative test-only issues).
  • Have the queue reviewer flag aggregate over-decomposition (see the queue-rubric issue — the synthetic reviewer missed it too).
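
A minimal sketch of the aggregate check the queue reviewer could run, assuming each issue records which endpoints it changes and whether it is test-only. The `Issue` fields, `review_queue`, and the `max_issues` budget are hypothetical names for illustration, not the existing reviewer API. The titles marked as placeholders are not from the campaign logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    issue_id: str
    title: str
    endpoints: frozenset[str]  # endpoints the issue changes (hypothetical field)
    test_only: bool = False    # True when the issue only adds tests


def review_queue(issues: list[Issue], requested: frozenset[str],
                 max_issues: int = 1) -> list[str]:
    """Return human-readable flags for an over-decomposed or out-of-scope queue."""
    flags = []
    # Aggregate check: the queue as a whole is too large for the request.
    if len(issues) > max_issues:
        flags.append(
            f"over-decomposed: {len(issues)} issues, expected at most {max_issues}"
        )
    for issue in issues:
        # Scope check: anything touched beyond the requested endpoints.
        extra = issue.endpoints - requested
        if extra:
            flags.append(f"{issue.issue_id} out of scope: {', '.join(sorted(extra))}")
        if issue.test_only:
            flags.append(f"{issue.issue_id} is a speculative test-only issue")
    return flags


GET_TASKS = frozenset({"GET /tasks"})

# The F5 queue as described above; placeholder titles are assumptions.
f5_queue = [
    Issue("ISS-001", "placeholder title", GET_TASKS),
    Issue("ISS-002", "Update POST /tasks", frozenset({"POST /tasks"})),
    Issue("ISS-003", "placeholder title", GET_TASKS),
    Issue("ISS-004", "placeholder test issue", GET_TASKS, test_only=True),
    Issue("ISS-005", "placeholder test issue", GET_TASKS, test_only=True),
]

flags = review_queue(f5_queue, requested=GET_TASKS)
for flag in flags:
    print(flag)
```

For a one-liner the budget is a single issue, so the F5 queue would be flagged on size alone before any per-issue check runs. How `max_issues` should scale with larger requests is left open here.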

Acceptance criteria

  • A trivial one-line request produces 1 issue (or a single trivial-slice path), not 5.
  • Slicer/planner prompts forbid out-of-scope additions and speculative work.
  • An evaluation fixture (the F5 request) yields an in-scope, minimal decomposition.
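
The F5 fixture could be sketched as a pytest-style test along the following lines. `slice_request` is a stand-in for the real slicer entry point, whose name and return shape are not given in this issue, so the dict keys used here are assumptions.

```python
F5_REQUEST = "include created_at in the GET /tasks response"


def slice_request(request: str) -> list[dict]:
    """Stand-in for the real slicer call (hypothetical).

    It returns the decomposition the acceptance criteria ask for:
    a single in-scope issue.
    """
    return [
        {
            "id": "ISS-001",
            "title": request,
            "endpoints": ["GET /tasks"],
            "test_only": False,
        }
    ]


def assert_minimal_in_scope(issues: list[dict], allowed_endpoints: set[str],
                            max_issues: int = 1) -> None:
    """Fail if the decomposition is too large, out of scope, or speculative."""
    assert len(issues) <= max_issues, f"expected <= {max_issues} issue(s), got {len(issues)}"
    for issue in issues:
        extra = set(issue["endpoints"]) - allowed_endpoints
        assert not extra, f"{issue['id']} touches out-of-scope endpoints: {sorted(extra)}"
        assert not issue["test_only"], f"{issue['id']} is a speculative test-only issue"


def test_f5_fixture() -> None:
    issues = slice_request(F5_REQUEST)
    assert_minimal_in_scope(issues, allowed_endpoints={"GET /tasks"})
```

In a real fixture the stub would be replaced by a call into the slicer, with the assertions left unchanged.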

Source: dogfood/ITERATION_REPORT.md MAJOR-3; dogfood/AUTOREVIEW_LOG.md queue entries.

Labels

  • area:agents (Worker / planner / grill / slicer agent quality)
  • area:pipeline (Scheduler / gates / decomposition pipeline)
  • bug (Something isn't working)
  • dogfood (Surfaced by the self-driving dogfood campaign)
  • major (Major: significant impact on cost, reliability, or correctness)
