Reviewer calibration gaps: aggregate over-decomposition + strictness tunable #8

## Summary
The queue reviewer and plan reviewer have calibration gaps. The queue reviewer judges each slice's coherence but **not aggregate over-decomposition**, and plan-reviewer strictness is load-bearing but not exposed as a tunable.

> ⚠️ The reviewer in the dogfood run was a **synthetic LLM judge**, not a human. These are real calibration signals, but human review DX is deferred to an attended run.

## Evidence (dogfood campaign)
- The queue reviewer **approved the over-sliced F5 queue** (5 issues + scope creep) — it never asks *"is this the right number of slices for the request?"*
- The plan reviewer, at full strictness, bounced a **sound trivial plan 3×** on style/truncation nits; recalibrating to "approve sound, request-changes only for substantive defects" then approved correctly. Calibration is load-bearing.
- F2 failed `doc_review` because a mandated request-changes verdict, combined with a demanding judge, never converged within 2 cycles. The revise→review loop can diverge with a cheap grill model.

## Proposed fix
- Add an **aggregate over-decomposition check** to the queue rubric ("right *number* of slices for the request?").
- Expose **reviewer strictness** (demanding vs demanding-but-fair) as an operator-tunable.
- Guard the revise→review loop against divergence (cap cycles / detect non-convergence with cheap models).
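
A minimal sketch of how the second and third items might fit together, not the project's actual implementation. Every name here (`Strictness`, `Verdict`, `LoopResult`, `judge`, `run_review_loop`) is hypothetical. It assumes the judge can separate substantive defects from style nits, and it treats "the same blocking defects came back after a revision" as the non-convergence signal, which is only one possible heuristic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Strictness(Enum):
    # Hypothetical operator-tunable levels, named after the issue text.
    DEMANDING = "demanding"
    DEMANDING_BUT_FAIR = "demanding-but-fair"


@dataclass
class Verdict:
    approved: bool
    defects: List[str]  # the defects that block approval


@dataclass
class LoopResult:
    status: str  # "approved", "non_convergent", or "cycle_cap_reached"
    cycles: int
    last_defects: List[str]


def judge(strictness: Strictness, substantive: List[str], nits: List[str]) -> Verdict:
    """Turn raw findings into a verdict; only DEMANDING lets style nits block."""
    blocking = list(substantive)
    if strictness is Strictness.DEMANDING:
        blocking += nits
    return Verdict(approved=not blocking, defects=blocking)


def run_review_loop(
    draft: str,
    review: Callable[[str], Verdict],
    revise: Callable[[str, List[str]], str],
    max_cycles: int = 2,
) -> LoopResult:
    """Run revise→review with a cycle cap and an explicit non-convergence status."""
    previous: Optional[frozenset] = None
    for cycle in range(1, max_cycles + 1):
        verdict = review(draft)
        if verdict.approved:
            return LoopResult("approved", cycle, [])
        current = frozenset(verdict.defects)
        # Same blocking defects as last cycle: the revision made no progress.
        if previous is not None and current == previous:
            return LoopResult("non_convergent", cycle, sorted(current))
        previous = current
        draft = revise(draft, verdict.defects)
    # Cap reached while defects were still changing; report it explicitly.
    return LoopResult("cycle_cap_reached", max_cycles, sorted(previous or []))
```

Returning a distinct status for `non_convergent` and `cycle_cap_reached` lets the caller report why the loop stopped instead of surfacing a generic failure.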

## Acceptance criteria
- [ ] Queue rubric includes an aggregate-slice-count check and flags over-decomposition.
- [ ] Reviewer strictness is a documented, operator-settable knob.
- [ ] The revise→review loop has a non-convergence guard rather than silently failing after N cycles.
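
One way the first criterion could be checked is sketched below, under loud assumptions: `Slice`, `review_queue`, and `expected_max_slices` are hypothetical names, and the sketch takes for granted that something upstream (the judge or the operator) supplies both an expected slice budget for the request and a per-slice "in scope" judgement. It only illustrates running an aggregate pass in addition to the existing per-slice coherence pass.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slice:
    title: str
    coherent: bool          # result of the existing per-slice coherence check
    in_request_scope: bool  # hypothetical: does the slice trace back to the request?


def review_queue(slices: List[Slice], expected_max_slices: int) -> List[str]:
    """Return a list of flags; an empty list means the queue is approved."""
    flags = []

    # Per-slice pass: what the queue reviewer already judges today.
    for s in slices:
        if not s.coherent:
            flags.append(f"incoherent slice: {s.title}")

    # Aggregate pass: is this the right number of slices for the request?
    if len(slices) > expected_max_slices:
        flags.append(
            f"over-decomposition: {len(slices)} slices, "
            f"expected at most {expected_max_slices}"
        )

    # Aggregate pass: slices that do not trace back to the request.
    creep = [s.title for s in slices if not s.in_request_scope]
    if creep:
        flags.append("scope creep: " + ", ".join(creep))

    return flags
```

With this shape, a queue whose slices are each coherent can still be flagged, which is the case the per-slice rubric misses.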

_Source: `dogfood/ITERATION_REPORT.md` MINOR-7; `dogfood/AUTOREVIEW_LOG.md`._

