Fix: Harden Task Planner acceptance criteria coverage (fixes #2148) #2149

Open
vdstrizhkova wants to merge 3 commits into microsoft:main from vdstrizhkova:feat/2148-acceptance-criteria-coverage

Conversation

@vdstrizhkova vdstrizhkova commented Jun 23, 2026

Contributor

Pull Request

Description

Hardens Task Planner and Plan Validator behavior so explicit acceptance criteria are treated as required, traceable planning inputs with blocking coverage validation before handoff.

Key updates:

  • Task Planner now extracts each acceptance criterion as a separate AC item with stable IDs (for example, AC-01, AC-02) and prevents criterion collapse.
  • Plans now require an Acceptance Criteria Coverage section that maps each AC to implementation steps, details references, validation evidence, and status.
  • Implementation details now require step-level AC references.
  • Added the Needs clarification status to support ambiguous criteria.
  • Plan Validator now enforces AC coverage mapping and reports AC-specific gaps with severity (High by default, Critical when blocking completion).
  • Validation evidence classification is explicit: Partial when evidence is incomplete/unvalidated, Missing when no plan evidence exists.
  • Planning Log discrepancy taxonomy now explicitly includes AC, DR, DD, and RI entries.
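
The first key update (one AC item per criterion, with stable IDs) can be illustrated with a minimal Python sketch. It is not taken from the agent files, which are prompt instructions rather than code. The names `AcceptanceCriterion` and `assign_ac_ids` are hypothetical, and the sketch assumes the criteria have already been split into one string per criterion.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriterion:
    """Hypothetical container for one extracted acceptance criterion."""

    ac_id: str  # stable identifier such as "AC-01"
    text: str   # criterion text, whitespace-trimmed but otherwise unchanged


def assign_ac_ids(criteria: list[str]) -> list[AcceptanceCriterion]:
    """Give every non-empty criterion its own ID, in source order.

    Criteria are never merged, so two similar criteria still produce
    two AC items (no "criterion collapse").
    """
    items: list[AcceptanceCriterion] = []
    for text in criteria:
        cleaned = text.strip()
        if not cleaned:
            continue  # skip blank entries; do not fold them into a neighbour
        items.append(AcceptanceCriterion(f"AC-{len(items) + 1:02d}", cleaned))
    return items


if __name__ == "__main__":
    for item in assign_ac_ids(["User can log in", "User can log out"]):
        print(item.ac_id, item.text)
```

Numbering by position keeps the IDs stable only while the source order is unchanged. How the real instructions keep IDs stable across edits is not stated in this PR description.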

Related Issue(s)

Closes #2148

Type of Change

Select all that apply:

Code & Documentation:

  • Bug fix (non-breaking change fixing an issue)
  • New feature (non-breaking change adding functionality)
  • Breaking change (fix or feature causing existing functionality to change)
  • Documentation update

Infrastructure & Configuration:

  • GitHub Actions workflow
  • Linting configuration (markdown, PowerShell, etc.)
  • Security configuration
  • DevContainer configuration
  • Dependency update

AI Artifacts:

  • Reviewed contribution with prompt-builder agent and addressed all feedback
  • Copilot instructions (.github/instructions/*.instructions.md)
  • Copilot prompt (.github/prompts/*.prompt.md)
  • Copilot agent (.github/agents/*.agent.md)
  • Copilot skill (.github/skills/*/SKILL.md)
  • Eval spec added/updated for changed AI artifacts (evals/)

Other:

  • Script/automation (.ps1, .sh, .py)
  • Other (please describe):

Sample Prompts (for AI Artifact Contributions)

User Request:
"Harden the Task Planner instructions so Jira acceptance criteria become explicit, traceable planning inputs with a blocking coverage check before handoff."

Execution Flow:

  1. Task Planner extracts explicit acceptance criteria from source artifacts and assigns stable AC IDs.
  2. Planner generates the required Acceptance Criteria Coverage table in the implementation plan.
  3. Planner includes step-level AC references in the implementation details.
  4. Plan Validator checks that each AC maps to plan step(s), details step(s), a success criterion, and a validation task.
  5. Validator records AC-prefixed discrepancies in the Planning Log and blocks handoff while unresolved Missing/Partial coverage remains.
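
Steps 4 and 5 above could be approximated in Python as follows. This is a sketch under stated assumptions, not the validator's actual logic: `CoverageRow`, `classify`, and `find_gaps` are hypothetical names, the "Covered" label is an assumed name for a fully mapped criterion, and "Partial" is interpreted here as "some but not all of the four mappings are present".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CoverageRow:
    """Hypothetical row of the Acceptance Criteria Coverage table."""

    ac_id: str
    plan_steps: list[str] = field(default_factory=list)
    details_steps: list[str] = field(default_factory=list)
    success_criterion: str = ""
    validation_task: str = ""


def classify(row: CoverageRow) -> str:
    """Return Missing, Partial, or Covered ("Covered" is an assumed label)."""
    present = [
        bool(row.plan_steps),
        bool(row.details_steps),
        bool(row.success_criterion),
        bool(row.validation_task),
    ]
    if not any(present):
        return "Missing"  # no plan evidence exists at all
    if not all(present):
        return "Partial"  # evidence exists but is incomplete
    return "Covered"


def find_gaps(rows: list[CoverageRow], blocks_completion: bool = False) -> list[str]:
    """List AC-prefixed discrepancies; High by default, Critical when blocking."""
    severity = "Critical" if blocks_completion else "High"
    gaps: list[str] = []
    for row in rows:
        status = classify(row)
        if status != "Covered":
            gaps.append(f"{row.ac_id}: {status} coverage ({severity})")
    return gaps


if __name__ == "__main__":
    rows = [
        CoverageRow("AC-01", ["Step 1.1"], ["Details 1.1"], "Login works", "Run login test"),
        CoverageRow("AC-02", ["Step 2.1"]),
        CoverageRow("AC-03"),
    ]
    for gap in find_gaps(rows):
        print(gap)
```

A non-empty result from `find_gaps` corresponds to the blocking condition in step 5. The sketch does not model the Needs clarification status.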

Output Artifacts:

  • Updated Task Planner and Plan Validator agent instructions under .github/agents/hve-core/.
  • Planning artifacts generated downstream now include explicit AC coverage rows and AC/DR/DD/RI discrepancy handling.

Success Indicators:

  • Every explicit acceptance criterion appears in the plan coverage table with a valid status.
  • No unresolved Missing/Partial AC rows remain at the completion gate.
  • Validator output includes AC-prefixed discrepancy entries when coverage gaps exist.
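
The first two success indicators can be expressed as a small completion-gate check. The Python sketch below is illustrative only: `gate_failures` is a hypothetical helper, the coverage table is assumed to be reduced to a mapping of AC ID to status, and the status set includes an assumed "Covered" label alongside the Partial, Missing, and Needs clarification statuses named in this PR.

```python
from __future__ import annotations

# Assumed status vocabulary; only Partial, Missing, and Needs clarification
# are named in the PR description. "Covered" is a placeholder label.
VALID_STATUSES = {"Covered", "Partial", "Missing", "Needs clarification"}
BLOCKING_STATUSES = {"Missing", "Partial"}


def gate_failures(ac_ids: list[str], table: dict[str, str]) -> list[str]:
    """Return the reasons the completion gate would fail; empty means pass."""
    failures: list[str] = []
    for ac_id in ac_ids:
        status = table.get(ac_id)
        if status is None:
            failures.append(f"{ac_id}: not in coverage table")
        elif status not in VALID_STATUSES:
            failures.append(f"{ac_id}: invalid status {status!r}")
        elif status in BLOCKING_STATUSES:
            failures.append(f"{ac_id}: unresolved {status} coverage")
    return failures


if __name__ == "__main__":
    print(gate_failures(["AC-01", "AC-02"], {"AC-01": "Covered", "AC-02": "Partial"}))
```

Whether Needs clarification should also block the gate is not stated in this description, so the sketch treats only Missing and Partial as blocking.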

Testing

  • Prompt Builder validation loop completed with Prompt Tester and Prompt Evaluator across 3 sandbox iterations.
  • Findings from early runs were remediated and revalidated.
  • Final evaluation reported no remaining prompt modifications required.

Checklist

Required Checks

  • Documentation is updated (if applicable)
  • Files follow existing naming conventions
  • Changes are backwards compatible (if applicable)
  • Tests added for new functionality (if applicable)

AI Artifact Contributions

  • Used /prompt-analyze to review contribution
  • Addressed all feedback from prompt-builder review
  • Verified contribution follows common standards and type-specific requirements

Required Automated Checks

The following validation commands must pass before merging:

  • Markdown linting: npm run lint:md
  • Spell checking: npm run spell-check
  • Frontmatter validation: npm run lint:frontmatter
  • Skill structure validation: npm run validate:skills
  • Link validation: npm run lint:md-links
  • PowerShell analysis: npm run lint:ps
  • Eval spec schema and coverage (if AI artifacts changed): npm run eval:lint:schema
  • Plugin freshness: npm run plugin:generate
  • Docusaurus tests: npm run docs:test

Security Considerations

  • This PR does not contain any sensitive or NDA information
  • Any new dependencies have been reviewed for security issues
  • Security-related scripts follow the principle of least privilege

Additional Notes

No new dependencies were introduced. Scope is limited to AI artifact instruction hardening in Task Planner and Plan Validator agent files.

- Extract each acceptance criterion as separate AC item with stable IDs
- Add Acceptance Criteria Coverage section mapping ACs to plan steps, details, validation tasks
- Clarify Partial vs Missing states for missing/unvalidated validation evidence
- Support 'Needs clarification' status for ambiguous acceptance criteria
- Add Reference Integrity example to Planning Log template
- Update Plan Validator to flag missing AC coverage as High/Critical findings
- Include AC, DR, DD, RI deltas in Planning Log contract
@vdstrizhkova vdstrizhkova changed the title from "Harden Task Planner acceptance criteria coverage (fixes #2148)" to "Fix: Harden Task Planner acceptance criteria coverage (fixes #2148)" Jun 23, 2026
codecov-commenter commented Jun 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.59%. Comparing base (d44ad1e) to head (0e8b4e5).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2149      +/-   ##
==========================================
+ Coverage   80.52%   80.59%   +0.06%     
==========================================
  Files         128      118      -10     
  Lines       19274    19198      -76     
  Branches       12        0      -12     
==========================================
- Hits        15521    15473      -48     
+ Misses       3750     3725      -25     
+ Partials        3        0       -3     

| Flag | Coverage Δ |
|------|------------|
| docusaurus | ? |
| pester | 84.23% <ø> (-0.02%) ⬇️ |

Flags with carried forward coverage won't be shown.
see 11 files with indirect coverage changes

@vdstrizhkova vdstrizhkova marked this pull request as ready for review June 23, 2026 17:11
@vdstrizhkova vdstrizhkova requested a review from a team as a code owner June 23, 2026 17:11

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: Harden Task Planner acceptance criteria coverage

2 participants