Address PR feedback: remove 'other' label, add comment for unclassifiable issues, clean up

MackinnonBuck · Copilot · MackinnonBuck · commit 71c2e1d4fe3a · 2026-03-25T15:46:28.000-07:00
- Remove 'other' from classification labels; agent now leaves a comment
  when an issue doesn't fit established categories
- Remove unused 'mode' and 'snapshot_text' workflow_dispatch inputs
- Add 'add-comment' safe-output for unclassifiable issue comments
- Reduce max labels from 3 to 2 (one classification + ai-triaged)
- Add scripts/corrections/README.md
- Clarify test timestamps relative to frozen clock
- Remove 'other' from CLASSIFICATION_LABELS constant

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
diff --git a/.github/workflows/issue-classification.lock.yml b/.github/workflows/issue-classification.lock.yml
diff --git a/.github/workflows/issue-classification.md b/.github/workflows/issue-classification.md
@@ -9,15 +9,6 @@ on:
         description: "Issue number to triage"
         required: true
         type: string
-      mode:
-        description: "Execution mode: live (default) or eval (future use)"
-        required: false
-        type: string
-        default: "live"
-      snapshot_text:
-        description: "Issue snapshot JSON for eval mode (future use): {title, body, author}"
-        required: false
-        type: string
   roles: all
 permissions:
   contents: read
@@ -30,8 +21,11 @@ tools:
 safe-outputs:
   staged: true
   add-labels:
-    allowed: [bug, enhancement, question, documentation, other, ai-triaged]
-    max: 3
+    allowed: [bug, enhancement, question, documentation, ai-triaged]
+    max: 2
+    target: triggering
+  add-comment:
+    max: 1
     target: triggering
 timeout-minutes: 10
 ---
@@ -40,18 +34,18 @@ timeout-minutes: 10
 
 You are an AI agent that classifies newly opened issues in the copilot-sdk repository.
 
-Your **only** job is to apply labels. You do not post comments, close issues, or modify issues in any other way.
+Your **only** job is to apply labels and, when necessary, leave a brief comment. You do not close issues or modify them in any other way.
 
 ## Your Task
 
 1. Fetch the full issue content using GitHub tools
 2. Read the issue title, body, and author information
-3. Follow the classification instructions below to determine the correct labels
-4. Apply the labels
+3. Follow the classification instructions below to determine the correct classification
+4. Take action:
+   - If the issue fits one of the established categories (`bug`, `enhancement`, `question`, `documentation`): apply that label **and** the `ai-triaged` label
+   - If the issue does **not** clearly fit any category: do **not** apply a classification label. Instead, leave a brief comment explaining why the issue couldn't be classified and that a human will review it. Still apply the `ai-triaged` label.
 
-You must apply:
-- **Exactly one** classification label (`bug`, `enhancement`, `question`, `documentation`, or `other`)
-- **The `ai-triaged` label** (always, alongside the classification label)
+You must always apply the `ai-triaged` label.
 
 {{#import shared/triage-classification.md}}
 
diff --git a/.github/workflows/shared/triage-classification.md b/.github/workflows/shared/triage-classification.md
@@ -4,7 +4,7 @@ You are classifying issues for the **copilot-sdk** repository — a multi-langua
 
 ## Classification Labels
 
-Apply **exactly one** of these routing labels to each issue:
+Apply **exactly one** of these routing labels to each issue. If none fit, see "Unclassifiable Issues" below.
 
 ### `bug`
 Something isn't working correctly. The issue describes unexpected behavior, errors, crashes, or regressions in existing functionality.
@@ -38,13 +38,9 @@ Examples:
 - "API reference for session.ui is outdated"
 - "Add migration guide from v1 to v2"
 
-### `other`
-The issue doesn't clearly fit any of the above categories. Use this for meta discussions, process questions, infrastructure issues, or anything that doesn't map to a specific routing category.
+## Unclassifiable Issues
 
-Examples:
-- "Proposal to restructure the monorepo"
-- "CI is failing on the main branch"
-- "License question about commercial use"
+If the issue doesn't clearly fit any of the above categories (e.g., meta discussions, process questions, infrastructure issues, license questions), do **not** apply a classification label. Instead, leave a brief comment explaining why the issue couldn't be automatically classified and that a human will review it.
 
 ## Classification Guidelines
 
diff --git a/scripts/corrections/README.md b/scripts/corrections/README.md
@@ -0,0 +1,56 @@
+# Correction Tracking Scripts
+
+TypeScript scripts that detect and record human corrections to the AI triage agent's issue classifications.
+
+## How it works
+
+When the AI triage agent classifies an issue, it applies a classification label (`bug`, `enhancement`, `question`, `documentation`) plus `ai-triaged`. If a maintainer disagrees, they remove the agent's label and apply the correct one. This system detects that change and records it as a **correction**.
+
+### Detection logic
+
+- **Correction**: A classification label is added while `ai-triaged` is present, and the timeline shows the same actor removed a different classification label within the last 2 minutes.
+- **Confirmation**: `ai-triaged` is removed without changing the classification label — the maintainer agrees with the agent.
+- **Late comments**: If the corrector adds a follow-up comment, it's appended to the correction record as context.
+
+### Where corrections go
+
+Corrections are stored as JSON files (`evals/corrections/issue-{N}.json`) on a `triage-corrections` branch. A single PR accumulates all corrections for human review before merging.
+
+### Guards
+
+- Only users with `write`, `maintain`, or `admin` permission can record corrections
+- Bot-triggered label events are ignored
+- Non-classification labels (e.g., `priority/high`) are ignored
+
+## Development
+
+```bash
+npm ci          # install dependencies
+npm run build   # compile TypeScript → dist/
+npm test        # run tests (vitest)
+npm run typecheck  # type-check without emitting
+```
+
+Or from the repo root:
+
+```bash
+just install-corrections
+just test-corrections
+just lint-corrections
+```
+
+## Files
+
+| File | Purpose |
+|------|---------|
+| `src/track-correction.ts` | Main detection logic (correction vs confirmation) |
+| `src/write-correction.ts` | Branch management, file writing, PR creation |
+| `src/update-context-comments.ts` | Appends late justification comments |
+| `src/types.ts` | Shared interfaces and constants |
+| `src/github-types.ts` | Type aliases for `actions/github-script` params |
+| `src/test-helpers.ts` | Mock factories for unit tests |
+| `src/integration.test.ts` | End-to-end scenarios with recording mock client |
+
+## Workflow
+
+These scripts are called by `.github/workflows/track-correction.yml`, which triggers on `issues.labeled`, `issues.unlabeled`, and `issue_comment.created` events. The workflow builds the TypeScript, then runs the compiled JS via `actions/github-script`.
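
Reviewer note: the README's detection rule for a correction can be sketched as a single pure function. This is a minimal illustration, not the code in `src/track-correction.ts`. The `TimelineEvent` shape, the `findCorrectedLabel` name, and the `WINDOW_MS` constant are hypothetical; only the label names and the 2-minute window come from the README and tests in this commit.

```typescript
// Hypothetical event shape; the real types live in src/types.ts and src/github-types.ts.
interface TimelineEvent {
  event: string;
  label?: { name: string };
  actor?: { login: string };
  created_at: string;
}

const CLASSIFICATION_LABELS = ['bug', 'enhancement', 'question', 'documentation'];
const REVIEW_LABEL = 'ai-triaged';
const WINDOW_MS = 2 * 60 * 1000; // assumed constant for the 2-minute window

// Returns the classification label that was replaced, or null if the
// label event does not look like a correction.
function findCorrectedLabel(
  addedLabel: string,
  actor: string,
  currentLabels: string[],
  timeline: TimelineEvent[],
  now: Date,
): string | null {
  // Only classification labels added while ai-triaged is present count.
  if (!CLASSIFICATION_LABELS.includes(addedLabel)) return null;
  if (!currentLabels.includes(REVIEW_LABEL)) return null;

  const match = timeline.find((e) => {
    const name = e.label?.name;
    if (e.event !== 'unlabeled' || name === undefined) return false;
    // The removed label must be a *different* classification label.
    if (!CLASSIFICATION_LABELS.includes(name) || name === addedLabel) return false;
    // The same actor must have removed it.
    if (e.actor?.login !== actor) return false;
    // The removal must fall inside the window ending at `now`.
    const ageMs = now.getTime() - new Date(e.created_at).getTime();
    return ageMs >= 0 && ageMs <= WINDOW_MS;
  });

  return match?.label?.name ?? null;
}
```

With the frozen clock used in the tests (`2026-01-15T12:00:00Z`), an `unlabeled` event for `enhancement` at `11:59:58Z` by the same actor would count as a correction when `bug` is added, while an `unlabeled` event for `bug` itself would not.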
diff --git a/scripts/corrections/src/track-correction.test.ts b/scripts/corrections/src/track-correction.test.ts
@@ -53,15 +53,15 @@ describe('trackCorrection', () => {
       currentLabels: ['bug', 'ai-triaged'],
     });
 
-    // Set up as a correction scenario with timeline
-    const twoSecondsAgo = new Date('2026-01-15T11:59:58Z').toISOString();
+    // System time is frozen at 2026-01-15T12:00:00Z (see beforeEach)
+    const withinWindow = new Date('2026-01-15T11:59:58Z').toISOString(); // 2s before frozen clock
     github.paginate
       .mockResolvedValueOnce([
         {
           event: 'unlabeled',
           label: { name: 'enhancement' },
           actor: { login: 'maintainer' },
-          created_at: twoSecondsAgo,
+          created_at: withinWindow,
         },
       ])
       .mockResolvedValueOnce([]); // comments
@@ -119,21 +119,21 @@ describe('trackCorrection', () => {
       currentLabels: ['bug', 'ai-triaged'],
     });
 
-    const twoSecondsAgo = new Date('2026-01-15T11:59:58Z').toISOString();
+    const withinWindow = new Date('2026-01-15T11:59:58Z').toISOString(); // 2s before frozen clock
     github.paginate
       .mockResolvedValueOnce([
         {
           event: 'unlabeled',
           label: { name: 'enhancement' },
           actor: { login: 'maintainer' },
-          created_at: twoSecondsAgo,
+          created_at: withinWindow,
         },
       ])
       .mockResolvedValueOnce([
         {
           user: { login: 'maintainer' },
           body: 'This is actually a bug, not an enhancement',
-          created_at: twoSecondsAgo,
+          created_at: withinWindow,
         },
       ]);
 
@@ -179,7 +179,7 @@ describe('trackCorrection', () => {
         event: 'unlabeled',
         label: { name: 'bug' }, // same label
         actor: { login: 'maintainer' },
-        created_at: new Date('2026-01-15T11:59:58Z').toISOString(),
+        created_at: new Date('2026-01-15T11:59:58Z').toISOString(), // within 2-min window
       },
     ]);
 
@@ -199,7 +199,7 @@ describe('trackCorrection', () => {
       currentLabels: ['bug', 'ai-triaged'],
     });
 
-    const recentTimestamp = new Date('2026-01-15T11:59:58Z').toISOString();
+    const recentTimestamp = new Date('2026-01-15T11:59:58Z').toISOString(); // within 2-min window
     github.paginate
       .mockResolvedValueOnce([
         {
@@ -250,8 +250,8 @@ describe('trackCorrection', () => {
     );
   });
 
-  // --- Timeline window ---
-  it('ignores old unlabel events outside 2-minute window', async () => {
+  // --- Timeline window (events must be within 2 minutes of current time) ---
+  it('ignores unlabel events outside 2-minute window', async () => {
     const context = mockLabelContext({
       action: 'labeled',
       labelName: 'bug',
diff --git a/scripts/corrections/src/types.ts b/scripts/corrections/src/types.ts
@@ -20,6 +20,6 @@ export interface ContextComment {
   created_at: string;
 }
 
-export const CLASSIFICATION_LABELS = ['bug', 'enhancement', 'question', 'documentation', 'other'];
+export const CLASSIFICATION_LABELS = ['bug', 'enhancement', 'question', 'documentation'];
 export const REVIEW_LABEL = 'ai-triaged';
 export const CORRECTIONS_BRANCH = 'triage-corrections';
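
Reviewer note: with `'other'` gone from `CLASSIFICATION_LABELS`, the agent's two outcomes (classified, or unclassifiable with a comment) can be modeled as below. This is a sketch under stated assumptions, not code from this repository: `planTriage`, `TriagePlan`, `isClassificationLabel`, `MAX_LABELS`, and the comment wording are hypothetical, and the `as const` annotation is added here only to derive a union type. The label names and the limit of 2 labels mirror the workflow's `safe-outputs` settings.

```typescript
// Same label names as the diff; `as const` is an assumption made for typing.
const CLASSIFICATION_LABELS = ['bug', 'enhancement', 'question', 'documentation'] as const;
type ClassificationLabel = (typeof CLASSIFICATION_LABELS)[number];

const REVIEW_LABEL = 'ai-triaged';
const MAX_LABELS = 2; // mirrors `add-labels.max: 2` in the workflow

// Hypothetical description of what the agent would emit for one issue.
interface TriagePlan {
  labels: string[];
  comment: string | null;
}

// Type guard: narrows an arbitrary string to a known classification label.
function isClassificationLabel(value: string): value is ClassificationLabel {
  return (CLASSIFICATION_LABELS as readonly string[]).includes(value);
}

// Maps a candidate classification to labels plus an optional comment.
function planTriage(candidate: string | null): TriagePlan {
  if (candidate !== null && isClassificationLabel(candidate)) {
    // Classified: one classification label plus ai-triaged, no comment.
    return { labels: [candidate, REVIEW_LABEL], comment: null };
  }
  // Unclassifiable: ai-triaged only, plus a brief comment for human follow-up.
  return {
    labels: [REVIEW_LABEL],
    comment: 'This issue could not be classified automatically; a maintainer will review it.',
  };
}
```

Under this model `planTriage('bug')` yields two labels and no comment, while `planTriage('other')` or `planTriage(null)` yields only `ai-triaged` and a comment, so neither path exceeds `MAX_LABELS`.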
