docs: expand Slack notification design with interaction models
Detailed design for agent-user communication during long CI loops:
- Natural pause points (after fix, after CI, when blocked)
- Review window concept (agent waits N minutes for feedback before pushing)
- Actionable notification content (what changed, why, confidence level)
- Four implementation options with tradeoffs:
  A) Incoming webhook (one-way, 5-min setup)
  B) Slack bot with thread-based replies (two-way, no callback server)
  C) Claude Code hooks bridge
  D) GitHub PR comments as notification channel
- Recommended progression path A → B
- Skill integration points for both local and CI loops
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -128,71 +128,205 @@ The orchestrator could automatically transition from Phase A to Phase B when loc
128
128
129
129
## Slack Notifications for Long-Running Loops
130
130
131
-
**Problem**: The CI iteration loop (`/iterate-ci-flaky`) runs for hours (each CI run takes ~2h). The user has no visibility into what the agent is doing until the session ends. By then, multiple fix-push-wait cycles may have happened with no chance for the user to intervene.
131
+
### The Problem
132
132
133
-
**Idea**: Optional Slack notifications at key moments, giving the user a chance to review and influence the next cycle.
133
+
The CI iteration loop (`/iterate-ci-flaky`) runs for hours: each CI run takes ~2h, and the loop may run 3-5 fix-push-wait cycles. During that time:
134
134
135
-
### Notification Events
135
+
- The user has no visibility into what the agent decided to fix or how
136
+
- By the time the loop finishes, multiple commits may have been pushed with no chance to course-correct
137
+
- A wrong fix in cycle 1 wastes 2+ hours of CI time before the agent discovers it didn't work
138
+
- The user may have domain context ("that test is flaky because of animation timing, not the selector") that would save cycles
136
139
137
-
| Event | When | Why the user cares |
138
-
|-------|------|-------------------|
139
-
|`fix_applied`| After committing and pushing a fix | User can review the diff before CI runs. Can reply "redo" or "don't change X" to influence next cycle |
140
-
|`ci_started`| After triggering `/test` or push | Confirmation that the loop is progressing |
141
-
|`ci_complete`| CI run finished (pass or fail) | User knows whether to check in or let it continue |
142
-
|`review_needed`| 5-commit threshold reached or blocking issue | User needs to act |
143
-
|`flaky_found`| Intermittent failure detected | User may have context about why |
144
-
|`blocked`| Agent stopped — REAL_REGRESSION, infra issue, or auth problem | Needs human input to continue |
145
-
|`iteration_done`| Full loop complete with summary | Final status |
140
+
The core tension: **autonomy vs oversight**. The agent should run independently, but the user needs the ability to intervene at natural pause points.
146
141
147
-
### Implementation Options
142
+
### Natural Pause Points
148
143
149
-
**Option A: Slack Incoming Webhook** (simplest)
150
-
- User creates a webhook for their channel: Slack → Apps → Incoming Webhooks
151
-
- Set `SLACK_WEBHOOK_URL` in `export-env.sh` or shell environment
- Use Claude Code's hook system to trigger notifications on specific events (tool calls, commits)
165
-
- Pro: Native to Claude Code, no external service
166
-
- Con: Hooks are local — would need forwarding to Slack
150
+
1. **After fix, before CI runs** (`fix_applied`): The agent committed a fix and is about to push (or just pushed). This is the highest-value notification: the user can review the approach and say "redo" before a 2-hour CI cycle starts.
167
151
168
-
### Recommended Approach
152
+
2. **After CI completes** (`ci_complete`): Results are in and the agent is about to diagnose. The user might have context about known issues.
169
153
170
-
Start with **Option A** (webhook). It's 5 minutes to set up and covers the primary need: visibility into what the agent is doing. The agent posts, the user reads. If the user wants to intervene, they message the agent directly in the Claude Code session.
154
+
3. **When blocked** (`blocked`): The agent can't continue and needs a human decision.
171
155
172
-
The `notify-slack.py` script would:
173
-
- Check if `SLACK_WEBHOOK_URL` is set — if not, skip silently (notifications are optional)
174
-
- Format messages with Slack Block Kit (sections, context with PR link, branch, CI URL)
175
-
- Be called by both skills at key points in the loop
156
+
### Review Window
176
157
177
-
### Configuration
158
+
For the `fix_applied` event, the agent could optionally **wait before pushing**, giving the user a time window to respond:
The key: show **what** changed, **why** the agent chose that fix, and **how confident** it is. This lets the user quickly decide "looks good, let it run" vs "wrong approach, let me intervene."
195
+
196
+
**`ci_complete`** — actionable status:
197
+
```
198
+
:white_check_mark: Agent: CI Complete — PASSED (run 2/5)
199
+
200
+
*Results:* 15/15 tests passed in 1h 47m
201
+
*Flakiness probe:* 2 of 5 confirmation runs complete, all green so far
202
+
203
+
*Next:* Triggering confirmation run 3. No action needed.
204
+
205
+
PR #860 | Branch: test/incident-robustness-2026-03-24 | CI Run
206
+
```
207
+
208
+
Or on failure:
209
+
```
210
+
:x: Agent: CI Complete — FAILED (iteration 2/3)
211
+
212
+
*Results:* 13/15 passed, 2 failed
213
+
*Failures:*
214
+
• "should filter by severity" — Timed out on `[data-test="severity-chip"]` (same as last run)
215
+
• "should display chart bars" — new failure, `Expected 5 bars, found 0`
216
+
217
+
*Assessment:*
218
+
• severity filter: same fix didn't work, will try different approach
219
+
• chart bars: new failure — possibly caused by previous fix (will investigate)
220
+
221
+
*Next:* Diagnosing and fixing. Will notify before pushing.
222
+
223
+
PR #860 | Branch: test/incident-robustness-2026-03-24 | CI Run
224
+
```
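Messages like the two `ci_complete` examples above could be assembled from structured run results rather than written by hand. The following is a minimal sketch under stated assumptions: `CiResult`, its field names, and `format_ci_complete` are hypothetical and not part of the skills, and the header uses a plain hyphen as separator.

```python
from dataclasses import dataclass, field


@dataclass
class CiResult:
    """Hypothetical container for one CI run's outcome (not an existing type)."""
    passed: int
    total: int
    run_index: int
    run_limit: int
    failures: list[str] = field(default_factory=list)
    next_step: str = ""


def format_ci_complete(result: CiResult) -> str:
    """Render a ci_complete notification as Slack mrkdwn text."""
    ok = result.passed == result.total
    emoji = ":white_check_mark:" if ok else ":x:"
    status = "PASSED" if ok else "FAILED"
    lines = [
        f"{emoji} Agent: CI Complete - {status} "
        f"(run {result.run_index}/{result.run_limit})",
        "",
        f"*Results:* {result.passed}/{result.total} tests passed",
    ]
    # List failing tests only when there are any, so a green run stays short.
    if result.failures:
        lines.append("*Failures:*")
        lines.extend(f"• {name}" for name in result.failures)
    # Always tell the user what happens next, if the caller supplied it.
    if result.next_step:
        lines += ["", f"*Next:* {result.next_step}"]
    return "\n".join(lines)
```

Keeping the formatter a pure function of the results makes it easy to reuse the same text for Slack or for PR comments.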
196
225
197
-
PR #860 | Branch: agentic-test-iteration | CI Run
226
+
**`blocked`** — requires user action:
198
227
```
228
+
:octagonal_sign: Agent: Blocked — REAL_REGRESSION
229
+
230
+
*Test:* "should display incident bars in chart"
231
+
*Issue:* Chart component renders empty. Screenshot shows the chart area with no bars, no error, no loading state.
232
+
*Commit correlation:* `src/components/incidents/IncidentChart.tsx` was modified in this PR (+45, -12)
233
+
234
+
*This is not a test issue* — the chart rendering logic appears broken. Agent cannot fix source code in Phase 1.
235
+
236
+
*Action needed:* Investigate the chart component refactor. Agent will stop iterating on this test.
user_replies = [r for r in replies["messages"] if r.get("user") != BOT_USER_ID]
278
+
if user_replies:
279
+
return user_replies[-1]["text"] # Return latest user feedback
280
+
time.sleep(30)
281
+
282
+
return None  # No feedback, proceed autonomously
283
+
```
284
+
285
+
**Option C: Claude Code hooks → Slack bridge**
286
+
- Configure a Claude Code hook that fires on `git commit` or specific tool calls
287
+
- The hook runs a shell script that posts to Slack
288
+
- Pro: Zero changes to the skills — hooks are external
289
+
- Con: Less control over notification content and timing. Can't implement review windows. Hooks are local config, not portable.
290
+
291
+
**Option D: GitHub PR comments as notification channel**
292
+
- Instead of Slack, the agent posts status updates as PR comments
293
+
- User replies directly on the PR
294
+
- Agent reads PR comments via `gh api` before proceeding
295
+
- Pro: No Slack setup at all. Everything stays in GitHub. Natural for code review context.
296
+
- Con: Noisier PR history. Not real-time (no push notifications unless GitHub notifications are configured).
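The "read PR comments before proceeding" step in Option D might look roughly like the sketch below. It assumes the agent posts under a known login (`bot_login`, hypothetical) and remembers the timestamp of its last status comment. The function names are illustrative. The `user.login`, `created_at`, and `body` fields come from GitHub's REST API for issue comments, which also serves PR conversation comments.

```python
import json
import subprocess
from typing import Optional


def latest_user_comment(
    comments: list[dict], bot_login: str, since_iso: str
) -> Optional[str]:
    """Return the body of the newest non-bot comment created after since_iso.

    GitHub returns UTC timestamps in a fixed ISO 8601 format, so plain string
    comparison orders them correctly.
    """
    candidates = [
        c for c in comments
        if c["user"]["login"] != bot_login and c["created_at"] > since_iso
    ]
    if not candidates:
        return None  # No feedback, proceed autonomously
    return max(candidates, key=lambda c: c["created_at"])["body"]


def fetch_pr_comments(repo: str, pr_number: int) -> list[dict]:
    """Fetch PR conversation comments via the gh CLI (gh must be authenticated).

    Only the first page of results is read here; a long thread would need
    pagination.
    """
    out = subprocess.run(
        ["gh", "api", f"repos/{repo}/issues/{pr_number}/comments"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

Separating the filter from the fetch keeps the decision logic testable without network access.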
297
+
298
+
### Recommended Progression
299
+
300
+
1. **Start with Option A** to get visibility. The user monitors passively and intervenes in the Claude Code session when needed.
301
+
2. **Upgrade to Option B** when the review window pattern proves valuable; it adds two-way interaction within Slack.
302
+
3. **Option D** is a good alternative if you prefer keeping everything in GitHub, especially for team use where the PR is the natural communication hub.
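As a starting point for Option A, a one-way webhook notifier could be as small as the sketch below. It assumes `SLACK_WEBHOOK_URL` holds a Slack incoming-webhook URL and treats notifications as optional by skipping silently when the variable is unset. The function names are illustrative, and the payload is plain text rather than Block Kit.

```python
import json
import os
import urllib.request


def build_payload(text: str) -> bytes:
    """Encode a plain-text message as the JSON body an incoming webhook accepts."""
    return json.dumps({"text": text}).encode("utf-8")


def notify_slack(text: str, timeout: float = 10.0) -> bool:
    """Post text to Slack; return False (without raising) if unset or on failure."""
    url = os.environ.get("SLACK_WEBHOOK_URL")
    if not url:
        return False  # Notifications are optional: skip silently
    req = urllib.request.Request(
        url,
        data=build_payload(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses; a failed
        # notification should never stop the iteration loop.
        return False
```

A skill could then call `notify_slack(...)` at each pause point and ignore the return value.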