Commit 84586e5

DavidRajnoha and claude committed
docs: add cloud execution options for long-running agent
Documents three approaches for running the agent without a local terminal session: headless mode (simplest), Claude Agent SDK (most flexible), and GitHub Actions (cloud, event-driven). Includes SDK vs CLI comparison, requirements to port skills, and a concrete GitHub Actions workflow triggered by PR comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 85fed91 commit 84586e5

1 file changed

Lines changed: 130 additions & 0 deletions

docs/agentic-test-iteration-ideas.md
@@ -330,3 +330,133 @@ Where notifications fire in each skill:
- Step 12: `flaky_found` (during flakiness probe)
- Step 13: `iteration_done` (final summary)
- Any step: `blocked` (on REAL_REGRESSION)

---

## Cloud Execution: Long-Running Autonomous Agent

**Problem**: The current setup requires a local machine with an active Claude Code CLI session. Long CI polling (~2h per run) causes session timeouts, and the user must keep a terminal open.

### Option 1: Claude Code Headless Mode (simplest)

Run Claude Code non-interactively without a TTY:

```bash
# -p is the short form of --print, so passing it once is enough;
# the prompt follows the flag as its argument.
claude -p "/iterate-ci-flaky pr=860 confirm-runs=5" \
  --dangerously-skip-permissions
```

- `-p` (long form `--print`): runs non-interactively, prints the result, and exits
- `--dangerously-skip-permissions`: skips all approval prompts (use only in sandboxed environments)
- Can run under `tmux` or `nohup`, in GitHub Actions, or on any CI runner
- Uses the same tools, skills, and CLAUDE.md as interactive mode
- Limitation: single-shot execution, so it runs the prompt and exits

**Deployment**: `nohup claude --print ... > output.log 2>&1 &` on any machine, or in a GitHub Actions runner.

### Option 2: Claude Agent SDK (most flexible)

The Agent SDK (`@anthropic-ai/claude-agent-sdk`, which replaced the SDK entry point formerly shipped in `@anthropic-ai/claude-code`) is a Node.js/TypeScript library that embeds Claude Code as a programmable agent:

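```typescript
// Sketch only: query() and the option names follow the Agent SDK's TypeScript
// API as we understand it; verify them against the installed SDK version.
import { query } from "@anthropic-ai/claude-agent-sdk";
import { Octokit } from "@octokit/rest";

const octokit = new Octokit({ auth: process.env.GH_TOKEN });

// query() streams messages; the final "result" message carries the text output.
let resultText = "";
for await (const message of query({
  prompt: "/iterate-ci-flaky pr=860 confirm-runs=5",
  options: {
    cwd: "/path/to/monitoring-plugin",
    // Skips approval prompts (sandbox only). Newer SDK versions may also
    // require allowDangerouslySkipPermissions: true.
    permissionMode: "bypassPermissions",
    settingSources: ["project"], // load CLAUDE.md and .claude/commands/
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    resultText = message.result;
  }
}

// Post result as PR comment
await octokit.rest.issues.createComment({
  owner: "openshift", repo: "monitoring-plugin",
  issue_number: 860, body: resultText,
});
```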

#### SDK vs CLI comparison

| Aspect | CLI (`claude`) | Agent SDK |
|--------|---------------|-----------|
| Runtime | Terminal process | Node.js library |
| Lifecycle | Single session, exits | Embed in any long-lived process |
| Event-driven | No | Yes (webhooks, timers, PR events) |
| Permissions | Interactive prompts or skip-all | Programmatic control |
| Tools | Built-in (Read, Write, Bash, etc.) | Same built-in + custom tools |
| State | Session-scoped | Persistent (DB, files, etc.) |
| Deployment | Local terminal | Anywhere Node.js runs |
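
One way to picture the "Programmatic control" row is a small decision function that the wrapping process consults before a tool call runs. The sketch below is illustrative only: `ToolCall`, `Decision`, `decide`, and the allowlist contents are hypothetical names, and how such a function would be wired into the SDK is not shown and should be checked against the SDK documentation.

```typescript
// Hypothetical shapes for illustration; not taken from the SDK.
type ToolCall = { tool: string; command?: string };
type Decision = { allow: true } | { allow: false; reason: string };

// Assumed policy: file tools are always fine, Bash only for known prefixes.
const SAFE_TOOLS = ["Read", "Write", "Edit"];
const BASH_PREFIXES = ["git ", "gh ", "npx cypress ", "python3 poll-ci-status.py"];

function decide(call: ToolCall): Decision {
  if (SAFE_TOOLS.indexOf(call.tool) !== -1) {
    return { allow: true };
  }
  if (call.tool === "Bash") {
    const cmd = (call.command ?? "").trim();
    if (BASH_PREFIXES.some((prefix) => cmd.startsWith(prefix))) {
      return { allow: true };
    }
    return { allow: false, reason: `Bash command not on allowlist: ${cmd}` };
  }
  return { allow: false, reason: `Tool not on allowlist: ${call.tool}` };
}
```

Prefix matching is deliberately simple here and would not stop a chained command such as `git status; rm -rf .`, so a sandboxed container remains the real safety boundary.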

#### Requirements to port current skills

- Node.js runtime with `@anthropic-ai/claude-agent-sdk`
- `ANTHROPIC_API_KEY` environment variable
- `gh` CLI authenticated (or GitHub App token for comment access)
- Git + SSH for pushing to fork
- The repo cloned in the agent's working directory
- All skill files (`.claude/commands/`) present in the clone

#### What stays the same

- Skills (`.md` files): the SDK reads them from `.claude/commands/`, provided project settings are loaded (for example via `settingSources: ["project"]`)
- Polling script (`poll-ci-status.py`): the SDK runs Bash the same way
- `/diagnose-test-failure`, `/analyze-ci-results`: all work as-is
- File editing, git operations, Cypress execution: identical

#### What changes

- No permission prompts: permissions are bypassed (`permissionMode: "bypassPermissions"`), so run only in a sandboxed container
- State between runs: persist to a file or DB instead of the ephemeral session
- Triggering: a webhook handler calls the SDK instead of a user typing a command
- Error recovery: the wrapping process can catch failures and retry
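
To make the "state between runs" point concrete, here is a minimal sketch of a state record that a wrapping process could write to a file or DB row after each step and read back on restart. The `IterationState` shape and the helper names are hypothetical; only the phase names `flaky_found`, `iteration_done`, and `blocked` come from the notification list earlier in this document, and `running` is an added assumption.

```typescript
// Hypothetical state shape; field names are illustrative.
type Phase = "running" | "flaky_found" | "iteration_done" | "blocked";

interface IterationState {
  pr: number;
  ciRunsCompleted: number;
  phase: Phase;
}

const PHASES: Phase[] = ["running", "flaky_found", "iteration_done", "blocked"];

// Serialize to a string that can be stored in a file or a DB column.
function serialize(state: IterationState): string {
  return JSON.stringify(state);
}

// Returns null for missing or malformed data so the caller can start fresh.
function restore(raw: string | null): IterationState | null {
  if (raw === null) return null;
  try {
    const v = JSON.parse(raw);
    const ok =
      typeof v === "object" &&
      v !== null &&
      Number.isInteger(v.pr) &&
      Number.isInteger(v.ciRunsCompleted) &&
      PHASES.indexOf(v.phase) !== -1;
    return ok
      ? { pr: v.pr, ciRunsCompleted: v.ciRunsCompleted, phase: v.phase }
      : null;
  } catch (_err) {
    return null;
  }
}

// A run is resumable unless it already finished or hit a real regression.
function shouldResume(state: IterationState | null): boolean {
  return (
    state !== null &&
    state.phase !== "iteration_done" &&
    state.phase !== "blocked"
  );
}
```

Validating on read keeps a corrupted or half-written state file from crashing the orchestrator; it simply starts a new iteration instead.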

### Option 3: GitHub Actions Workflow (cloud, event-driven)

A GitHub Actions workflow that runs the agent when a PR comment requests it:

```yaml
name: Flaky Test Iteration
on:
  issue_comment:
    types: [created]

jobs:
  iterate:
    # issue_comment also fires for plain issues, so require a pull request
    if: github.event.issue.pull_request && contains(github.event.comment.body, '/run-flaky-iteration')
    runs-on: ubuntu-latest
    env:
      GH_TOKEN: ${{ secrets.GH_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      # issue_comment runs check out the default branch; switch to the PR branch
      - name: Check out PR branch
        run: gh pr checkout ${{ github.event.issue.number }}
      - uses: actions/setup-node@v4
      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code
      - name: Run iteration
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        # Capture the printed result so the next step has a file to post
        # (assumes the skill does not write output.md itself)
        run: |
          claude -p "/iterate-ci-flaky pr=${{ github.event.issue.number }} confirm-runs=3" \
            --dangerously-skip-permissions > output.md
      - name: Post results
        run: gh pr comment ${{ github.event.issue.number }} --body-file output.md
```

**Flow**:
1. User comments `/run-flaky-iteration` on a PR
2. GitHub Actions triggers the workflow
3. Claude Code runs in headless mode on the Actions runner
4. Agent executes the full iteration loop (trigger CI, wait, analyze, fix, push)
5. Results posted back as a PR comment
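
Steps 1 to 3 amount to turning a comment body into an agent prompt. The sketch below shows one way a webhook handler or wrapper script might do that. `buildPrompt` is a hypothetical helper, and the optional `confirm-runs=N` argument inside the comment, its default of 3, and the 1 to 10 clamp are all assumptions for illustration; the workflow in this section hardcodes `confirm-runs=3`.

```typescript
// Hypothetical helper; the comment-argument syntax is an assumption.
const TRIGGER = "/run-flaky-iteration";

function buildPrompt(commentBody: string, prNumber: number): string | null {
  // Require the command at the start of a line, which is stricter than
  // a plain substring match on the whole comment.
  const line = commentBody
    .split("\n")
    .map((l) => l.trim())
    .find((l) => l === TRIGGER || l.startsWith(TRIGGER + " "));
  if (line === undefined) return null;

  // Optional "confirm-runs=N", clamped to 1..10; defaults to 3.
  const match = /\bconfirm-runs=(\d+)\b/.exec(line);
  const runs = match ? Math.min(Math.max(parseInt(match[1], 10), 1), 10) : 3;

  return `/iterate-ci-flaky pr=${prNumber} confirm-runs=${runs}`;
}
```

Building the prompt from a parsed integer rather than interpolating the raw comment text keeps user-supplied strings out of the command line.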

**Considerations**:
- GitHub-hosted runners cap each job at 6 hours, enough for 2-3 CI runs at ~2h each
- Needs `ANTHROPIC_API_KEY` and `GH_TOKEN` as repository secrets
- Runner needs an SSH key for git push (or use `GH_TOKEN` over HTTPS)
- Cost: API tokens consumed plus GitHub Actions minutes
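
The "2-3 CI runs" estimate follows from simple arithmetic on the 6-hour limit and the ~2h per run mentioned above. A small sketch, where `maxCiRuns` and the 30-minute overhead figure are hypothetical:

```typescript
// 360 and 120 come from the text (6h job limit, ~2h per CI run);
// the overhead value is an assumed allowance for setup, analysis, and fixes.
function maxCiRuns(
  jobLimitMin: number,
  perRunMin: number,
  overheadMin: number
): number {
  if (perRunMin <= 0) throw new RangeError("perRunMin must be positive");
  const usable = jobLimitMin - overheadMin;
  return usable <= 0 ? 0 : Math.floor(usable / perRunMin);
}

console.log(maxCiRuns(360, 120, 0)); // 3: leaves no time for analysis or fixes
console.log(maxCiRuns(360, 120, 30)); // 2: only 330 usable minutes
```

With any realistic overhead the budget drops to two full CI runs, which is worth keeping in mind when choosing `confirm-runs`.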

### Recommendation

1. **Start with headless mode** (`tmux` + `--print`) to validate that the flow works without interactive prompts
2. **Move to GitHub Actions** for true cloud execution: event-driven, no local machine needed
3. **Adopt the Agent SDK** when you want a custom orchestrator with richer state management, error recovery, or Slack integration beyond what the skills provide
