feat: per-job event stream + watchdog + compact recovery (1.1.0) #335
Closed: suminerProxy wants to merge 7 commits into openai:main from suminerProxy:feat/event-stream-foundation
Commits (7, author bit-star):
5e8100a test: isolate from host-runtime plugin env vars
c30b664 feat(state): add per-job NDJSON event stream API
a3f4181 feat(codex): add notification stream hook + normalize + surface usage
9332c29 feat(companion): wire per-job event stream + stall watchdog + events cmd
3a0ae2d fix(events): job/exited verdict + cover codex 0.131 schema gaps
1014d10 feat(phase-3): compact recovery + slash commands + rescue defaults to bg
f9f36fb chore(release): bump plugin version to 1.1.0 + CHANGELOG
@@ -1,5 +1,47 @@
# Changelog

## 1.1.0

Observability rework for the main-loop orchestration case: the consumer of
"what is codex doing right now" is the calling Claude session, not a human
dashboard. Adds a per-job NDJSON event stream the main loop can poll, plus
a protocol-native recovery path for context overflow.

- `/codex:events <job-id>`: new slash command. Streams normalized codex
  notifications from `{stateDir}/jobs/{jobId}.events.ndjson`. Supports
  `--since <iso>` / `--after-seq <n>` / `--limit <n>` / `--json` for
  incremental polling. Each event carries `seq`, `ts`, `method`, `phase`,
  `itemType`, `message`, and the raw payload.
- `/codex:compact <thread-id>`: new slash command wrapping codex
  app-server's `thread/compact/start`. Protocol-native recovery for
  "prompt too long"; the typical flow is cancel → compact → resume with an
  amended prompt via `/codex:rescue --resume`.
- `/codex:rescue` now defaults to `--background`. The main Claude loop
  receives a job id immediately and polls `/codex:events` instead of
  blocking on a synchronous Bash call; this removes the deadlock when
  codex stalls or errors silently.
- Per-job stall watchdog (60s default, override via
  `CODEX_COMPANION_STALL_SECONDS`) emits a `{type:"watchdog",
  phase:"stuck"}` event when codex produces no new notifications inside
  the window. The watchdog never cancels; the main loop decides whether
  to continue, compact, or cancel.
- New `{type:"job/exited"}` terminal event with `phase: completed|failed`
  and `exitCode`. This is the single source of truth for end-of-job;
  callers should not infer terminal state from job-level `status` alone.
- Surfaces token usage as a top-level field on `runAppServerTurn` and
  streams real-time usage via `thread/tokenUsage/updated` events
  (`phase: "metering"`).
- Coverage of codex CLI 0.131 notification methods extended to
  `thread/status/changed`, `warning`, `thread/tokenUsage/updated`, plus
  item types `userMessage`, `assistantMessage`/`agentMessage`, and
  `reasoning`. The `agentMessage` item now surfaces a content preview so
  the main loop can recognize codex's final reply from the event stream
  without fetching `/codex:result`.
- Test isolation: `tests/helpers.mjs` now unsets `CLAUDE_PLUGIN_DATA` and
  `CODEX_COMPANION_SESSION_ID` at module load. Plugin host runtimes (e.g.
  Claude Code) inject these vars; without isolation, two existing tests
  fail when contributors run `npm test` from inside a host.

## 1.0.0

- Initial version of the Codex plugin for Claude Code
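
The stall watchdog described in the 1.1.0 entry could look roughly like the sketch below. This is not the plugin's implementation: `createStallWatchdog`, `stallWindowMs`, the injected `now` clock, the `idleMs` field, and the choice to emit at most one `stuck` event per quiet period are all assumptions made for illustration. Only the event shape `{type:"watchdog", phase:"stuck"}`, the 60s default, and the `CODEX_COMPANION_STALL_SECONDS` override come from the changelog.

```javascript
// Hypothetical helper: names and behavior details are illustrative only.
function createStallWatchdog({ stallMs, emit, now = Date.now }) {
  let lastActivity = now();
  let reported = false; // assumption: one "stuck" event per quiet period

  return {
    // Call whenever codex produces a new notification.
    touch() {
      lastActivity = now();
      reported = false;
    },
    // Call periodically (for example from setInterval).
    // It only reports; it never cancels the job.
    check() {
      const idleMs = now() - lastActivity;
      if (idleMs >= stallMs && !reported) {
        reported = true;
        emit({ type: "watchdog", phase: "stuck", idleMs }); // idleMs is an added, hypothetical field
        return true;
      }
      return false;
    },
  };
}

// Stall window in milliseconds: 60s default, overridable through
// CODEX_COMPANION_STALL_SECONDS. Invalid or non-positive values fall back to 60.
function stallWindowMs(env = process.env) {
  const seconds = Number(env.CODEX_COMPANION_STALL_SECONDS);
  return (Number.isFinite(seconds) && seconds > 0 ? seconds : 60) * 1000;
}
```

Injecting the clock keeps the sketch testable without real timers; a worker would call `touch()` from its notification handler and `check()` on an interval.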
@@ -0,0 +1,33 @@
---
description: Trigger codex's protocol-native context compaction on a thread (recovery path for "prompt too long")
argument-hint: '<thread-id> [--json]'
disable-model-invocation: true
allowed-tools: Bash(node:*)
---

!`node "${CLAUDE_PLUGIN_ROOT}/scripts/codex-companion.mjs" compact "$ARGUMENTS"`

When to use:

- A Codex task has stalled or failed with a context-overflow error and you want to recover the thread without losing its history.
- `/codex:events <job-id>` shows `phase:"stuck"` for an extended period and `phase:"metering"` records indicate the token budget is near the limit.
- A previous turn returned `phase:"failed"` with a context-length error and you want to compact before resuming.

Typical recovery sequence:

1. `/codex:cancel <job-id>` if a turn is still running and stuck.
2. `/codex:compact <thread-id>` (this command). Returns immediately after the codex app-server acknowledges the compaction request.
3. `/codex:rescue --resume <amended prompt>`: resume the (now compacted) thread with a tighter prompt.

Output format:

- Without `--json`: prints `Compaction started on <thread-id>.` and a hint for the resume flow.
- With `--json`: returns `{attempted, compacted, transport, result, detail}`.

If the thread id is malformed or the app-server rejects the request, the command exits non-zero and emits the codex-side error in `stderr` (or in `detail` under `--json`).

Notes:

- Compaction runs codex-side after this command returns; it is not a synchronous wait.
- `thread/compact/start` is a streaming RPC in the app-server protocol, but this wrapper does not consume the stream; codex completes compaction in the background regardless of whether a stream consumer is active.
- The exact success payload shape from codex CLI is preserved verbatim under `result` for forward-compat; downstream consumers should not depend on its keys beyond what they have observed.
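
The recovery sequence above (cancel, compact, resume) can be expressed as a small planning helper. This is a minimal sketch: `recoverySteps` and its parameter names are hypothetical, and it only builds the slash-command strings in the documented order without executing anything.

```javascript
// Hypothetical helper: builds the documented recovery sequence as command strings.
function recoverySteps({ jobId, threadId, stillRunning, amendedPrompt }) {
  const steps = [];
  // Step 1 applies only when a turn is still running and stuck.
  if (stillRunning) steps.push(`/codex:cancel ${jobId}`);
  // Step 2 returns once the app-server acknowledges the request;
  // compaction itself continues codex-side.
  steps.push(`/codex:compact ${threadId}`);
  // Step 3 resumes the compacted thread with a tighter prompt.
  steps.push(`/codex:rescue --resume ${amendedPrompt}`);
  return steps;
}
```

Because step 2 does not wait for compaction to finish, a caller following this plan may want to confirm progress through `/codex:events` before relying on the resumed thread.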
@@ -0,0 +1,24 @@
---
description: Stream Codex notifications as a per-job NDJSON event log for poll-based monitoring
argument-hint: '<job-id> [--since <iso>] [--after-seq <n>] [--limit <n>] [--json]'
disable-model-invocation: true
allowed-tools: Bash(node:*)
---

!`node "${CLAUDE_PLUGIN_ROOT}/scripts/codex-companion.mjs" events "$ARGUMENTS"`

How to use the output:

- Each line in the stream is a normalized notification: `seq`, `ts`, `method`, `phase`, `itemType`, `message`, and `raw`.
- Treat `{type:"job/exited"}` as the single source of truth for terminal state: its `phase` is `completed` or `failed`, and `exitCode` reflects the codex turn outcome. Do not infer end-of-job from job-level `status` alone.
- `phase:"stuck"` records (emitted by the worker's stall watchdog) mean codex produced no new notifications for the configured stall window (default 60s, override with `CODEX_COMPANION_STALL_SECONDS`). The job is not cancelled; the main loop decides whether to keep waiting, run `/codex:compact <thread-id>` to recover from context overflow, or `/codex:cancel <job-id>` to abort.
- `phase:"warning"` records carry codex-side non-fatal conditions (context budget exceeded, capabilities removed). They are informational, not failures.
- `phase:"metering"` records come from `thread/tokenUsage/updated` and stream real-time token usage; poll them to detect "this turn is burning a lot of tokens" before hitting context overflow.
- Use `--after-seq <last-seq>` for incremental polling. If both `--after-seq` and `--since` are supplied, `--after-seq` wins. Default (no filter) returns all events for the job; the main loop is responsible for dedupe-by-seq.

Output format:

- Without `--json`: one human-readable line per event.
- With `--json`: `{jobId, eventsFile, count, events: [...]}`.

If no events exist yet for the job id, the command prints `No events yet for <job-id>.` and exits 0.
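
The filtering rules above can be sketched in a few lines. This is an illustration, not the command's actual code: `parseNdjson`, `selectEvents`, and `terminalEvent` are hypothetical names, and two details are assumptions the source does not state, namely that both filters are exclusive ("strictly after") and that `--limit` keeps the first `n` matches.

```javascript
// Parse NDJSON text into event objects, skipping blank lines.
function parseNdjson(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Assumed semantics: exclusive filters, and limit keeps the first n matches.
function selectEvents(events, { afterSeq, since, limit } = {}) {
  let selected = events;
  if (afterSeq !== undefined) {
    // --after-seq wins when both filters are supplied.
    selected = selected.filter((e) => e.seq > afterSeq);
  } else if (since !== undefined) {
    const cutoff = Date.parse(since);
    selected = selected.filter((e) => Date.parse(e.ts) > cutoff);
  }
  return limit !== undefined ? selected.slice(0, limit) : selected;
}

// The job/exited record, when present, is the source of truth for end-of-job.
function terminalEvent(events) {
  return events.find((e) => e.type === "job/exited") ?? null;
}
```

A polling loop would remember the highest `seq` it has seen, pass it as `afterSeq` on the next call, and stop once `terminalEvent` returns a record.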
This default-background guidance tells the main loop to poll with `--since <last-seq>`, but the events command treats `--since` as an ISO timestamp and uses `--after-seq` for sequence numbers. If a model follows this instruction with a numeric last sequence, ISO timestamps compare greater than that string and the poll returns old events repeatedly, defeating incremental monitoring; the example should use `--after-seq <last-seq>`.
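
The failure mode described in this comment can be reproduced in isolation. The sketch below assumes the `--since` filter compares `ts` strings lexicographically; the source does not show the actual comparison, and the sample events and timestamps are invented for the demonstration.

```javascript
// Made-up sample data; only the seq/ts field names come from the PR.
const sampleEvents = [
  { seq: 11, ts: "2025-01-01T00:00:01.000Z" },
  { seq: 12, ts: "2025-01-01T00:00:02.000Z" },
  { seq: 13, ts: "2025-01-01T00:00:03.000Z" },
];

const lastSeq = 12;

// Misuse: a sequence number passed where an ISO timestamp is expected.
// Assumed lexicographic comparison: "2025-..." > "12" because "2" > "1",
// so every event passes the filter and old events come back on each poll.
const viaSince = sampleEvents.filter((e) => e.ts > String(lastSeq));

// Intended: compare sequence numbers numerically, as --after-seq does.
const viaAfterSeq = sampleEvents.filter((e) => e.seq > lastSeq);
```

Under this assumption the outcome depends on the leading digit of the sequence number, so the misuse is unreliable in either direction; the numeric comparison returns only the event with `seq` 13.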