chore(ci): add Codex PR-review pipeline #45

Open: brettheap wants to merge 1 commit into main from chore/codex-pr-review

@brettheap (Contributor) commented Jun 2, 2026

Summary

Adds the OpenAI-Codex-powered PR review pipeline as repo CI infrastructure. These files had been sitting untracked on the FEAT-013 branch (013-managed-session-lifecycle); this PR splits them onto their own branch off main so FEAT-013 stays self-contained and the tooling is reviewable on its own.

What's included

  • .github/workflows/codex-pr-review.yml — read-only GitHub Action. Triggers on PR opened/synchronize/reopened/ready_for_review + manual workflow_dispatch. Skips drafts, checks out the PR merge ref, runs openai/codex-action@v1, posts the review as a PR comment (truncated at 60 KB) and uploads codex-output.md as an artifact (14-day retention). Needs the OPENAI_API_KEY repo secret.
  • .github/codex/prompts/pr-review.md — review prompt defining the mandatory 10-pass expert panel (architecture, performance, security, QA, reliability/concurrency, data/schema, API/contracts, observability, maintainability + coordinator) plus up to 5 dynamic technology-specific passes. AgentTower-specific focus: daemon tx boundaries, SQLite migrations, CLI/JSON contract drift, socket/peer-uid, tmux input safety, lifecycle-log vs JSONL separation.
  • .codex/agents/pr-review-*.toml — the 10 read-only panel agent definitions.
  • .codex/config.toml — panel runtime caps (max_threads=15, max_depth=1).
  • .gitignore — ignores .codex/speckit-claude-driver.json (host/bench-local operator config that must never be committed; relocated out of the repo).

These sit alongside the already-tracked .codex/skills/ that the prompt references.

Why the pipeline lives in this repo

GitHub Actions only runs workflows defined under .github/workflows/ in the repository where the event occurs, so this pipeline cannot live in a separate devBench/dotfiles repo; it has to be in AgentTower to review AgentTower PRs. (The one host-local file, speckit-claude-driver.json, was the exception and is now gitignored and relocated.)

Open question for review

The workflow currently uses sandbox: workspace-write even though the prompt is strictly read-only. Consider tightening to read-only for a review-only action — flagged for discussion, not changed here.

🤖 Generated with Claude Code

Summary by Sourcery

Introduce a Codex-powered, read-only automated PR review workflow and supporting configuration for AgentTower.

Enhancements:

  • Add Codex agent configuration and panel runtime limits for structured, read-only PR reviews.

CI:

  • Add a GitHub Actions workflow that runs an OpenAI Codex-based PR review on pull requests and manual dispatch, posting results as a comment and artifact.

Chores:

  • Ignore local Codex driver configuration from version control.

Adds the OpenAI-Codex-powered PR review GitHub Action and its panel,
split out of the FEAT-013 branch where it had been sitting untracked.

- .github/workflows/codex-pr-review.yml: read-only review action that
  runs on PR open/sync/reopen/ready and on workflow_dispatch, checks out
  the PR merge ref, runs openai/codex-action, and posts the result as a
  PR comment (+ uploads codex-output.md artifact).
- .github/codex/prompts/pr-review.md: review prompt defining the
  mandatory 10-pass expert panel + up to 5 dynamic tech-specific passes.
- .codex/agents/pr-review-*.toml: the 10 read-only panel agent defs.
- .codex/config.toml: panel runtime caps (max_threads/max_depth).
- .gitignore: ignore .codex/speckit-claude-driver.json (host/bench-local
  operator config that must never be committed).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 2, 2026 01:41

sourcery-ai Bot commented Jun 2, 2026

Reviewer's Guide

Adds a Codex-powered, read-only PR review GitHub Action workflow plus supporting prompt and agent configuration, enabling automated multi-pass expert-style reviews on AgentTower pull requests, and updates gitignore for a local Codex driver config file.

Sequence diagram for Codex PR review GitHub Action workflow

sequenceDiagram
    actor Developer
    participant GitHub as GitHub
    participant Workflow as codex-pr-review_workflow
    participant Script as github-script_pr
    participant Checkout as actions_checkout
    participant Codex as openai_codex_action
    participant Comment as github-script_comment
    participant Artifact as upload_artifact

    Developer->>GitHub: open_pull_request / update / workflow_dispatch
    GitHub-->>Workflow: trigger codex-pr-review

    Workflow->>Script: actions/github-script Resolve PR metadata
    Script-->>Workflow: PR number, refs, skip flag

    alt non_draft_PR
        Workflow->>Checkout: actions/checkout@v4 merge ref
        Workflow->>Workflow: git fetch base and head refs
        Workflow->>Codex: openai/codex-action@v1
        Codex-->>Workflow: codex-output.md, final-message

        Workflow->>Comment: actions/github-script createComment
        Comment-->>GitHub: PR review comment

        Workflow->>Artifact: actions/upload-artifact codex-output.md
        Artifact-->>GitHub: codex-output.md artifact
    else draft_PR
        Script-->>GitHub: notice Skipping draft PR
    end

File-Level Changes

Change: Introduce Codex PR review GitHub Actions workflow that can be triggered on PR activity or manually and posts a summarized Codex review as a PR comment plus artifact.
Details:
  • Adds a pull_request and workflow_dispatch-triggered workflow that skips draft PRs and supports manual PR number input.
  • Resolves PR metadata via actions/github-script, validates the PR number, and exposes base/head refs and SHAs via step outputs.
  • Checks out the PR merge ref, fetches the base branch and PR head for diffing, and runs openai/codex-action with repository-specific environment variables and prompt file.
  • Configures Codex to output to codex-output.md with workspace-write sandbox and drop-sudo safety strategy.
  • Posts the Codex final message (or fallback artifact content) as an issue comment, truncating to 60KB, and uploads codex-output.md as a 14-day artifact.
Files:
  • .github/workflows/codex-pr-review.yml

Change: Define a detailed PR-review prompt tailored to AgentTower and Codex, including a mandatory 10-agent expert panel and tech-specific passes.
Details:
  • Specifies strict read-only behavior and allowed operations for the Codex review run, including diff and history inspection guidance.
  • Describes expected PR context, environment variables, and preferred git diff commands including fallback behavior when head refs are unavailable.
  • Defines a mandatory 10-role expert panel and up to five dynamic technology-specific passes, with behavior when subagent spawning is unavailable.
  • Outlines review focus areas such as security, migrations, API contracts, observability, and operational risk, with severity levels and output format for findings.
Files:
  • .github/codex/prompts/pr-review.md

Change: Add Codex agent definitions for the mandatory expert panel with read-only sandbox mode and scoped instructions.
Details:
  • Creates per-role TOML agent definitions for architecture, performance, security, testing, reliability, data/schema, API/contracts, observability, maintainability, and a coordinator.
  • Each agent is configured as read-only and includes focused domain-specific guidance aligned with AgentTower concerns like SQLite migrations, tmux safety, socket permissions, and lifecycle logging.
Files:
  • .codex/agents/pr-review-api-contracts.toml
  • .codex/agents/pr-review-architecture.toml
  • .codex/agents/pr-review-coordinator.toml
  • .codex/agents/pr-review-data-schema.toml
  • .codex/agents/pr-review-maintainability.toml
  • .codex/agents/pr-review-observability.toml
  • .codex/agents/pr-review-performance.toml
  • .codex/agents/pr-review-reliability.toml
  • .codex/agents/pr-review-security.toml
  • .codex/agents/pr-review-testing.toml

Change: Configure Codex runtime limits for panel execution and ignore a host-local Codex driver config file from version control.
Details:
  • Adds a Codex config file limiting maximum concurrent agents/threads and recursion depth for panel execution.
  • Updates gitignore to exclude a local speckit-claude-driver JSON config file so it is not committed to the repository.
Files:
  • .codex/config.toml
  • .gitignore

github-actions Bot commented Jun 2, 2026

Codex PR review did not produce a final message. See the workflow logs and uploaded artifact for details.

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 1 issue

Comment on lines +101 to +110
- name: Post Codex review comment
  if: steps.pr.outputs.skip != 'true' && always()
  uses: actions/github-script@v7
  env:
    PR_NUMBER: ${{ steps.pr.outputs.number }}
    CODEX_FINAL_MESSAGE: ${{ steps.run_codex.outputs.final-message }}
  with:
    script: |
      const fs = require('fs');

suggestion: Truncation by raw byte count can split multi-byte characters and produce slightly corrupted UTF-8 in the comment body.

Using Buffer.byteLength with subarray(0, maxCommentBytes).toString('utf8') can cut in the middle of a multi-byte character, which Node then replaces with �. If you want to avoid any such corruption, consider truncating by code points instead (e.g., Array.from(body) or a helper that only cuts at character boundaries), or truncate by characters first and then enforce a slightly smaller byte limit.

Suggested implementation:

      - name: Post Codex review comment
        if: steps.pr.outputs.skip != 'true' && always()
        uses: actions/github-script@v7
        env:
          PR_NUMBER: ${{ steps.pr.outputs.number }}
          CODEX_FINAL_MESSAGE: ${{ steps.run_codex.outputs.final-message }}
        with:
          script: |
            const fs = require('fs');

            const MAX_COMMENT_BYTES = 65500;

            function truncateUtf8(body, maxBytes) {
              if (!body) return '';
              let bytes = 0;
              let result = '';

              for (const ch of body) {
                const chBytes = Buffer.byteLength(ch, 'utf8');
                if (bytes + chBytes > maxBytes) {
                  break;
                }
                result += ch;
                bytes += chBytes;
              }

              return result;
            }

            let body = truncateUtf8(
              (process.env.CODEX_FINAL_MESSAGE || '').trim(),
              MAX_COMMENT_BYTES,
            );

If there is existing truncation logic further down in this script that uses Buffer.from(body).subarray(0, maxCommentBytes).toString('utf8') (or similar), you should remove or adjust it to avoid double-truncating. In particular:

  1. Delete any previous byte-based truncation using Buffer slices, since truncateUtf8 already enforces the byte limit safely.
  2. If the workflow already defines a specific max-byte limit (e.g. via env or a constant), reuse that value instead of MAX_COMMENT_BYTES = 65500 to stay consistent with the rest of the file.
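
To make the failure mode in this comment concrete, here is a minimal Node.js sketch that compares slicing raw bytes with cutting at code-point boundaries. The function names `truncateNaive` and `truncateAtCodePoint` are hypothetical and exist only for this illustration; the second mirrors the `truncateUtf8` helper suggested above.

```javascript
// Naive approach: slice the UTF-8 bytes, then decode whatever is left.
function truncateNaive(text, maxBytes) {
  return Buffer.from(text, 'utf8').subarray(0, maxBytes).toString('utf8');
}

// Boundary-aware approach: stop before any code point that would overflow.
function truncateAtCodePoint(text, maxBytes) {
  let bytes = 0;
  let result = '';
  for (const ch of text) { // string iteration yields whole code points
    const size = Buffer.byteLength(ch, 'utf8');
    if (bytes + size > maxBytes) break;
    result += ch;
    bytes += size;
  }
  return result;
}

// '€' takes 3 bytes in UTF-8, so 'ab€' is 5 bytes in total.
const sample = 'ab€';
const naive = truncateNaive(sample, 3);      // cut lands inside '€'
const safe = truncateAtCodePoint(sample, 3); // cut lands before '€'

console.log(naive.includes('\uFFFD')); // true: a replacement character appears
console.log(safe === 'ab');            // true: no partial character is kept
```

The boundary-aware version can return slightly fewer bytes than the limit, which is the trade-off for never emitting a partial character.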

sonarqubecloud Bot commented Jun 2, 2026

Copilot AI left a comment

Pull request overview

Adds a Codex-powered PR review pipeline to AgentTower’s CI so PRs can be automatically reviewed with a structured, read-only prompt and panel agent configuration.

Changes:

  • Add a new GitHub Actions workflow to run openai/codex-action@v1, comment results on PRs, and upload the review output as an artifact.
  • Add the PR review prompt plus panel agent definitions and runtime caps under .codex/.
  • Ignore a host-local Codex driver config file to prevent accidental commits.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| .gitignore | Ignores host-local Codex driver config file. |
| .github/workflows/codex-pr-review.yml | New workflow to run Codex review and post/upload outputs. |
| .github/codex/prompts/pr-review.md | Read-only PR review prompt (panel + procedure + output format). |
| .codex/config.toml | Sets Codex agent runtime caps (max_threads, max_depth). |
| .codex/agents/pr-review-testing.toml | QA/testing expert agent definition. |
| .codex/agents/pr-review-security.toml | Security expert agent definition. |
| .codex/agents/pr-review-reliability.toml | Reliability/concurrency expert agent definition. |
| .codex/agents/pr-review-performance.toml | Performance expert agent definition. |
| .codex/agents/pr-review-observability.toml | Observability/ops expert agent definition. |
| .codex/agents/pr-review-maintainability.toml | Maintainability expert agent definition. |
| .codex/agents/pr-review-data-schema.toml | Data/schema/migrations expert agent definition. |
| .codex/agents/pr-review-coordinator.toml | Review coordinator agent definition. |
| .codex/agents/pr-review-architecture.toml | Architecture expert agent definition. |
| .codex/agents/pr-review-api-contracts.toml | API/contracts expert agent definition. |

Comment on lines +17 to +20
permissions:
  contents: read
  pull-requests: write
  issues: write

Comment on lines +95 to +100
openai-api-key: ${{ secrets.OPENAI_API_KEY }}
prompt-file: .github/codex/prompts/pr-review.md
output-file: codex-output.md
safety-strategy: drop-sudo
sandbox: workspace-write

Comment on lines +25 to +26
Reuse those local checks. Do not duplicate or supersede them with stale generic
instructions. If a local skill is missing, continue with this prompt.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e3e7f38e31

  PR_TITLE: ${{ steps.pr.outputs.title }}
with:
  openai-api-key: ${{ secrets.OPENAI_API_KEY }}
  prompt-file: .github/codex/prompts/pr-review.md

[P2] Keep the review prompt out of PR control

When a PR modifies .github/codex/prompts/pr-review.md, this job first checks out the PR merge ref and then passes prompt-file from that same checkout, so the code being reviewed can rewrite the reviewer’s instructions and suppress or distort its own review. Fetch the trusted prompt from the base/default branch, or inline/pin the prompt outside the PR-controlled workspace before invoking Codex.
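
One way to follow this suggestion is to read the prompt from the base ref through the GitHub contents API instead of from the PR checkout. The sketch below assumes it runs inside an actions/github-script step, where an authenticated `github` Octokit client is available. `loadTrustedPrompt` and its parameter names are hypothetical, and whether openai/codex-action accepts a prompt file written outside the workspace is an assumption to verify before relying on it.

```javascript
// Hypothetical helper: read a file as it exists on a trusted ref,
// not as it exists in the PR-controlled workspace.
async function loadTrustedPrompt(github, { owner, repo, baseRef, path }) {
  const { data } = await github.rest.repos.getContent({
    owner,
    repo,
    path,
    ref: baseRef, // e.g. the PR's base branch or base SHA
  });

  // getContent returns an array for directories; only a single file is valid here.
  if (Array.isArray(data) || data.type !== 'file') {
    throw new Error(`${path} is not a regular file at ${baseRef}`);
  }

  // File content comes back base64-encoded.
  return Buffer.from(data.content, 'base64').toString('utf8');
}
```

A calling step could write the returned text to a runner-owned location and point the review at that copy, so edits to the prompt inside the PR have no effect on the review of that PR.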

  PR_HEAD_REPO: ${{ steps.pr.outputs.head_repo }}
  PR_TITLE: ${{ steps.pr.outputs.title }}
with:
  openai-api-key: ${{ secrets.OPENAI_API_KEY }}

[P2] Skip fork PRs when the API key is unavailable

For PRs opened from forks, GitHub does not pass repository secrets to pull_request workflows except GITHUB_TOKEN, so this input is empty and the Codex action cannot start its Responses API proxy for those PRs. Because the trigger covers all pull_request events without a same-repository or secret-present guard, external contributor PRs get a failing review check instead of a review; gate this step/job to same-repo PRs or move secret-backed review to a trusted workflow pattern.
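
The guard described here could be expressed in the existing PR-metadata step. This is a minimal sketch, assuming the standard pull request payload shape (`draft`, `head.repo.full_name`, `base.repo.full_name`). The function name `decideReview` and the reason strings are hypothetical.

```javascript
// Hypothetical decision helper: should the secret-backed review run for this PR?
function decideReview(pr, hasApiKey) {
  if (pr.draft) {
    return { run: false, reason: 'draft' };
  }

  // head.repo can be null when the source fork has been deleted.
  const headRepo = pr.head.repo ? pr.head.repo.full_name : null;
  if (headRepo !== pr.base.repo.full_name) {
    return { run: false, reason: 'fork' };
  }

  if (!hasApiKey) {
    return { run: false, reason: 'missing-api-key' };
  }

  return { run: true, reason: 'same-repo' };
}

// Example with a fork PR: the review is skipped instead of failing.
const forkPr = {
  draft: false,
  head: { repo: { full_name: 'someone/AgentTower' } },
  base: { repo: { full_name: 'upstream/AgentTower' } },
};
console.log(decideReview(forkPr, false)); // { run: false, reason: 'fork' }
```

Returning a reason alongside the flag lets the workflow print a notice that explains the skip, much as it already does for draft PRs.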

@brettheap brettheap (Contributor, Author) left a comment

Review panel: standard 10 passes executed as a swarm across architecture, security, CI/reliability, QA, API/contracts, and maintainability, plus GitHub Actions-specific checks.

Findings

  • [P1] The review trust boundary still collapses back into PR-controlled content. The existing unresolved comment on prompt-file is valid, and there is a second path here: the prompt immediately tells Codex to re-import repo-local review skills from the checked-out PR workspace, so even a trusted top-level prompt would still hand authority back to PR-editable files.
  • [P1] The workflow already fails on this PR’s own Codex PR review run (run 26793098629). The Run Codex review step failed, but the workflow still posted a generic fallback comment and continued, so readers see a comment even though no review completed.
  • [P2] The prompt prefers the local AgentTower review skill path, but the Run Codex review step does not export GH_TOKEN/GITHUB_TOKEN. Those skills start with gh pr view / gh api graphql, so the preferred workflow path is not actually authenticated in Actions as written.

Residual Risk / Test Gaps

  • The unresolved fork/secret-path concern already on the PR still looks valid.
  • The workflow still appends a fresh top-level comment on every rerun, which will make review state hard to track across pushes.
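
One common way to address the comment-per-rerun gap is to tag the bot comment with a hidden marker and update it in place. The sketch below assumes an actions/github-script context with an authenticated `github` client. The `MARKER` string and the `upsertReviewComment` name are hypothetical choices for this example.

```javascript
// Hypothetical hidden marker that identifies this workflow's comment.
const MARKER = '<!-- codex-pr-review -->';

// Update the existing marked comment if there is one; otherwise create it.
async function upsertReviewComment(github, { owner, repo, issue_number, body }) {
  const comments = await github.paginate(github.rest.issues.listComments, {
    owner,
    repo,
    issue_number,
    per_page: 100,
  });

  const fullBody = `${MARKER}\n${body}`;
  const existing = comments.find((c) => c.body && c.body.includes(MARKER));

  if (existing) {
    await github.rest.issues.updateComment({
      owner,
      repo,
      comment_id: existing.id,
      body: fullBody,
    });
    return 'updated';
  }

  await github.rest.issues.createComment({ owner, repo, issue_number, body: fullBody });
  return 'created';
}
```

Updating in place keeps a single review comment per PR. The cost is that earlier review text is overwritten, so the uploaded artifact remains the record of each individual run.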

Merge Readiness

Not ready. The review workflow is not yet trustworthy or self-validating enough to serve as a required CI review surface.


## Repository Review Skill

Before reviewing, inspect the local AgentTower review guidance if present:

This prompt re-imports repo-local review guidance from the checked-out PR workspace before doing anything else. That means even if you later pin prompt-file to a trusted ref, the review policy still collapses back into PR-controlled files through .codex/skills/... / .agents/skills/.... In this repo the local AgentTower review skill is not passive documentation either: it prescribes the review workflow and starts with gh-based metadata/thread inspection. For an Actions review path, the prompt/skill/agent bundle all need to come from a trusted ref; otherwise a PR can steer its own reviewer by moving logic into the local skill text.
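
As a complementary check, the workflow could detect when a PR edits the reviewer's own bundle and treat that as a reason to require human review. This is a rough sketch: the prefix list is assembled from the paths mentioned in this PR, and `reviewBundleChanges` is a hypothetical name. The list of changed paths is assumed to come from something like the pull request files API.

```javascript
// Paths that define the reviewer's behaviour, taken from this PR's file list.
const REVIEW_BUNDLE_PREFIXES = [
  '.github/codex/',
  '.codex/',
  '.agents/skills/',
  '.github/workflows/codex-pr-review.yml',
];

// Return the changed paths that belong to the review bundle.
function reviewBundleChanges(changedPaths) {
  return changedPaths.filter((path) =>
    REVIEW_BUNDLE_PREFIXES.some((prefix) => path.startsWith(prefix)),
  );
}

const touched = reviewBundleChanges([
  'src/daemon.py',
  '.codex/agents/pr-review-security.toml',
  '.github/codex/prompts/pr-review.md',
]);
console.log(touched.length); // 2
```

This does not replace loading the prompt, skills, and agents from a trusted ref. It only makes it visible when a PR attempts to change them.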

- name: Run Codex review
  if: steps.pr.outputs.skip != 'true'
  id: run_codex
  uses: openai/codex-action@v1

This step never exports GH_TOKEN or GITHUB_TOKEN, but the prompt above tells Codex to inspect the local AgentTower review skill before reviewing, and that skill begins with gh pr view / gh api graphql. In GitHub Actions, gh is not automatically authenticated unless a token is exported into the environment, so the preferred local-skill path the prompt asks Codex to follow is not actually runnable as written. Either provide an authenticated GH_TOKEN to the review step or stop instructing the reviewer to rely on gh-based local skills in this environment.
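
A small preflight check can surface this problem before the review starts. The sketch below relies on the documented `gh` behaviour of reading `GH_TOKEN` and then `GITHUB_TOKEN` for github.com. The function names are hypothetical, and where the check would run in the workflow is left open.

```javascript
// Report which environment variable gh would authenticate with, if any.
function ghTokenSource(env) {
  if (env.GH_TOKEN) return 'GH_TOKEN';
  if (env.GITHUB_TOKEN) return 'GITHUB_TOKEN';
  return null;
}

// Fail fast when gh-based skills would run unauthenticated.
function assertGhAuthenticated(env = process.env) {
  const source = ghTokenSource(env);
  if (source === null) {
    throw new Error('gh is not authenticated: export GH_TOKEN or GITHUB_TOKEN for this step');
  }
  return source;
}
```

The alternative named in the comment, dropping the instruction to use gh-based skills in this environment, would make such a check unnecessary.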

  body = fs.readFileSync('codex-output.md', 'utf8').trim();
}

if (!body) {

This fallback path masks a broken review as if it merely produced no content. On this PR’s own introducing run (Codex PR review, run 26793098629), the Run Codex review step failed before producing output, but this step still posted the generic fallback comment and the artifact step had nothing useful to upload. That makes the PR conversation look like a review completed when the action actually died upstream. The workflow should fail explicitly when Codex does not produce a final message or codex-output.md, rather than converting a failed review into a generic comment.
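
The explicit-failure behaviour asked for here could look roughly like the following. It is a sketch: `resolveReviewBody` is a hypothetical helper, and `codexOutcome` is assumed to be wired from `steps.run_codex.outcome`, which GitHub Actions sets to `success`, `failure`, `cancelled`, or `skipped`.

```javascript
// Return the review text, or throw so the step fails instead of posting a fallback.
function resolveReviewBody({ codexOutcome, finalMessage, outputFileText }) {
  if (codexOutcome !== 'success') {
    throw new Error(`Codex review step ended with outcome "${codexOutcome}"`);
  }

  // Prefer the final message; fall back to the contents of codex-output.md.
  const body = (finalMessage || '').trim() || (outputFileText || '').trim();
  if (!body) {
    throw new Error('Codex review produced neither a final message nor codex-output.md content');
  }
  return body;
}
```

Inside actions/github-script, an uncaught error from the script fails the step, so the PR shows a failed check and no comment that could be mistaken for a completed review.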
