[codex] add live provider E2E workflow #175
Codex review: needs real behavior proof before merge. Reviewed May 31, 2026, 3:38 AM ET / 07:38 UTC.
Summary
Reproducibility: not applicable. This is a new maintainer-run workflow feature, not a bug report. The PR defects are source-reproducible from the workflow env block and the mismatch with current provider docs/registry.
Review metrics: 3 noteworthy metrics.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Proof guidance:
Risk before merge
Maintainer options:
Next step before merge
Security Review findings
Review details
Best possible solution: Land a narrowed manual workflow that derives or tests provider coverage against Crabbox's registered providers, scopes only selected-provider secrets into the steps that need them, and includes redacted live workflow/script proof.
Do we have a high-confidence way to reproduce the issue? Not applicable; this is a new maintainer-run workflow feature, not a bug report. The PR defects are source-reproducible from the workflow env block and the mismatch with current provider docs/registry.
Is this the best way to solve the issue? No; the proposed direction is useful, but the implementation should scope secrets per selected provider and derive or test the matrix from the registered provider source before it is the maintainable solution.
Full review comments:
Overall correctness: patch is incorrect
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 9a42efcdaefc.
Label changes
Label justifications:
Evidence reviewed
Security concerns:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
How this review workflow works
ClawSweeper PR egg 🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.
It would be nice to have CI/CD smoke tests for the providers and local sandboxes across machines and architectures. I am marking this as ready for review so maintainers can consider it and help develop it into a working CI/CD workflow.
Summary
- `Live Provider E2E` GitHub Actions workflow that expands a selected provider list into one smoke job per built-in provider
- `scripts/live-provider-e2e.sh` to run fast provider-specific smoke checks, skip missing secret/tool setup when requested, and clean up leases where supported
- `docs/features/live-provider-e2e.md`
Notes
The workflow is `workflow_dispatch` only, defaults to `allow_missing=true`, and does not expose provider secrets to pull request jobs. Maintainers can switch `allow_missing=false` once the selected provider secrets are expected to be present.
Some providers still require a runner-side CLI or provider-side resource that cannot be represented as a GitHub secret, such as `devbox`, `sprite`, an existing Railway service, or a self-hosted Parallels host. Those requirements are called out in the doc.
Verification
- `bash -n scripts/live-provider-e2e.sh`
- `ruby -e 'require "yaml"; YAML.load_file(".github/workflows/live-provider-e2e.yml"); puts "workflow yaml ok"'`
- `go build -trimpath -o /tmp/crabbox-provider-e2e ./cmd/crabbox`
- `node scripts/check-docs-links.mjs`
- `CRABBOX_LIVE=1 CRABBOX_LIVE_SKIP_MISSING=1 CRABBOX_BIN=/tmp/crabbox-provider-e2e scripts/live-provider-e2e.sh aws`
- `CRABBOX_LIVE=1 CRABBOX_LIVE_SKIP_MISSING=1 CRABBOX_BIN=/tmp/crabbox-provider-e2e scripts/live-provider-e2e.sh e2b`
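For readers who have not opened the script, the two behaviors described above (expanding a provider list into one job per provider, and skipping when a secret is missing) could look roughly like the bash sketch below. This is not the PR's `scripts/live-provider-e2e.sh`. The function names `providers_to_json` and `require_env`, the skip exit code 77, and the variable `EXAMPLE_PROVIDER_TOKEN` are assumptions made for illustration; only `CRABBOX_LIVE_SKIP_MISSING` comes from the verification commands.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names other than CRABBOX_LIVE_SKIP_MISSING are hypothetical.

# Turn a comma-separated provider list ("aws, e2b") into a JSON array,
# the shape a workflow could feed to a job matrix.
providers_to_json() {
  local list="$1" item out=""
  local -a items
  IFS=',' read -r -a items <<< "$list"
  for item in "${items[@]}"; do
    item="${item//[[:space:]]/}"   # drop stray whitespace around names
    [ -n "$item" ] || continue     # ignore empty entries such as "aws,,e2b"
    out+="${out:+,}\"$item\""
  done
  printf '[%s]\n' "$out"
}

# Check that every named environment variable is set and non-empty.
# Returns 0 when all are present. When something is missing, returns 77
# (an arbitrary "skipped" code chosen for this sketch) if
# CRABBOX_LIVE_SKIP_MISSING=1, and 1 otherwise.
require_env() {
  local name missing=""
  for name in "$@"; do
    if [ -z "${!name:-}" ]; then
      missing+="${missing:+ }$name"
    fi
  done
  [ -z "$missing" ] && return 0
  if [ "${CRABBOX_LIVE_SKIP_MISSING:-0}" = "1" ]; then
    echo "skip: missing $missing" >&2
    return 77
  fi
  echo "fail: missing $missing" >&2
  return 1
}
```

With these definitions, `providers_to_json "aws, e2b"` prints `["aws","e2b"]`, and `require_env EXAMPLE_PROVIDER_TOKEN` returns 77 instead of 1 when the variable is unset and `CRABBOX_LIVE_SKIP_MISSING=1`. A distinct skip code lets a caller tell "not configured" apart from "smoke check failed", which is the distinction `allow_missing` is meant to surface.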