Finish Line
Codex Lab is dogfoodable as the active Codex-based coding CLI, with the restored Every Code history used as source material and with regression gates that protect skills behavior, prompt caching, thread continuity, and agent/review workflows before upstream merges or feature ports land.
Current Status
State: The codex-lab dogfood launcher is merged in PR #114, and local dogfood can start from PATH with CODEX_LAB_HOME defaulting to ~/.codex-lab. The next dogfood blocker is multi-auth: users need to manage multiple accounts, keep sessions sticky for prompt-cache locality, fall back on quota/auth failures, and optionally prime lazy weekly reset windows.
New active plan:
Current MVP sequencing:
[…] CODEX_LAB_HOME, then explicit profile selection, auto-sticky routing, and opt-in reset priming.
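The sticky-session and fallback behavior named in the multi-auth blocker could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Codex Lab's implementation: `ProfileRouter`, `FailureKind`, and the profile names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class FailureKind(Enum):
    """Hypothetical failure classes that would trigger a fallback."""
    QUOTA = "quota"
    AUTH = "auth"


@dataclass
class ProfileRouter:
    """Keeps each session on one profile until that profile fails."""
    profiles: list[str]                                    # ordered by preference
    sticky: dict[str, str] = field(default_factory=dict)   # session id -> profile
    unavailable: set[str] = field(default_factory=set)

    def pick(self, session_id: str) -> str:
        # Reuse the session's current profile so its prompt cache stays warm.
        current = self.sticky.get(session_id)
        if current is not None and current not in self.unavailable:
            return current
        # Otherwise bind the session to the first profile that has not failed.
        for name in self.profiles:
            if name not in self.unavailable:
                self.sticky[session_id] = name
                return name
        raise RuntimeError("no usable profile")

    def report_failure(self, profile: str, kind: FailureKind) -> None:
        # Both failure kinds take the profile out of rotation in this sketch;
        # a real router would likely restore quota failures at the reset window.
        self.unavailable.add(profile)
```

With `ProfileRouter(["work", "personal"])`, repeated `pick("s1")` calls keep returning the same profile until `report_failure` is called for it, after which the session moves to the next usable profile.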
Recommended next action: implement #115's first PR slice: profile storage plus codex-lab login --profile <name>, profile listing/status, and profile-aware logout. Do not build a broad settings panel yet; design stable APIs so future CLI/TUI/app-server settings surfaces can consume them.
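As a rough sketch of what the profile-storage slice might look like, the following Python resolves the home directory and keeps one file per profile. Only `CODEX_LAB_HOME` and its `~/.codex-lab` default come from this plan; the `profiles` subdirectory, the JSON layout, and the `ProfileStore` API are assumptions for illustration.

```python
import json
import os
from pathlib import Path


def lab_home(env=None) -> Path:
    """Resolve the Codex Lab home directory, defaulting to ~/.codex-lab."""
    env = os.environ if env is None else env
    override = env.get("CODEX_LAB_HOME")
    return Path(override) if override else Path.home() / ".codex-lab"


class ProfileStore:
    """Hypothetical on-disk profile registry: one JSON file per profile."""

    def __init__(self, home: Path):
        # The "profiles" subdirectory is an assumed layout.
        self.root = home / "profiles"

    def login(self, name: str, credentials: dict) -> None:
        # A real implementation would also validate the profile name.
        self.root.mkdir(parents=True, exist_ok=True)
        path = self.root / f"{name}.json"
        path.write_text(json.dumps(credentials))
        path.chmod(0o600)  # credentials should not be world-readable

    def names(self) -> list[str]:
        # Profile listing: file stems, sorted for stable output.
        if not self.root.is_dir():
            return []
        return sorted(p.stem for p in self.root.glob("*.json"))

    def logout(self, name: str) -> bool:
        # Returns True only when a stored profile was actually removed.
        path = self.root / f"{name}.json"
        if path.exists():
            path.unlink()
            return True
        return False
```

Keeping the store behind a small API like this is one way to let later CLI, TUI, or app-server settings surfaces share the same storage without a broad settings panel.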
Important implementation stance remains unchanged:
Use upstream-shaped Codex primitives first.
Keep changes small and easy to rebase over openai/codex.
Avoid copying Every Code plugin/settings behavior or adding parallel systems when existing Codex login/auth/app-server primitives can carry the feature.
Source Material
Use restored cbusillo/code issues and commits as historical source material, not as automatic carry-forward status. Important restored planning issues include:
Recent restored commits worth mining include the Codex substrate migration, Codex Desktop startup compatibility, Every Code identity/config alignment, feature inventory, parity ledger, prompt-cache prefix preservation, auto-review proof metrics, auto-review lifecycle/store/ledger work, agent context file preloading, worktree decision gate hardening, and release/build runner fixes.
Base Platform Decision
Codex Lab remains the product base. Build on the Codex CLI/Desktop/app-server substrate because the hard-to-recreate value is Codex Desktop compatibility, Codex iOS/mobile control of Desktop, Codex subscription auth, upstream openai/codex compatibility, and continuity with the Every Code Codex-base port path.
opencode and other coding harnesses are reference implementations, not the base. Steal ideas when they improve Codex Lab without breaking Codex compatibility and can be validated through the exec harness or focused tests.
Guardrails
Use fixtures/tests/concepts before broad implementation overlays where feasible.
Treat missing Every Code behavior as unclassified until this plan or a child issue records Port, Rewrite, Covered, Defer, or Retire.
Keep openai as fetch-only upstream.
Avoid direct work on protected/default branches; use focused task branches and PRs for implementation.
Every code-bearing port slice should include scoped validation evidence.
The exec harness is mandatory for Codex Lab-specific regressions: skills prompt strength, cached-token stability, thread continuity, Desktop/app-server compatibility, and Every Code workflow expectations.
Cache-sensitive scenarios should compare input tokens, cached input tokens, cache ratio, and normalized prompt-prefix stability against explicit baselines.
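The baseline comparison for cache-sensitive scenarios might look something like the sketch below. It is a minimal example, not the exec harness's actual gate: `CacheMetrics`, `check_against_baseline`, and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheMetrics:
    input_tokens: int
    cached_input_tokens: int

    @property
    def cache_ratio(self) -> float:
        # Fraction of input tokens served from the prompt cache.
        if self.input_tokens == 0:
            return 0.0
        return self.cached_input_tokens / self.input_tokens


def check_against_baseline(run, baseline, max_ratio_drop=0.05, max_input_growth=0.10):
    """Return a list of regressions; an empty list means the gate passes.

    Both tolerances are assumed values chosen for illustration.
    """
    problems = []
    if run.cache_ratio < baseline.cache_ratio - max_ratio_drop:
        problems.append(
            f"cache ratio fell from {baseline.cache_ratio:.2f} to {run.cache_ratio:.2f}"
        )
    if run.input_tokens > baseline.input_tokens * (1 + max_input_growth):
        problems.append(
            f"input tokens grew from {baseline.input_tokens} to {run.input_tokens}"
        )
    return problems
```

Returning the list of problems rather than a boolean keeps the failure message usable as validation evidence in a PR.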
MVP Slices
Repo and runner recovery.
Verify cbusillo/codex-lab runners after repo move.
[…] cbusillo/code runner-free unless a future archival workflow explicitly needs one.
Planning source of truth.
Create focused child issues for independent MVP slices.
Realign repo AGENTS.md so future agents find this issue instead of relying on local plan files.
Exec harness hardening and regression gates.
Fix known automation findings: cleanup just argument forwarding and exec-harness workflow path triggers.
Add skills-cache-continuity as the first cache-sensitive scenario.
Add deterministic fake-Responses coverage first, then local-LLM dogfood and frontier release variants.
Skills and prompt-cache protection.
Protect against skills becoming less directive or being treated as optional advice.
Protect against prompt-stack churn reducing cached-token reuse.
Auto Review proof loop.
Restore actionable review evidence without broad-review noise.
Mine restored Every Code auto-review lifecycle/store/ledger commits and issues before implementation.
Agents and third-party agent orchestration.
Rebuild configurable agent roles, third-party agents, local LLM roles, and review/validation loops on Codex Lab primitives.
Codex Desktop/app-server compatibility.
Validate Codex Lab in Desktop and keep app-server behavior upstream-shaped unless an additive extension is explicitly validated.
Code Bridge/browser and remote-control workflows.
Preserve or replace Code Bridge/browser/control capability with clear Desktop/app-server boundaries.
Auto Drive.
Rebuild Auto Drive on validated Codex thread/session/worktree/token primitives rather than copying old implementation wholesale.
Local LLM dogfood.
Make LM Studio/OpenAI-compatible local endpoints first-class for bounded dogfood basics, then graduate roles only after repeated proof.
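For the skills and prompt-cache protection slice above, one way to detect prompt-stack churn is to compare normalized prompt segments between a baseline run and the current run. The sketch below is an assumption-heavy illustration: the segment representation, the timestamp-only normalization rule, and the function names are all hypothetical.

```python
import hashlib
import re

# Assumed volatile content: ISO-8601 timestamps. Real normalization rules
# would depend on what the prompt stack actually injects.
_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?")


def normalize(segment: str) -> str:
    """Replace volatile values so equivalent prompts hash equally across runs."""
    return _TIMESTAMP.sub("<ts>", segment).strip()


def segment_hashes(segments):
    """Hash each normalized segment so prompts can be compared without storing them."""
    return [hashlib.sha256(normalize(s).encode("utf-8")).hexdigest() for s in segments]


def stable_prefix_len(baseline, current) -> int:
    """Number of leading prompt segments that match after normalization."""
    count = 0
    for a, b in zip(segment_hashes(baseline), segment_hashes(current)):
        if a != b:
            break
        count += 1
    return count
```

A scenario could then assert that `stable_prefix_len` does not drop below the baseline's value. Normalization here only makes runs comparable; it does not make the underlying request any more cacheable.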
Opencode And Other References
opencode is the primary near-term reference for plugin/hooks ergonomics, named agents/subagents, LM Studio flow, provider/auth boundaries, permissions UX, run artifacts, GitHub automation presentation, prompt/context controls, and MCP/tool discovery.
Keep a lightweight watch on OpenHands, SWE-agent/mini-SWE-agent, Aider, Cline/Roo Code, Goose, and Continue for product architecture ideas. Separately, keep Terminal-Bench, promptfoo, Inspect AI, SWE-bench, Aider benchmarks, and BrowserGym/WebArena/OSWorld in mind for eval and grading ideas. These are references only.
Next Actions
Wait for the manually triggered exec-harness runner proof on cbusillo/codex-lab and record the result here.
Create child issues for the first few MVP slices after confirming this parent issue shape.
Update repo AGENTS.md to point future agents at this issue and labels such as plan and codex-lab-mvp.
Start implementation with known exec-harness automation fixes, then skills-cache-continuity.
Relationships
Sub-issues created from the 2026-06-12 plan review:
[…] skills-cache-continuity exec-harness scenario.
Related existing planning issues: