Skip to content

fix(e2e): unblock current nightly regressions#3936

Merged
ericksoa merged 1 commit into
mainfrom
fix/e2e-followups-0520
May 20, 2026
Merged

fix(e2e): unblock current nightly regressions#3936
ericksoa merged 1 commit into
mainfrom
fix/e2e-followups-0520

Conversation

@ericksoa
Copy link
Copy Markdown
Contributor

@ericksoa ericksoa commented May 20, 2026

Summary

  • make sandbox lifecycle status validation compatible with macOS Bash 3.2 by avoiding ${var,,}
  • retry legacy OpenShell pre-upgrade backup with the current NemoClaw CLI when the installed pre-0.0.37 CLI backup fails
  • keep OpenShell installation deferred during that retry so the old gateway remains available until backup succeeds

Why

  • latest main macOS Vitest failed on bad substitution in sandbox_lifecycle.sh
  • the full manual nightly failed openshell-gateway-upgrade-e2e because v0.0.36 backup-all rejected extension symlink state before the current backup fixes were installed

Verification

  • npm test -- --run test/install-openshell-upgrade-prompt.test.ts test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts
  • bash -n scripts/install.sh test/e2e/validation_suites/lib/sandbox_lifecycle.sh test/e2e/test-openshell-gateway-upgrade.sh
  • git diff --check
  • npm test -- --run test/install-preflight.test.ts test/install-openshell-upgrade-prompt.test.ts

Summary by CodeRabbit

  • Bug Fixes
    • Improved gateway upgrade reliability with automatic retry capability when initial backup operations fail.
    • Enhanced status validation to be case-insensitive for more robust field detection.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5103c244-d345-4dde-981c-597a7380a998

📥 Commits

Reviewing files that changed from the base of the PR and between d4d4a40 and fe088a0.

📒 Files selected for processing (3)
  • scripts/install.sh
  • test/e2e/validation_suites/lib/sandbox_lifecycle.sh
  • test/install-openshell-upgrade-prompt.test.ts

📝 Walkthrough

Walkthrough

This PR refactors the OpenShell CLI installation and legacy gateway upgrade flow to support deferred installation and pre-upgrade backup retry. When NEMOCLAW_DEFER_OPENSHELL_INSTALL is set, install_nemoclaw skips the CLI install. New helper functions prepare the current CLI for use as a fallback and retry backup-all when the legacy CLI backup fails. Tests are updated to mock the current CLI, validate the new retry sequence, and verify fallback behavior.

Changes

OpenShell CLI Upgrade with Fallback Retry

Layer / File(s) Summary
Conditional OpenShell installation
scripts/install.sh
install_nemoclaw now conditionally runs the OpenShell CLI installer based on NEMOCLAW_DEFER_OPENSHELL_INSTALL flag; when deferred, installation is skipped to allow later logic to sequence operations.
Pre-upgrade backup retry infrastructure
scripts/install.sh
Three new helper functions enable backup retry: prepare_current_cli_for_preupgrade_backup sets deferred install flag and reinstalls current CLI, resolve_prepared_cli_runner resolves the prepared executable, and run_preupgrade_backup orchestrates retry with fallback to current CLI. preinstall_backup_and_retire_legacy_gateway now delegates to run_preupgrade_backup.
Test infrastructure for fallback CLI validation
test/install-openshell-upgrade-prompt.test.ts, test/e2e/validation_suites/lib/sandbox_lifecycle.sh
Test harness extended with options to control fallback availability and success; mocked CLI includes nemoclaw-current executable; CLI resolution logic selects prepared current CLI when applicable; assertions verify old-vs-current CLI sequence including prepare markers and retry flow. Supporting case-insensitive status field check in sandbox lifecycle tests ensures robust matching.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

bug, OpenShell, NemoClaw CLI, E2E, nightly-e2e

Suggested reviewers

  • jyaunches
  • cv

🐰 With whiskers held high, we hop through the flow,
Deferring install where fallbacks can grow,
When backups should stumble, the current CLI leaps,
The test suite springs forth to prove promises we keep,
Retry logic blooms—no gateway shall fall! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title describes fixing nightly regressions without specifying what those regressions are or which files/systems are affected, making it vague about the actual changes. Consider revising to be more specific about the main fix, such as 'fix: sandbox lifecycle status validation and OpenShell pre-upgrade backup retry' or similar.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/e2e-followups-0520

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: openshell-gateway-upgrade-e2e, cloud-onboard-e2e
Optional E2E: state-backup-restore-e2e, scenario:ubuntu-repo-cloud-openclaw:sandbox-operations

Dispatch hint: openshell-gateway-upgrade-e2e,cloud-onboard-e2e

Auto-dispatched E2E: openshell-gateway-upgrade-e2e, cloud-onboard-e2e via nightly-e2e.yaml at fe088a03951d204b8f7e47e850e6ce0cef8b7ce2nightly run

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • openshell-gateway-upgrade-e2e (high): Closest existing real-flow coverage for this installer change: installs an old NemoClaw/OpenShell release with a running sandbox, runs the current installer/onboard path, validates pre-upgrade backup, legacy gateway retirement, restored durable state, and a running agent after upgrade.
  • cloud-onboard-e2e (high): scripts/install.sh changed the GitHub/curl installer path that normal onboarding uses. This verifies the non-deferred installer path still installs OpenShell, onboards OpenClaw, applies policy presets, and reaches a healthy sandbox/gateway.

Optional E2E

  • state-backup-restore-e2e (high): Useful adjacent confidence for the backup/restore mechanisms relied on by the pre-upgrade backup path, although it does not specifically exercise legacy OpenShell gateway retirement.
  • scenario:ubuntu-repo-cloud-openclaw:sandbox-operations (high): Validates the changed scenario-suite sandbox_lifecycle.sh status assertion in its intended scenario-runner context.

New E2E recommendations

  • installer-openshell-gateway-upgrade (high): Existing openshell-gateway-upgrade-e2e validates the successful old-CLI pre-upgrade backup path, but this PR adds a new fallback path where the old CLI backup-all fails, the installer prepares the current CLI with OpenShell install deferred, retries backup-all, and only then retires the legacy gateway.
    • Suggested test: Extend test/e2e/test-openshell-gateway-upgrade.sh or add a sibling legacy-upgrade E2E that forces old CLI backup-all failure, verifies current CLI retry succeeds, asserts OpenShell installation is deferred until after backup, and confirms the legacy gateway is not retired if both backups fail.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: openshell-gateway-upgrade-e2e,cloud-onboard-e2e

@github-actions
Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26193062061
Target ref: fe088a03951d204b8f7e47e850e6ce0cef8b7ce2
Workflow ref: main
Requested jobs: openshell-gateway-upgrade-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
openshell-gateway-upgrade-e2e ⚠️ cancelled

@github-actions
Copy link
Copy Markdown
Contributor

PR Review Advisor

Recommendation: blocked
Confidence: medium
Analyzed HEAD: fe088a03951d204b8f7e47e850e6ce0cef8b7ce2
Findings: 3 blocker(s), 3 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Review used the provided deterministic GitHub context, diff, and read-only repository inspection; no tests or installer scripts were executed.; CodeRabbit review was still in progress, so final automated review findings were unavailable.; E2E Advisor recommendation content was not available as a completed comment; only the in-progress check context was visible.; Linked issues array was empty, so acceptance mapping used the PR body summary/why clauses rather than linked issue clauses.

Workflow run

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: fe088a03951d204b8f7e47e850e6ce0cef8b7ce2
Recommendation: blocked
Confidence: medium

The patch appears targeted and mostly sound, but it touches installer and sandbox lifecycle paths while CI, E2E recommendation, CodeRabbit, and mergeability are still blocked or pending for the head SHA.

Gate status

  • CI: pending — Head SHA fe088a0 has pending/in-progress contexts including cli-parity, E2E recommendation, wsl-e2e, macos-e2e, CodeQL, ShellCheck SARIF, unit-vitest-linux, build-sandbox-images, and CodeRabbit.
  • Mergeability: fail — GitHub reports mergeStateStatus=BLOCKED and reviewDecision=REVIEW_REQUIRED for PR fix(e2e): unblock current nightly regressions #3936.
  • Review threads: pending — GraphQL reviewThreads is empty, but the CodeRabbit issue comment says review is in progress and the CodeRabbit status context is PENDING.
  • Risky code tested: warning — Risky installer/bootstrap shell and onboarding/host glue paths changed. Unit tests were added, but required CI/E2E jobs have not passed for the current head SHA.

🔴 Blockers

  • Required status checks are still pending for the head SHA: The PR changes high-risk installer and sandbox lifecycle paths, but multiple checks for fe088a0 are not complete.
    • Recommendation: Wait for the current head SHA's required checks to complete successfully before considering merge.
    • Evidence: Status rollup shows in-progress or queued checks including E2E recommendation, macos-e2e, wsl-e2e, unit-vitest-linux, ShellCheck SARIF, CodeQL, and build-sandbox-images.
  • PR is currently merge-blocked: GitHub reports the PR is blocked and still requires review.
    • Recommendation: Resolve branch protection, review, and status check requirements before merge.
    • Evidence: mergeStateStatus=BLOCKED; reviewDecision=REVIEW_REQUIRED.
  • Installer retry path requires real E2E validation (scripts/install.sh:1583): The new pre-upgrade backup retry installs the current NemoClaw CLI while deliberately deferring OpenShell installation, then uses that CLI against a legacy OpenShell gateway. Unit tests mock this behavior but cannot prove the real install, backup, gateway-retirement, and restore sequence on affected hosts.
    • Recommendation: Require passing E2E recommendation plus relevant macOS and WSL/manual upgrade jobs for head SHA fe088a0.
    • Evidence: Changed runtime/infrastructure paths include scripts/install.sh and sandbox_lifecycle.sh; E2E recommendation, macos-e2e, and wsl-e2e are in progress rather than passed.

🟡 Warnings

  • Externally inherited defer flag can skip OpenShell installation (scripts/install.sh:1429): NEMOCLAW_DEFER_OPENSHELL_INSTALL is introduced as an internal retry sentinel, but install_nemoclaw reads it directly from the environment. If inherited during a normal GitHub-clone install path, it can skip OpenShell CLI installation and leave the system with stale OpenShell despite the installer path that normally enforces the version floor.
    • Recommendation: Constrain this flag to the internal retry path where possible, clear inherited values at script entry, or document and guard it so the final top-level install cannot unintentionally defer OpenShell installation.
    • Evidence: install_nemoclaw gates OpenShell CLI installation on truthy_env "${NEMOCLAW_DEFER_OPENSHELL_INSTALL:-}"; prepare_current_cli_for_preupgrade_backup sets the same environment variable for the nested retry install.
  • Negative fallback-backup failure path is not directly asserted (test/install-openshell-upgrade-prompt.test.ts:21): The test helper supports fallbackBackupSucceeds, but the suite does not add a case where the current CLI is prepared successfully and its backup-all still fails. That is the branch that proves the legacy gateway is not retired after a failed retry backup.
    • Recommendation: Add a test with backupSucceeds=false, fallbackAvailable=true, and fallbackBackupSucceeds=false that asserts non-zero exit and no gateway destroy call.
    • Evidence: fallbackBackupSucceeds is wired into the fake current CLI, but no test passes fallbackBackupSucceeds: false.
  • Active overlapping PRs touch the same high-risk files: The changed files still exist and the patch is not stale, but there is concurrent work touching scripts/install.sh and sandbox_lifecycle.sh. This increases drift risk for installer and E2E semantics.

🔵 Suggestions

  • None.

Acceptance coverage

  • met — make sandbox lifecycle status validation compatible with macOS Bash 3.2 by avoiding ${var,,}: sandbox_lifecycle_assert_status_fields_present now computes status_output_lower with tr '[:upper:]' '[:lower:]' and compares against that variable instead of using Bash 4 ${var,,} expansion.
  • met — retry legacy OpenShell pre-upgrade backup with the current NemoClaw CLI when the installed pre-0.0.37 CLI backup fails: run_preupgrade_backup first invokes the old CLI backup-all, checks legacy_openshell_gateway_upgrade_needed, then prepares the current CLI and retries backup-all with resolve_prepared_cli_runner.
  • met — keep OpenShell installation deferred during that retry so the old gateway remains available until backup succeeds: prepare_current_cli_for_preupgrade_backup exports NEMOCLAW_DEFER_OPENSHELL_INSTALL=1, and install_nemoclaw skips scripts/install-openshell.sh when that flag is truthy.
  • met — latest main macOS Vitest failed on bad substitution in sandbox_lifecycle.sh: The only Bash lowercase conversion in the shown sandbox_lifecycle.sh diff was changed to POSIX-style tr, which is compatible with macOS Bash 3.2.
  • partial — the full manual nightly failed openshell-gateway-upgrade-e2e because v0.0.36 backup-all rejected extension symlink state before the current backup fixes were installed: The unit test retries legacy backup with the current CLI before retiring the gateway models the retry behavior, but the relevant E2E jobs and E2E recommendation are still pending for the current head SHA.

Security review

  • pass — 1: Secrets and Credentials: No hardcoded tokens, passwords, PEM material, credential JSON, or new secret-bearing files were introduced in the reviewed diff.
  • pass — 2: Input Validation and Data Sanitization: The changed shell invocations quote CLI runner variables and fixed arguments. No new URL parsing, deserialization, command construction from user-provided strings, or path traversal surface was introduced in the shown diff.
  • pass — 3: Authentication and Authorization: No authentication or authorization endpoints or token validation logic are changed.
  • pass — 4: Dependencies and Third-Party Libraries: No new package dependencies or registries are added. The installer still uses existing npm/GitHub install flows.
  • pass — 5: Error Handling and Logging: The backup retry path fails closed before gateway retirement when both old and current CLI backup attempts fail. No new stack-trace or secret logging was evident in the diff.
  • pass — 6: Cryptography and Data Protection: Not applicable — no cryptographic operations, key handling, hashing, or transport security changes are introduced.
  • warning — 7: Configuration and Security Headers: NEMOCLAW_DEFER_OPENSHELL_INSTALL is a new environment-controlled configuration switch that can skip OpenShell CLI installation on the GitHub-clone install path if externally inherited. This should be constrained to avoid accidental stale OpenShell state.
  • warning — 8: Security Testing: Unit tests cover opt-in, old CLI failure, retry success, and no-retirement on unavailable fallback, but real installer/sandbox E2E and CI checks are still pending; the fallback-current-CLI-fails-after-prepare negative branch is not directly asserted.
  • warning — 9: Holistic Security Posture: The change is fail-closed around gateway retirement and addresses a real lifecycle failure, but it modifies high-risk installer and sandbox lifecycle behavior while mergeability, CodeRabbit, CI, and E2E validation are still incomplete.

Test / E2E status

  • Test depth: e2e_required — The change affects scripts/install.sh, OpenShell gateway upgrade sequencing, and sandbox lifecycle validation. Unit tests are useful but cannot prove real host install, legacy gateway backup, gateway retirement, restore preparation, macOS Bash 3.2 behavior, and OpenShell/NemoClaw integration.
  • E2E Advisor: missing
  • Required E2E jobs: E2E recommendation, macos-e2e, wsl-e2e
  • Missing for analyzed SHA: E2E recommendation, macos-e2e, wsl-e2e

✅ What looks good

  • The Bash 3.2 compatibility fix is small and avoids the unsupported ${var,,} expansion using tr.
  • The installer retry path is structured to back up before retiring the legacy gateway and fails closed if backup cannot be completed.
  • The new tests cover the key happy path for retrying with the current CLI and verify that the legacy gateway is not destroyed when backup preparation is unavailable.
  • The nested retry install restores the previous NEMOCLAW_DEFER_OPENSHELL_INSTALL value after preparing the current CLI.

Review completeness

  • Review used the provided deterministic GitHub context, diff, and read-only repository inspection; no tests or installer scripts were executed.
  • CodeRabbit review was still in progress, so final automated review findings were unavailable.
  • E2E Advisor recommendation content was not available as a completed comment; only the in-progress check context was visible.
  • Linked issues array was empty, so acceptance mapping used the PR body summary/why clauses rather than linked issue clauses.
  • Human maintainer review required: yes

@github-actions
Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26193106252
Target ref: fe088a03951d204b8f7e47e850e6ce0cef8b7ce2
Workflow ref: main
Requested jobs: openshell-gateway-upgrade-e2e,cloud-onboard-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
openshell-gateway-upgrade-e2e ✅ success

@ericksoa ericksoa merged commit 419a3c3 into main May 20, 2026
33 checks passed
@ericksoa ericksoa deleted the fix/e2e-followups-0520 branch May 20, 2026 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant