test(e2e): fix current nightly failures #3926

Merged: ericksoa merged 7 commits into main from fix/e2e-green-0520 on May 20, 2026

@ericksoa ericksoa commented May 20, 2026

Summary

  • update scenario validation helpers to call sandbox-scoped nemoclaw <sandbox> status and nemoclaw <sandbox> logs commands
  • restore the WeChat plugin install path into plugins.load.paths when reseeding OpenClaw config after plugin-install rewrites
  • tighten helper/unit coverage so wrong command ordering and missing WeChat load paths fail locally

Validation

  • bash -n test/e2e/validation_suites/lib/baseline_onboarding.sh test/e2e/validation_suites/lib/sandbox_lifecycle.sh
  • npm test -- --run test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts
  • npm test -- --run test/generate-openclaw-config.test.ts test/seed-wechat-accounts.test.ts
  • Focused E2E: https://github.com/NVIDIA/NemoClaw/actions/runs/26187684008
    • onboard-negative-paths-e2e: success
    • messaging-providers-e2e: success
    • sandbox-operations-e2e: success

Summary by CodeRabbit

  • Bug Fixes

    • WeChat seeding now preserves and restores plugin registry and load-path entries (including install path) while continuing to write per-account state and enable seeded accounts.
  • Tests

    • Expanded e2e and unit tests: improved CLI argument handling, added status/logs availability checks, and added assertions for WeChat seeding/rerun and expected extension paths.
  • Chores

    • Hardened container build/patch steps and made WebSocket handshake timeout handling more robust.

coderabbitai Bot commented May 20, 2026

Walkthrough

seed-wechat-accounts.py now preserves/restores the openclaw-weixin install and plugins.load.paths by deriving installPath from existing registry; tests add helpers/assertions for the computed extension path; E2E mocks/helpers now call nemoclaw with sandbox name first; Dockerfile patch steps and validations are hardened.

Changes

WeChat plugin registry preservation and E2E validation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Plugin registry preservation logic | scripts/seed-wechat-accounts.py | Removes hardcoded WECHAT_PLUGIN_INSTALL; adds _wechat_plugin_install_path() to derive or default install paths; reworks _patch_openclaw_config() to conditionally set source/spec/installPath and manage plugins.load.paths. |
| WeChat extension path test helpers | test/generate-openclaw-config.test.ts, test/seed-wechat-accounts.test.ts | Adds wechatExtensionPath() helpers that resolve the temp OpenClaw state dir to compute extensions/openclaw-weixin for assertions. |
| Seed and config generation test updates | test/generate-openclaw-config.test.ts, test/seed-wechat-accounts.test.ts | Updates assertions to expect plugins.installs["openclaw-weixin"].installPath, verify plugins.load.paths contains the computed extension path, and ensure existing install metadata and load paths are preserved when re-seeding. |
| E2E validation suite and mocked CLI updates | test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts, test/e2e/validation_suites/lib/baseline_onboarding.sh, test/e2e/validation_suites/lib/sandbox_lifecycle.sh | Mocks dispatch on full nemoclaw argument lists (case "$*"); baseline onboarding adds PASS checks for sandbox status and logs; helpers now invoke `nemoclaw "$E2E_SANDBOX_NAME" status |
| Docker OpenClaw patching and validations | Dockerfile | Makes install-path discovery tolerant, adds explicit not-found failures and post-sed assertions; makes websocket-timeout replacement idempotent and validates replacement result. |
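
To make the "derive or default" behaviour in the walkthrough concrete, here is a minimal Python sketch. It is not the code from scripts/seed-wechat-accounts.py: the function names `wechat_install_path` and `patch_plugins`, and the `state_dir` parameter, are hypothetical stand-ins. It assumes only what the summary states, namely that an existing installPath is reused, that the fallback is extensions/openclaw-weixin under the OpenClaw state directory, and that existing plugins.load.paths entries are kept.

```python
from pathlib import Path

PLUGIN_ID = "openclaw-weixin"


def wechat_install_path(config: dict, state_dir: Path) -> str:
    """Return the recorded installPath if non-empty, else a default under state_dir."""
    installs = (config.get("plugins") or {}).get("installs") or {}
    recorded = ((installs.get(PLUGIN_ID) or {}).get("installPath") or "").strip()
    return recorded or str(state_dir / "extensions" / PLUGIN_ID)


def patch_plugins(config: dict, state_dir: Path) -> dict:
    """Ensure the install record and plugins.load.paths both carry the install path."""
    install_path = wechat_install_path(config, state_dir)
    plugins = config.setdefault("plugins", {})
    record = plugins.setdefault("installs", {}).setdefault(PLUGIN_ID, {})
    # Only fill installPath when it is absent or blank; never overwrite a recorded value.
    if not (record.get("installPath") or "").strip():
        record["installPath"] = install_path
    # Append without duplicating, so re-seeding is idempotent and keeps other entries.
    paths = plugins.setdefault("load", {}).setdefault("paths", [])
    if install_path not in paths:
        paths.append(install_path)
    return config
```

Calling `patch_plugins` twice on the same config leaves plugins.load.paths unchanged after the first call, which is the property the re-seed tests in this PR are described as asserting.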

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#3839: Also updates seed logic to restore openclaw-weixin install/registry and plugins.load.paths.
  • NVIDIA/NemoClaw#3682: Related work on WeChat onboarding and plugin registry preservation in config generation.
  • NVIDIA/NemoClaw#3897: Overlaps with baseline onboarding e2e helper and nemoclaw invocation changes.

Suggested labels

CI/CD, E2E, fix, Integration: WeChat, v0.0.46

Suggested reviewers

  • jyaunches
  • cv
  • cjagwani

Poem

🐰 I found a path that once was lost,
Restored the plugin at no cost.
Tests follow sandboxes by name,
Docker checks confirm the same.
Small hops keep startup strong and boss.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'test(e2e): fix current nightly failures' is directly related to the main changes in this PR, which update e2e validation helpers, WeChat plugin configuration, and Docker patching to resolve test failures. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

github-actions Bot commented May 20, 2026

E2E Advisor Recommendation

Required E2E: test-e2e-sandbox, test-e2e-gateway-isolation, cloud-onboard-e2e, channels-stop-start-e2e, sandbox-operations-e2e, openclaw-plugin-runtime-exdev-e2e
Optional E2E: messaging-providers-e2e, network-policy-e2e, ubuntu-repo-cloud-openclaw

Dispatch hint: cloud-onboard-e2e,channels-stop-start-e2e,sandbox-operations-e2e

Auto-dispatched E2E: cloud-onboard-e2e, channels-stop-start-e2e, sandbox-operations-e2e via nightly-e2e.yaml at 9060ac8a480aca2e269fac20de7b654c4248151b (nightly run)

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • test-e2e-sandbox (medium): Required because Dockerfile changes must prove the production sandbox image still builds and passes the containerized sandbox smoke suite.
  • test-e2e-gateway-isolation (medium): Required because the touched Dockerfile block patches OpenClaw fetch/proxy/security-sensitive code paths; gateway isolation should verify the production image still enforces sandbox boundaries.
  • cloud-onboard-e2e (high): Required to validate install/onboard/image-build flow, generated openclaw.json, Landlock read-only enforcement, and inference.local readiness after Dockerfile and OpenClaw config seeding changes.
  • channels-stop-start-e2e (high): Required because the PR changes WeChat OpenClaw plugin registration/load-path seeding; this existing job explicitly exercises OpenClaw and Hermes channel stop/start/remove/rebuild flows across telegram, discord, wechat, slack, and whatsapp.
  • sandbox-operations-e2e (high): Required because the PR changes live status/logs validation semantics and Dockerfile handshake timeout patching, both of which can affect sandbox list/status/logs/exec and recovery operations.
  • openclaw-plugin-runtime-exdev-e2e (high): Required because Dockerfile changes are in OpenClaw plugin/runtime patching territory; this regression job validates a fresh OpenClaw sandbox can bootstrap plugin runtime dependencies without filesystem/runtime failures.

Optional E2E

  • messaging-providers-e2e (high): Useful adjacent confidence for messaging provider placeholders, credential isolation, OpenClaw config patching, and L7 proxy token rewriting, though it does not appear to specifically cover WeChat.
  • network-policy-e2e (high): Optional security-adjacent check because the edited Dockerfile block sits near OpenClaw proxy/SSRF patching; useful to confirm deny-by-default, hot reload, inference exemption, and SSRF validation remain intact.
  • ubuntu-repo-cloud-openclaw (high): Optional scenario-runner validation with suite_filter=baseline-onboarding,sandbox-operations,sandbox-lifecycle to exercise the modified validation_suites helpers in the scenario framework.

New E2E recommendations

  • wechat-openclaw-plugin-load-path (high): Existing channels-stop-start coverage checks WeChat channel lifecycle broadly, but a targeted hermetic OpenClaw WeChat E2E should assert plugins.installs.openclaw-weixin.installPath, plugins.load.paths, channels.openclaw-weixin.accounts..enabled, and gateway startup with the upstream plugin loaded.
    • Suggested test: Add a dedicated WeChat OpenClaw plugin-load-path E2E using fake WECHAT_* inputs and a built sandbox image.
  • openclaw-symlinked-plugin-install (medium): The Dockerfile patch changes OpenClaw install-safe-path/install-package-dir behavior for symlinked plugin install bases, but the current E2E set does not appear to directly install/load a plugin from a symlinked path and verify containment remains enforced.
    • Suggested test: Add an OpenClaw symlinked plugin install E2E that succeeds for in-tree symlinks and fails for symlinks escaping the allowed base.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: cloud-onboard-e2e,channels-stop-start-e2e,sandbox-operations-e2e

github-actions Bot commented May 20, 2026

PR Review Advisor

Recommendation: blocked
Confidence: medium
Analyzed HEAD: 9060ac8a480aca2e269fac20de7b654c4248151b
Findings: 3 blocker(s), 4 warning(s), 1 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Review used provided trusted metadata and the supplied diff; no tests, package-manager commands, PR scripts, workflow dispatches, or network actions were executed. Current-head CI and required E2E results were not complete in trusted metadata. No linked issues were present in trusted metadata, so acceptance coverage is based on PR body clauses and trusted bot comments only. PR title/body/comments and bot comments were treated as untrusted evidence except where included in deterministic trusted context. Line numbers are approximate for findings derived from the supplied diff. Active same-file PR overlaps indicate possible drift after rebase or merge-order changes.

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 9060ac8a480aca2e269fac20de7b654c4248151b
Recommendation: blocked
Confidence: medium

The patch addresses existing Docker/OpenClaw/WeChat/E2E helper code and fixes the resolved sed-delimiter review issue, but merge is blocked by pending CI, BLOCKED merge state, and missing required current-head E2E evidence for high-risk onboarding/sandbox/plugin changes.

Gate status

  • CI: pending — Trusted gateStatus reports 11 status context(s) pending. GraphQL for head 9060ac8 shows IN_PROGRESS/QUEUED checks including E2E recommendation, wsl-e2e, PR review advisor, CodeQL javascript-typescript, CodeQL python, unit-vitest-linux, checks, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, and CodeRabbit pending.
  • Mergeability: fail — GitHub GraphQL reports mergeStateStatus=BLOCKED and reviewDecision=REVIEW_REQUIRED for head 9060ac8; REST metadata reports mergeable_state=blocked.
  • Review threads: pass — Trusted gateStatus reports 1 review thread(s), all resolved. GraphQL shows the CodeRabbit sed-delimiter thread isResolved=true and the comment says addressed in commit 9060ac8.
  • Risky code tested: warning — Risky onboarding/host glue detected in Dockerfile and scripts/seed-wechat-accounts.py. Unit/helper tests were added, but E2E Advisor required cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, and sandbox-operations-e2e; no passing evidence for current head 9060ac8 was provided.

🔴 Blockers

  • Current-head CI is still pending: The current head SHA cannot be considered merge-ready while required checks are in progress, queued, or pending.
    • Recommendation: Wait for all required checks for 9060ac8 to complete successfully before treating the PR as merge-ready.
    • Evidence: Trusted gateStatus reports 11 pending status contexts; GraphQL lists IN_PROGRESS/QUEUED checks including E2E recommendation, wsl-e2e, PR review advisor, CodeQL, unit-vitest-linux, checks, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, and CodeRabbit pending.
  • GitHub merge state is blocked: GitHub reports the PR as blocked, indicating branch protection, required checks, or required review gates are not satisfied.
    • Recommendation: Do not merge until mergeStateStatus is no longer BLOCKED and required review/check gates are satisfied.
    • Evidence: GraphQL reports mergeStateStatus=BLOCKED and reviewDecision=REVIEW_REQUIRED; REST metadata reports mergeable_state=blocked.
  • Required E2E evidence missing for current head SHA: The E2E Advisor required four jobs for this Docker/OpenClaw/WeChat/sandbox-helper change class, but trusted evidence only shows auto-dispatch for an older SHA and failed/cancelled prior selective runs; no current-head passes are shown for 9060ac8.
    • Recommendation: Obtain completed passing results for cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, and sandbox-operations-e2e for the exact current head SHA.
    • Evidence: E2E Advisor required cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, sandbox-operations-e2e. The advisor comment auto-dispatched at 439c7bc; current head is 9060ac8. Later GraphQL shows E2E recommendation still IN_PROGRESS for current head.

🟡 Warnings

  • Restored WeChat installPath is added to plugin load paths without allowlist validation (scripts/seed-wechat-accounts.py:78): The new preservation logic accepts any non-empty plugins.installs.openclaw-weixin.installPath from openclaw.json and appends it to plugins.load.paths. If openclaw.json is tampered with or rewritten to an unexpected path, reseeding can re-enable plugin loading from that path.
    • Recommendation: Constrain restored WeChat load paths to trusted locations such as the resolved OpenClaw state extensions/openclaw-weixin path or an explicit base-image plugin cache directory. Add negative tests for relative paths, traversal, non-state absolute paths, and symlink/realpath escapes.
    • Evidence: _wechat_plugin_install_path() returns install_record.installPath after strip() without base/realpath validation, and _patch_openclaw_config() appends wechat_install_path into plugins.load.paths. The added test preserves and loads /already/installed/openclaw-weixin.
  • accountId is used in a filename without strict validation (scripts/seed-wechat-accounts.py:199): accountId from NEMOCLAW_WECHAT_CONFIG_B64 is stripped but not allowlist-validated before being interpolated into accounts/.json. Separators or traversal sequences could create surprising paths if the input is malformed or attacker-controlled.
    • Recommendation: Validate accountId with a strict allowlist such as /^[A-Za-z0-9._-]+$/ plus a reasonable length limit, or encode it safely before using it as a filename. Add negative tests for ../, /, backslash, empty-after-trim, and very long values.
    • Evidence: main() computes account_id = (config.get("accountId") or "").strip() and account_file = plugin_dir / "accounts" / f"{account_id}.json".
  • Docker OpenClaw patch drift requires real image regression coverage (Dockerfile:162): Patch 3 and Patch 5 now accept already-patched/newer OpenClaw shapes and alter validation around symlink install paths and handshake timeout constants. Unit tests alone cannot prove this against the bundled OpenClaw artifact in a real sandbox image.
    • Recommendation: Use the E2E Advisor required jobs for the exact head SHA and consider adding a real-image regression that verifies the patched install-path and timeout behavior against the built OpenClaw dist.
    • Evidence: Dockerfile changes grep/sed patterns for install-safe-path, install-package-dir, and DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS. E2E Advisor required cloud-onboard-e2e and sandbox-operations-e2e because these changes affect image build, onboarding, runtime startup, status, and logs.
  • High-churn same-file PR overlap increases drift risk: This PR modifies active Docker, WeChat seeding, and E2E helper files that overlap with multiple open PRs, increasing merge-order risk for OpenClaw patch idempotency, plugin registration, and scenario helper CLI contracts.
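
The first warning above recommends constraining restored load paths to trusted locations. A minimal sketch of such a check follows, assuming Python 3.9+ for `Path.is_relative_to`. The function name `is_trusted_install_path` and the `trusted_bases` parameter are hypothetical and do not come from the PR; which directories count as trusted is a decision for the maintainers.

```python
from pathlib import Path


def is_trusted_install_path(candidate: str, trusted_bases: list[Path]) -> bool:
    """Accept only absolute paths that resolve to a location under a trusted base."""
    if not candidate or not Path(candidate).is_absolute():
        return False  # rejects empty strings and relative paths
    # resolve() collapses ".." segments and follows symlinks, so traversal and
    # symlink escapes are compared by their real location, not their spelling.
    resolved = Path(candidate).resolve()
    return any(resolved.is_relative_to(base.resolve()) for base in trusted_bases)
```

With a check like this, a recorded installPath that fails validation could fall back to the default extension path instead of being appended to plugins.load.paths. Note that the test in this PR that preserves /already/installed/openclaw-weixin would need that directory to be inside a trusted base.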

🔵 Suggestions

  • Touched TypeScript tests remain under @ts-nocheck (test/generate-openclaw-config.test.ts:1): The changed TypeScript tests use @ts-nocheck, so helper contract drift and type errors in the new WeChat extension path assertions are not caught by TypeScript.
    • Recommendation: Consider removing @ts-nocheck or narrowing it with typed helper utilities in a follow-up.
    • Evidence: test/generate-openclaw-config.test.ts and test/seed-wechat-accounts.test.ts start with // @ts-nocheck while adding wechatExtensionPath() assertions.

Acceptance coverage

  • met — update scenario validation helpers to call sandbox-scoped nemoclaw <sandbox> status and nemoclaw <sandbox> logs commands: baseline_onboarding.sh now calls nemoclaw "$E2E_SANDBOX_NAME" status and nemoclaw "$E2E_SANDBOX_NAME" logs; sandbox_lifecycle.sh now calls nemoclaw "${E2E_SANDBOX_NAME}" status and nemoclaw "${E2E_SANDBOX_NAME}" logs. e2e-lib-helpers.test.ts mocks fail unexpected argument ordering.
  • partial — restore the WeChat plugin install path into plugins.load.paths when reseeding OpenClaw config after plugin-install rewrites: scripts/seed-wechat-accounts.py computes a WeChat install path, writes plugins.installs.openclaw-weixin.installPath when absent, and appends it to plugins.load.paths. Tests assert default and preserved installPath behavior. Security review notes restored paths are not allowlist-validated.
  • met — tighten helper/unit coverage so wrong command ordering and missing WeChat load paths fail locally: e2e-lib-helpers.test.ts mocks now dispatch on full nemoclaw argument lists and exit 64 on unexpected ordering. generate-openclaw-config.test.ts and seed-wechat-accounts.test.ts assert plugins.load.paths contains the WeChat extension path.
  • unknown — bash -n test/e2e/validation_suites/lib/baseline_onboarding.sh test/e2e/validation_suites/lib/sandbox_lifecycle.sh: This validation is claimed in the PR body, which is untrusted evidence. Current-head ShellCheck SARIF is still IN_PROGRESS in trusted GraphQL metadata; no trusted completed bash -n result was provided.
  • unknown — npm test -- --run test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts: This validation is claimed in the PR body, which is untrusted evidence. Current-head unit-vitest-linux is QUEUED in trusted GraphQL metadata.
  • unknown — npm test -- --run test/generate-openclaw-config.test.ts test/seed-wechat-accounts.test.ts: This validation is claimed in the PR body, which is untrusted evidence. Current-head unit-vitest-linux is QUEUED in trusted GraphQL metadata.
  • unknown — Focused E2E: https://github.com/NVIDIA/NemoClaw/actions/runs/26187684008: The run is cited in the PR body, which is untrusted evidence, and trusted metadata does not show it as passing for current head 9060ac8.
  • unknown — onboard-negative-paths-e2e: success: The success claim appears only in the PR body. Trusted later selective E2E comments show onboard-negative-paths-e2e failures on refs/pull/3926/merge for earlier runs, and no current-head pass was provided.
  • missing — messaging-providers-e2e: success: The success claim appears only in the PR body. E2E Advisor requires messaging-providers-e2e; trusted comments show earlier failures/cancellations and no passing result for current head 9060ac8.
  • missing — sandbox-operations-e2e: success: The success claim appears only in the PR body. E2E Advisor requires sandbox-operations-e2e; trusted comments show earlier failures and no passing result for current head 9060ac8.
  • missing — Required E2E: cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, sandbox-operations-e2e: E2E Advisor identified these as required, but no completed passing results were provided for current head 9060ac8. The advisor auto-dispatch reference is for 439c7bc and earlier selective comments show failures/cancellations.
  • unknown — Optional E2E: openclaw-plugin-runtime-exdev-e2e, network-policy-e2e, ubuntu-repo-cloud-openclaw / baseline-onboarding: The advisor marks these optional. No current-head optional E2E pass evidence was provided in trusted metadata.
  • partial — WeChat OpenClaw plugin load path: Unit tests assert plugins.installs.openclaw-weixin.installPath and plugins.load.paths for generated/reseeded configs. The E2E Advisor specifically recommended adding a messaging-providers or scenario validation assertion against /sandbox/.openclaw/openclaw.json in a real image; no current-head E2E evidence for that assertion was provided.
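
The command-ordering contract in the acceptance coverage above (mocks that dispatch on the full argument list and exit 64 on anything unexpected) can be illustrated with a short Python sketch. The real mocks are shell scripts that use case "$*"; this is only a translation of the idea, and `mock_nemoclaw` and the sandbox name `e2e-sandbox` are hypothetical.

```python
def mock_nemoclaw(argv: list[str], sandbox: str = "e2e-sandbox") -> int:
    """Return an exit code for a mocked nemoclaw call, matching the whole argv."""
    # Only the sandbox-scoped forms are accepted: <sandbox> status, <sandbox> logs.
    accepted = {
        (sandbox, "status"): 0,
        (sandbox, "logs"): 0,
    }
    # Any other argument list, including the reversed ordering, exits 64.
    return accepted.get(tuple(argv), 64)
```

Matching on the complete argument tuple rather than on the first word is what lets a helper that still calls the old ordering fail locally instead of passing against a permissive mock.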

Security review

  • pass — Category 1: Secrets and Credentials: No hardcoded real secrets were introduced. WeChat account token remains the placeholder openshell:resolve:env:WECHAT_BOT_TOKEN, and tests assert placeholder usage rather than live credentials.
  • warning — Category 2: Input Validation and Data Sanitization: NEMOCLAW_WECHAT_CONFIG_B64 is decoded with safe JSON parsing, but accountId is used directly in a filename without an allowlist. The new installPath restoration accepts any non-empty string from existing config before appending it to plugin load paths.
  • pass — Category 3: Authentication and Authorization: No new endpoints or authorization decisions are introduced. Helper changes exercise sandbox-scoped CLI commands but do not change auth logic.
  • pass — Category 4: Dependencies and Third-Party Libraries: No new dependencies are added. The WeChat plugin spec remains pinned as @tencent-weixin/openclaw-weixin@2.4.2.
  • pass — Category 5: Error Handling and Logging: Malformed base64/JSON and missing/corrupt openclaw.json handling remains bounded. The baseline logs failure message includes only a truncated 200-character command output snippet and does not intentionally log known secrets.
  • pass — Category 6: Cryptography and Data Protection: Not applicable — no cryptographic operations are added or modified. Files containing placeholder account metadata continue to be written with mode 0600.
  • warning — Category 7: Configuration and Security Headers: The PR modifies OpenClaw plugin configuration and load paths. Restoring plugins.load.paths is functionally relevant, but appending an existing installPath without path allowlisting could amplify config tampering into plugin loading from an unexpected path. Dockerfile patching also touches SSRF/proxy and symlink install-path behavior and needs current-head E2E evidence.
  • warning — Category 8: Security Testing: Positive unit/helper coverage was added for command ordering and WeChat load path restoration. Missing negative tests include malicious accountId/path traversal and malicious installPath/load path tampering. Current-head required E2E evidence is missing.
  • warning — Category 9: Holistic Security Posture: The change improves recovery from OpenClaw config rewrites and keeps secrets as placeholders, but it touches sandbox/onboarding/plugin lifecycle code. Until CI/E2E complete and installPath/accountId validation risks are addressed or consciously accepted, overall posture remains a warning.
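
The accountId concern raised in Category 2 could be addressed along the lines of the following sketch. It uses the allowlist pattern suggested in the warnings; the function name `safe_account_filename` and the 64-character limit are assumptions chosen for illustration, not values from the PR.

```python
import re

ACCOUNT_ID_RE = re.compile(r"[A-Za-z0-9._-]+")
MAX_ACCOUNT_ID_LEN = 64  # assumed limit; the review only asks for "a reasonable length"


def safe_account_filename(raw: object) -> str:
    """Validate an accountId and return the file name to use under accounts/."""
    account_id = raw.strip() if isinstance(raw, str) else ""
    if (
        not account_id
        or len(account_id) > MAX_ACCOUNT_ID_LEN
        or account_id in {".", ".."}  # the allowlist alone would let these through
        or not ACCOUNT_ID_RE.fullmatch(account_id)
    ):
        raise ValueError(f"invalid accountId: {account_id!r}")
    return f"{account_id}.json"
```

The explicit rejection of "." and ".." matters because both consist only of allowlisted characters. Path separators and backslashes are already excluded by the pattern.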

Test / E2E status

  • Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: Dockerfile, scripts/seed-wechat-accounts.py, sandbox status/log helper command shape, OpenClaw plugin loading, and messaging provider startup cannot be fully proven by local unit mocks.
  • E2E Advisor: missing
  • Required E2E jobs: cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, sandbox-operations-e2e
  • Missing for analyzed SHA: cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, sandbox-operations-e2e

✅ What looks good

  • The PR patches files that still exist on the branch and aligns with the stated nightly-failure scope, though same-file active PR overlap increases drift risk.
  • Commit 9060ac8 fixes the CodeRabbit sed-delimiter issue by switching the Patch 5 sed expression to a delimiter that does not conflict with regex alternation.
  • Dockerfile patch matching is more idempotent by accepting already-patched/newer OpenClaw shapes while still validating patched output.
  • Sandbox status/log helper changes now assert the sandbox-scoped nemoclaw <sandbox> status/logs command contract with mocks that fail on wrong argument ordering.
  • WeChat seeding continues to write placeholder tokens rather than live secrets and preserves stopped-channel behavior by avoiding openclaw.json activation when wechat is not active.
  • New tests cover preserving existing plugin load paths while appending the WeChat extension path, directly targeting the plugin-load regression.
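
The sed-delimiter fix and the idempotent timeout replacement noted above can be illustrated in Python, where the problem does not arise because `re` substitutions have no delimiter character. In sed, an expression such as s|A = (1e4|15e3)|A = 6e4|g is split at the `|` inside the alternation, which is why a different delimiter was needed. The sketch below is not the Dockerfile logic; `patch_handshake_timeout` is a hypothetical name, and accepting an already-patched 6e4 value is an assumption about how idempotency might be achieved.

```python
import re

NAME = "DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS"
TARGET = f"{NAME} = 6e4"
# Match the known original values plus the already-patched one, so reruns are no-ops.
PATTERN = re.compile(rf"{NAME} = (?:1e4|15e3|6e4)\b")


def patch_handshake_timeout(source: str) -> str:
    """Rewrite the timeout constant to 6e4 and fail loudly if nothing matched."""
    patched, count = PATTERN.subn(TARGET, source)
    if count == 0:
        raise ValueError(f"{NAME} not found in a recognised shape")
    # Post-replacement assertion, mirroring the validate-after-sed idea.
    if TARGET not in patched:
        raise ValueError("replacement did not produce the expected constant")
    return patched
```

Running the function on its own output returns the same text, and an unrecognised value raises instead of silently leaving the file unpatched.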

Review completeness

  • Review used provided trusted metadata and the supplied diff; no tests, package-manager commands, PR scripts, workflow dispatches, or network actions were executed.
  • Current-head CI and required E2E results were not complete in trusted metadata.
  • No linked issues were present in trusted metadata, so acceptance coverage is based on PR body clauses and trusted bot comments only.
  • PR title/body/comments and bot comments were treated as untrusted evidence except where included in deterministic trusted context.
  • Line numbers are approximate for findings derived from the supplied diff.
  • Active same-file PR overlaps indicate possible drift after rebase or merge-order changes.
  • Human maintainer review required: yes

@github-actions

Selective E2E Results — ❌ Some jobs failed

Run: 26189141308
Target ref: refs/pull/3926/merge
Workflow ref: main
Requested jobs: onboard-negative-paths-e2e,sandbox-operations-e2e,messaging-providers-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job Result
messaging-providers-e2e ❌ failure
onboard-negative-paths-e2e ❌ failure
sandbox-operations-e2e ❌ failure

Failed jobs: messaging-providers-e2e, onboard-negative-paths-e2e, sandbox-operations-e2e. Check run artifacts for logs.

@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 26189185990
Target ref: 3aa7d86ed92e3b55e617856da17c1d3b326c6696
Workflow ref: main
Requested jobs: messaging-providers-e2e,channels-stop-start-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
channels-stop-start-e2e ⚠️ cancelled
messaging-providers-e2e ⚠️ cancelled

@github-actions

Selective E2E Results — ❌ Some jobs failed

Run: 26189611348
Target ref: refs/pull/3926/merge
Workflow ref: main
Requested jobs: onboard-negative-paths-e2e,sandbox-operations-e2e,messaging-providers-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job Result
messaging-providers-e2e ❌ failure
onboard-negative-paths-e2e ❌ failure
sandbox-operations-e2e ❌ failure

Failed jobs: messaging-providers-e2e, onboard-negative-paths-e2e, sandbox-operations-e2e. Check run artifacts for logs.

@github-actions

Selective E2E Results — ❌ Some jobs failed

Run: 26189709547
Target ref: 1787f5470957c2c831e0a3087a2aa7d854a52d00
Workflow ref: main
Requested jobs: cloud-onboard-e2e,messaging-providers-e2e,channels-stop-start-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job Result
channels-stop-start-e2e ❌ failure
cloud-onboard-e2e ❌ failure
messaging-providers-e2e ❌ failure

Failed jobs: channels-stop-start-e2e, cloud-onboard-e2e, messaging-providers-e2e. Check run artifacts for logs.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
Dockerfile (1)

198-229: Run the Docker image E2E matrix before merge.

These OpenClaw patch assertions and the new health check only get meaningful coverage in a real container build, so I'd queue cloud-e2e, sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e on this branch before merging.

As per coding guidelines, "Dockerfile: This file affects the sandbox container image. Layer ordering, permissions, and baked config changes are only testable with a real container build."

Also applies to: 655-664

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 198 - 229, This change touches Dockerfile patches
around install-safe-path/install-package-dir (look for symbols like baseLstat,
install-safe-path, install-package-dir, assertInstallBaseStable) and the
handshake-timeout constant DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS, so before
merging build the actual container and run the full E2E matrix: queue cloud-e2e,
sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e for this branch,
verify the container build succeeds, then confirm the patched assertions
(lstat→stat and symlink check removal) and that
DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS is updated to 6e4 in the built image; only
merge after these E2E jobs pass.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 72dbdc7a-2924-4d44-9430-8aae6ddc28e5

📥 Commits

Reviewing files that changed from the base of the PR and between 3aa7d86 and 1787f54.

📒 Files selected for processing (2)
  • Dockerfile
  • test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts

@github-actions

Selective E2E Results — ❌ Some jobs failed

Run: 26189876470
Target ref: refs/pull/3926/merge
Workflow ref: main
Requested jobs: onboard-negative-paths-e2e,sandbox-operations-e2e,messaging-providers-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job Result
messaging-providers-e2e ❌ failure
onboard-negative-paths-e2e ❌ failure
sandbox-operations-e2e ❌ failure

Failed jobs: messaging-providers-e2e, onboard-negative-paths-e2e, sandbox-operations-e2e. Check run artifacts for logs.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
Dockerfile (1)

177-229: E2E validation recommended for OpenClaw patching changes.

The Patch 3 and Patch 5 modifications alter baked OpenClaw behavior. Once the sed delimiter fix is applied, consider running the recommended E2E suite to validate the patched image works end-to-end.

As per coding guidelines: "This file affects the sandbox container image. Layer ordering, permissions, and baked config changes are only testable with a real container build." Recommended tests: cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 177 - 229, Patch 3 (install-safe-path /
install-package-dir edits) and Patch 5 (DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS
constant change) modify baked OpenClaw behavior and must be validated by running
the full E2E suites after applying the sed delimiter fix; update and test the
Dockerfile changes that touch install-safe-path/install-package-dir (search for
files matching install-safe-path-*.js and install-package-dir-*.js and symbols
assertInstallBaseStable, baseLstat) and the DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS
edits (search for DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS in
client.js/server.impl.js), rebuild the container image, and execute the
recommended E2E tests (cloud-e2e, sandbox-survival-e2e, hermes-e2e,
rebuild-openclaw-e2e) to confirm symlink handling and extended WS handshake
timeout behave correctly in real containers.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Dockerfile`:
- Line 227: The sed substitution using printf ... | xargs sed -i -E
's|DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS =
(1e4|15e3)|DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4|g' is broken because the
chosen delimiter `|` conflicts with the alternation operator in the regex;
update the sed expression used with the hto_files pipeline to use a delimiter
that does not appear in the pattern (e.g., `#` or `@`) so the regex
(DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = (1e4|15e3)) and the replacement
(DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4) are parsed correctly, leaving the
rest of the command (printf '%s\n' "$hto_files" | xargs sed -i -E ...)
unchanged.
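
For illustration, a minimal sketch of the delimiter fix described above, assuming GNU sed (as in a Linux container build) and a hypothetical scratch file standing in for the files listed in `$hto_files`. With `|` as the delimiter, sed treats the first `|` inside `(1e4|15e3)` as the end of the pattern. Switching to `#` leaves `|` free to act as alternation.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch file standing in for one entry of $hto_files.
tmpdir="$(mktemp -d)"
sample="$tmpdir/client.js"
printf '%s\n' \
  'const DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4;' \
  'const DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 15e3;' \
  'const OTHER_TIMEOUT_MS = 1e4;' > "$sample"

hto_files="$sample"

# '#' is the s-command delimiter, so '|' inside the group is parsed as
# extended-regex alternation rather than as a field separator.
printf '%s\n' "$hto_files" | xargs sed -i -E \
  's#DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = (1e4|15e3)#DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4#g'

# Count the lines that now carry the new value.
patched_count="$(grep -c 'DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4' "$sample")"
echo "patched lines: $patched_count"
```

The `OTHER_TIMEOUT_MS` line is there only to show that unrelated constants with the same value are left alone.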

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2c9a6826-7ace-4c69-b500-0e443e69ab97

📥 Commits

Reviewing files that changed from the base of the PR and between 1787f54 and 439c7bc.

📒 Files selected for processing (1)
  • Dockerfile

Comment thread Dockerfile Outdated

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
Dockerfile (1)

225-229: Run the Dockerfile E2E matrix on this branch.

This sed fix looks good, but this patch still mutates baked OpenClaw assets inside the image. Please run the recommended cloud-e2e, sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e jobs before merge.

As per coding guidelines, Dockerfile: "This file affects the sandbox container image. Layer ordering, permissions, and baked config changes are only testable with a real container build." E2E test recommendation: cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 225 - 229, This change mutates baked OpenClaw assets
in the Dockerfile (the sed patch that updates
DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS), so before merging run the full
container-level E2E matrix to validate layer ordering and baked config: execute
cloud-e2e, sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e against
this branch and confirm they pass; if any fail, revert or adjust the Dockerfile
patch and re-test until all four jobs succeed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9be00d8e-4a64-41e7-806c-197a6b8b7b6b

📥 Commits

Reviewing files that changed from the base of the PR and between 439c7bc and 9060ac8.

📒 Files selected for processing (1)
  • Dockerfile

@github-actions

Selective E2E Results — ❌ Some jobs failed

Run: 26189982508
Target ref: 439c7bc2c224877efcacdf2634c70085df2f56a6
Workflow ref: main
Requested jobs: cloud-onboard-e2e,messaging-providers-e2e,channels-stop-start-e2e,sandbox-operations-e2e
Summary: 0 passed, 3 failed, 0 skipped

| Job | Result |
|---|---|
| channels-stop-start-e2e | ❌ failure |
| cloud-onboard-e2e | ⚠️ cancelled |
| messaging-providers-e2e | ❌ failure |
| sandbox-operations-e2e | ❌ failure |

Failed jobs: channels-stop-start-e2e, messaging-providers-e2e, sandbox-operations-e2e. Check run artifacts for logs.

@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 26190113439
Target ref: refs/pull/3926/merge
Workflow ref: main
Requested jobs: onboard-negative-paths-e2e,sandbox-operations-e2e,messaging-providers-e2e
Summary: 3 passed, 0 failed, 0 skipped

| Job | Result |
|---|---|
| messaging-providers-e2e | ✅ success |
| onboard-negative-paths-e2e | ✅ success |
| sandbox-operations-e2e | ✅ success |

@ericksoa ericksoa merged commit 379b501 into main May 20, 2026
28 checks passed
@ericksoa ericksoa deleted the fix/e2e-green-0520 branch May 20, 2026 21:34
@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 26190220722
Target ref: 9060ac8a480aca2e269fac20de7b654c4248151b
Workflow ref: main
Requested jobs: cloud-onboard-e2e,channels-stop-start-e2e,sandbox-operations-e2e
Summary: 3 passed, 0 failed, 0 skipped

| Job | Result |
|---|---|
| channels-stop-start-e2e | ✅ success |
| cloud-onboard-e2e | ✅ success |
| sandbox-operations-e2e | ✅ success |

