test(e2e): fix current nightly failures by ericksoa · Pull Request #3926 · NVIDIA/NemoClaw

ericksoa · 2026-05-20T20:48:40Z

Summary

update scenario validation helpers to call sandbox-scoped nemoclaw <sandbox> status and nemoclaw <sandbox> logs commands
restore the WeChat plugin install path into plugins.load.paths when reseeding OpenClaw config after plugin-install rewrites
tighten helper/unit coverage so wrong command ordering and missing WeChat load paths fail locally

Validation

bash -n test/e2e/validation_suites/lib/baseline_onboarding.sh test/e2e/validation_suites/lib/sandbox_lifecycle.sh
npm test -- --run test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts
npm test -- --run test/generate-openclaw-config.test.ts test/seed-wechat-accounts.test.ts
Focused E2E: https://github.com/NVIDIA/NemoClaw/actions/runs/26187684008
- onboard-negative-paths-e2e: success
- messaging-providers-e2e: success
- sandbox-operations-e2e: success

Summary by CodeRabbit

Bug Fixes
- WeChat seeding now preserves and restores plugin registry and load-path entries (including install path) while continuing to write per-account state and enable seeded accounts.
Tests
- Expanded e2e and unit tests: improved CLI argument handling, added status/logs availability checks, and added assertions for WeChat seeding/rerun and expected extension paths.
Chores
- Hardened container build/patch steps and made WebSocket handshake timeout handling more robust.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

coderabbitai · 2026-05-20T20:48:52Z

📝 Walkthrough

Walkthrough

seed-wechat-accounts.py now preserves/restores the openclaw-weixin install and plugins.load.paths by deriving installPath from existing registry; tests add helpers/assertions for the computed extension path; E2E mocks/helpers now call nemoclaw with sandbox name first; Dockerfile patch steps and validations are hardened.

Changes

WeChat plugin registry preservation and E2E validation

Layer / File(s)	Summary
Plugin registry preservation logic `scripts/seed-wechat-accounts.py`	Removes hardcoded `WECHAT_PLUGIN_INSTALL`; adds `_wechat_plugin_install_path()` to derive or default install paths; reworks `_patch_openclaw_config()` to conditionally set `source`/`spec`/`installPath` and manage `plugins.load.paths`.
WeChat extension path test helpers `test/generate-openclaw-config.test.ts`, `test/seed-wechat-accounts.test.ts`	Adds `wechatExtensionPath()` helpers that resolve the temp OpenClaw state dir to compute `extensions/openclaw-weixin` for assertions.
Seed and config generation test updates `test/generate-openclaw-config.test.ts`, `test/seed-wechat-accounts.test.ts`	Updates assertions to expect `plugins.installs["openclaw-weixin"].installPath`, verify `plugins.load.paths` contains the computed extension path, and ensure existing install metadata and load paths are preserved when re-seeding.
E2E validation suite and mocked CLI updates `test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts`, `test/e2e/validation_suites/lib/baseline_onboarding.sh`, `test/e2e/validation_suites/lib/sandbox_lifecycle.sh`	Mocks dispatch on full `nemoclaw` argument lists (`case "$*"`); baseline onboarding adds PASS checks for sandbox status and logs; helpers now invoke `nemoclaw "$E2E_SANDBOX_NAME" status
Docker OpenClaw patching and validations `Dockerfile`	Makes install-path discovery tolerant, adds explicit not-found failures and post-sed assertions; makes websocket-timeout replacement idempotent and validates replacement result.
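
For orientation, the preservation behavior summarized in the table can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `scripts/seed-wechat-accounts.py`: `derive_install_path` and `restore_load_path` are hypothetical names, and the config shape is inferred only from the keys named above (`plugins.installs`, `plugins.load.paths`, `extensions/openclaw-weixin`).

```python
from pathlib import Path

PLUGIN_ID = "openclaw-weixin"


def derive_install_path(config: dict, state_dir: Path) -> str:
    """Prefer the installPath already recorded in the registry, else a default.

    Hypothetical helper: the default mirrors the extensions/openclaw-weixin
    location that the tests are described as computing from the state dir.
    """
    record = config.get("plugins", {}).get("installs", {}).get(PLUGIN_ID) or {}
    recorded = record.get("installPath")
    if isinstance(recorded, str) and recorded.strip():
        return recorded.strip()
    return str(state_dir / "extensions" / PLUGIN_ID)


def restore_load_path(config: dict, state_dir: Path) -> dict:
    """Re-add the plugin install record and its load path without duplicates."""
    install_path = derive_install_path(config, state_dir)
    plugins = config.setdefault("plugins", {})
    record = plugins.setdefault("installs", {}).setdefault(PLUGIN_ID, {})
    record["installPath"] = install_path
    paths = plugins.setdefault("load", {}).setdefault("paths", [])
    if install_path not in paths:  # keep existing entries, append only once
        paths.append(install_path)
    return config
```

Running `restore_load_path` twice on the same config leaves `plugins.load.paths` unchanged after the first call, which is the re-seeding property the updated tests are said to assert.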

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

[nightly-e2e] WeChat channel registration lost after OpenClaw config mutations #3844: Addresses config-loss where plugin registry and plugins.load.paths are not preserved when seeding WeChat.

Possibly related PRs

NVIDIA/NemoClaw#3839: Also updates seed logic to restore openclaw-weixin install/registry and plugins.load.paths.
NVIDIA/NemoClaw#3682: Related work on WeChat onboarding and plugin registry preservation in config generation.
NVIDIA/NemoClaw#3897: Overlaps with baseline onboarding e2e helper and nemoclaw invocation changes.

Suggested labels

CI/CD, E2E, fix, Integration: WeChat, v0.0.46

Suggested reviewers

jyaunches
cv
cjagwani

Poem

🐰 I found a path that once was lost,
Restored the plugin at no cost.
Tests follow sandboxes by name,
Docker checks confirm the same.
Small hops keep startup strong and boss.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'test(e2e): fix current nightly failures' is directly related to the main changes in this PR, which update e2e validation helpers, WeChat plugin configuration, and Docker patching to resolve test failures.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

github-actions · 2026-05-20T20:50:34Z

E2E Advisor Recommendation

Required E2E: test-e2e-sandbox, test-e2e-gateway-isolation, cloud-onboard-e2e, channels-stop-start-e2e, sandbox-operations-e2e, openclaw-plugin-runtime-exdev-e2e
Optional E2E: messaging-providers-e2e, network-policy-e2e, ubuntu-repo-cloud-openclaw

Dispatch hint: cloud-onboard-e2e,channels-stop-start-e2e,sandbox-operations-e2e

Auto-dispatched E2E: cloud-onboard-e2e, channels-stop-start-e2e, sandbox-operations-e2e via nightly-e2e.yaml at 9060ac8a480aca2e269fac20de7b654c4248151b — nightly run

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

test-e2e-sandbox (medium): Required because Dockerfile changes must prove the production sandbox image still builds and passes the containerized sandbox smoke suite.
test-e2e-gateway-isolation (medium): Required because the touched Dockerfile block patches OpenClaw fetch/proxy/security-sensitive code paths; gateway isolation should verify the production image still enforces sandbox boundaries.
cloud-onboard-e2e (high): Required to validate install/onboard/image-build flow, generated openclaw.json, Landlock read-only enforcement, and inference.local readiness after Dockerfile and OpenClaw config seeding changes.
channels-stop-start-e2e (high): Required because the PR changes WeChat OpenClaw plugin registration/load-path seeding; this existing job explicitly exercises OpenClaw and Hermes channel stop/start/remove/rebuild flows across telegram, discord, wechat, slack, and whatsapp.
sandbox-operations-e2e (high): Required because the PR changes live status/logs validation semantics and Dockerfile handshake timeout patching, both of which can affect sandbox list/status/logs/exec and recovery operations.
openclaw-plugin-runtime-exdev-e2e (high): Required because Dockerfile changes are in OpenClaw plugin/runtime patching territory; this regression job validates a fresh OpenClaw sandbox can bootstrap plugin runtime dependencies without filesystem/runtime failures.

Optional E2E

messaging-providers-e2e (high): Useful adjacent confidence for messaging provider placeholders, credential isolation, OpenClaw config patching, and L7 proxy token rewriting, though it does not appear to specifically cover WeChat.
network-policy-e2e (high): Optional security-adjacent check because the edited Dockerfile block sits near OpenClaw proxy/SSRF patching; useful to confirm deny-by-default, hot reload, inference exemption, and SSRF validation remain intact.
ubuntu-repo-cloud-openclaw (high): Optional scenario-runner validation with suite_filter=baseline-onboarding,sandbox-operations,sandbox-lifecycle to exercise the modified validation_suites helpers in the scenario framework.

New E2E recommendations

wechat-openclaw-plugin-load-path (high): Existing channels-stop-start coverage checks WeChat channel lifecycle broadly, but a targeted hermetic OpenClaw WeChat E2E should assert plugins.installs.openclaw-weixin.installPath, plugins.load.paths, channels.openclaw-weixin.accounts.<accountId>.enabled, and gateway startup with the upstream plugin loaded.
- Suggested test: Add a dedicated WeChat OpenClaw plugin-load-path E2E using fake WECHAT_* inputs and a built sandbox image.
openclaw-symlinked-plugin-install (medium): The Dockerfile patch changes OpenClaw install-safe-path/install-package-dir behavior for symlinked plugin install bases, but the current E2E set does not appear to directly install/load a plugin from a symlinked path and verify containment remains enforced.
- Suggested test: Add an OpenClaw symlinked plugin install E2E that succeeds for in-tree symlinks and fails for symlinks escaping the allowed base.

Dispatch hint

Workflow: .github/workflows/nightly-e2e.yaml
jobs input: cloud-onboard-e2e,channels-stop-start-e2e,sandbox-operations-e2e

github-actions · 2026-05-20T20:50:58Z

PR Review Advisor

Recommendation: blocked
Confidence: medium
Analyzed HEAD: 9060ac8a480aca2e269fac20de7b654c4248151b
Findings: 3 blocker(s), 4 warning(s), 1 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Review used provided trusted metadata and the supplied diff; no tests, package-manager commands, PR scripts, workflow dispatches, or network actions were executed. Current-head CI and required E2E results were not complete in trusted metadata. No linked issues were present in trusted metadata, so acceptance coverage is based on PR body clauses and trusted bot comments only. PR title/body/comments and bot comments were treated as untrusted evidence except where included in deterministic trusted context. Line numbers are approximate for findings derived from the supplied diff. Active same-file PR overlaps indicate possible drift after rebase or merge-order changes.

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 9060ac8a480aca2e269fac20de7b654c4248151b
Recommendation: blocked
Confidence: medium

The patch addresses existing Docker/OpenClaw/WeChat/E2E helper code and fixes the resolved sed-delimiter review issue, but merge is blocked by pending CI, BLOCKED merge state, and missing required current-head E2E evidence for high-risk onboarding/sandbox/plugin changes.

Gate status

CI: pending — Trusted gateStatus reports 11 status context(s) pending. GraphQL for head 9060ac8 shows IN_PROGRESS/QUEUED checks including E2E recommendation, wsl-e2e, PR review advisor, CodeQL javascript-typescript, CodeQL python, unit-vitest-linux, checks, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, and CodeRabbit pending.
Mergeability: fail — GitHub GraphQL reports mergeStateStatus=BLOCKED and reviewDecision=REVIEW_REQUIRED for head 9060ac8; REST metadata reports mergeable_state=blocked.
Review threads: pass — Trusted gateStatus reports 1 review thread(s), all resolved. GraphQL shows the CodeRabbit sed-delimiter thread isResolved=true and the comment says addressed in commit 9060ac8.
Risky code tested: warning — Risky onboarding/host glue detected in Dockerfile and scripts/seed-wechat-accounts.py. Unit/helper tests were added, but E2E Advisor required cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, and sandbox-operations-e2e; no passing evidence for current head 9060ac8 was provided.

🔴 Blockers

Current-head CI is still pending: The current head SHA cannot be considered merge-ready while required checks are in progress, queued, or pending.
- Recommendation: Wait for all required checks for 9060ac8 to complete successfully before treating the PR as merge-ready.
- Evidence: Trusted gateStatus reports 11 pending status contexts; GraphQL lists IN_PROGRESS/QUEUED checks including E2E recommendation, wsl-e2e, PR review advisor, CodeQL, unit-vitest-linux, checks, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, and CodeRabbit pending.
GitHub merge state is blocked: GitHub reports the PR as blocked, indicating branch protection, required checks, or required review gates are not satisfied.
- Recommendation: Do not merge until mergeStateStatus is no longer BLOCKED and required review/check gates are satisfied.
- Evidence: GraphQL reports mergeStateStatus=BLOCKED and reviewDecision=REVIEW_REQUIRED; REST metadata reports mergeable_state=blocked.
Required E2E evidence missing for current head SHA: The E2E Advisor required four jobs for this Docker/OpenClaw/WeChat/sandbox-helper change class, but trusted evidence only shows auto-dispatch for an older SHA and failed/cancelled prior selective runs; no current-head passes are shown for 9060ac8.
- Recommendation: Obtain completed passing results for cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, and sandbox-operations-e2e for the exact current head SHA.
- Evidence: E2E Advisor required cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, sandbox-operations-e2e. The advisor comment auto-dispatched at 439c7bc; current head is 9060ac8. Later GraphQL shows E2E recommendation still IN_PROGRESS for current head.

🟡 Warnings

Restored WeChat installPath is added to plugin load paths without allowlist validation (scripts/seed-wechat-accounts.py:78): The new preservation logic accepts any non-empty plugins.installs.openclaw-weixin.installPath from openclaw.json and appends it to plugins.load.paths. If openclaw.json is tampered with or rewritten to an unexpected path, reseeding can re-enable plugin loading from that path.
- Recommendation: Constrain restored WeChat load paths to trusted locations such as the resolved OpenClaw state extensions/openclaw-weixin path or an explicit base-image plugin cache directory. Add negative tests for relative paths, traversal, non-state absolute paths, and symlink/realpath escapes.
- Evidence: _wechat_plugin_install_path() returns install_record.installPath after strip() without base/realpath validation, and _patch_openclaw_config() appends wechat_install_path into plugins.load.paths. The added test preserves and loads /already/installed/openclaw-weixin.
accountId is used in a filename without strict validation (scripts/seed-wechat-accounts.py:199): accountId from NEMOCLAW_WECHAT_CONFIG_B64 is stripped but not allowlist-validated before being interpolated into accounts/<accountId>.json. Separators or traversal sequences could create surprising paths if the input is malformed or attacker-controlled.
- Recommendation: Validate accountId with a strict allowlist such as /^[A-Za-z0-9._-]+$/ plus a reasonable length limit, or encode it safely before using it as a filename. Add negative tests for ../, /, backslash, empty-after-trim, and very long values.
- Evidence: main() computes account_id = (config.get("accountId") or "").strip() and account_file = plugin_dir / "accounts" / f"{account_id}.json".
Docker OpenClaw patch drift requires real image regression coverage (Dockerfile:162): Patch 3 and Patch 5 now accept already-patched/newer OpenClaw shapes and alter validation around symlink install paths and handshake timeout constants. Unit tests alone cannot prove this against the bundled OpenClaw artifact in a real sandbox image.
- Recommendation: Use the E2E Advisor required jobs for the exact head SHA and consider adding a real-image regression that verifies the patched install-path and timeout behavior against the built OpenClaw dist.
- Evidence: Dockerfile changes grep/sed patterns for install-safe-path, install-package-dir, and DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS. E2E Advisor required cloud-onboard-e2e and sandbox-operations-e2e because these changes affect image build, onboarding, runtime startup, status, and logs.
High-churn same-file PR overlap increases drift risk: This PR modifies active Docker, WeChat seeding, and E2E helper files that overlap with multiple open PRs, increasing merge-order risk for OpenClaw patch idempotency, plugin registration, and scenario helper CLI contracts.
- Recommendation: Coordinate merge order or rebase after overlapping PRs land, then rerun current-head CI and required E2E.
- Evidence: Trusted openPrOverlaps lists same-file overlap with PRs chore: upgrade agent runtime dependencies #3925, fix(e2e): use sandbox subcommands in scenario suites #3927, fix(onboard): reject host.docker.internal inference URLs #3804, fix(inference): set compat.supportsUsageInStreaming for ollama-local (#2747) #3683, fix(openclaw): bump runtime deps EXDEV fix #3820, Upgrade OpenClaw to 2026.5.18 #3825, fix(docker): classify OpenClaw patch drift #3869, test(e2e): migrate inference routing provider coverage #3903, test(e2e): migrate security policy credential suites #3905, and fix(plugin): tolerate empty/malformed onboard config.json #3906 across Dockerfile, scripts/seed-wechat-accounts.py, generate/seed tests, and E2E helper files.
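
The accountId warning above could be addressed along these lines. This is a sketch only, assuming the allowlist regex from the recommendation; `safe_account_file` and the 64-character limit are illustrative choices, not names or values taken from `scripts/seed-wechat-accounts.py`.

```python
import re
from pathlib import Path

# Allowlist from the recommendation above: letters, digits, dot, underscore, hyphen.
ACCOUNT_ID_RE = re.compile(r"[A-Za-z0-9._-]+")
MAX_ACCOUNT_ID_LEN = 64  # assumed "reasonable length limit"; tune as needed


def safe_account_file(plugin_dir: Path, raw_account_id: object) -> Path:
    """Validate accountId before using it as a filename component."""
    account_id = raw_account_id.strip() if isinstance(raw_account_id, str) else ""
    if not account_id or len(account_id) > MAX_ACCOUNT_ID_LEN:
        raise ValueError("accountId is empty or too long")
    # "." and ".." pass the character allowlist, so reject them explicitly.
    if account_id in {".", ".."} or not ACCOUNT_ID_RE.fullmatch(account_id):
        raise ValueError("accountId contains disallowed characters")
    return plugin_dir / "accounts" / f"{account_id}.json"
```

The negative cases listed in the recommendation (`../`, `/`, backslash, empty-after-trim, very long values) all raise `ValueError` here, so they map directly onto unit-test assertions.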

🔵 Suggestions

Touched TypeScript tests remain under @ts-nocheck (test/generate-openclaw-config.test.ts:1): The changed TypeScript tests use @ts-nocheck, so helper contract drift and type errors in the new WeChat extension path assertions are not caught by TypeScript.
- Recommendation: Consider removing @ts-nocheck or narrowing it with typed helper utilities in a follow-up.
- Evidence: test/generate-openclaw-config.test.ts and test/seed-wechat-accounts.test.ts start with // @ts-nocheck while adding wechatExtensionPath() assertions.

Acceptance coverage

met — update scenario validation helpers to call sandbox-scoped nemoclaw <sandbox> status and nemoclaw <sandbox> logs commands: baseline_onboarding.sh now calls nemoclaw "$E2E_SANDBOX_NAME" status and nemoclaw "$E2E_SANDBOX_NAME" logs; sandbox_lifecycle.sh now calls nemoclaw "${E2E_SANDBOX_NAME}" status and nemoclaw "${E2E_SANDBOX_NAME}" logs. The e2e-lib-helpers.test.ts mocks fail on unexpected argument ordering.
partial — restore the WeChat plugin install path into plugins.load.paths when reseeding OpenClaw config after plugin-install rewrites: scripts/seed-wechat-accounts.py computes a WeChat install path, writes plugins.installs.openclaw-weixin.installPath when absent, and appends it to plugins.load.paths. Tests assert default and preserved installPath behavior. Security review notes restored paths are not allowlist-validated.
met — tighten helper/unit coverage so wrong command ordering and missing WeChat load paths fail locally: e2e-lib-helpers.test.ts mocks now dispatch on full nemoclaw argument lists and exit 64 on unexpected ordering. generate-openclaw-config.test.ts and seed-wechat-accounts.test.ts assert plugins.load.paths contains the WeChat extension path.
unknown — bash -n test/e2e/validation_suites/lib/baseline_onboarding.sh test/e2e/validation_suites/lib/sandbox_lifecycle.sh: This validation is claimed in the PR body, which is untrusted evidence. Current-head ShellCheck SARIF is still IN_PROGRESS in trusted GraphQL metadata; no trusted completed bash -n result was provided.
unknown — npm test -- --run test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts: This validation is claimed in the PR body, which is untrusted evidence. Current-head unit-vitest-linux is QUEUED in trusted GraphQL metadata.
unknown — npm test -- --run test/generate-openclaw-config.test.ts test/seed-wechat-accounts.test.ts: This validation is claimed in the PR body, which is untrusted evidence. Current-head unit-vitest-linux is QUEUED in trusted GraphQL metadata.
unknown — Focused E2E: https://github.com/NVIDIA/NemoClaw/actions/runs/26187684008: The run is cited in the PR body, which is untrusted evidence, and trusted metadata does not show it as passing for current head 9060ac8.
unknown — onboard-negative-paths-e2e: success: The success claim appears only in the PR body. Trusted later selective E2E comments show onboard-negative-paths-e2e failures on refs/pull/3926/merge for earlier runs, and no current-head pass was provided.
missing — messaging-providers-e2e: success: The success claim appears only in the PR body. E2E Advisor requires messaging-providers-e2e; trusted comments show earlier failures/cancellations and no passing result for current head 9060ac8.
missing — sandbox-operations-e2e: success: The success claim appears only in the PR body. E2E Advisor requires sandbox-operations-e2e; trusted comments show earlier failures and no passing result for current head 9060ac8.
missing — Required E2E: cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, sandbox-operations-e2e: E2E Advisor identified these as required, but no completed passing results were provided for current head 9060ac8. The advisor auto-dispatch reference is for 439c7bc and earlier selective comments show failures/cancellations.
unknown — Optional E2E: openclaw-plugin-runtime-exdev-e2e, network-policy-e2e, ubuntu-repo-cloud-openclaw / baseline-onboarding: The advisor marks these optional. No current-head optional E2E pass evidence was provided in trusted metadata.
partial — WeChat OpenClaw plugin load path: Unit tests assert plugins.installs.openclaw-weixin.installPath and plugins.load.paths for generated/reseeded configs. The E2E Advisor specifically recommended adding a messaging-providers or scenario validation assertion against /sandbox/.openclaw/openclaw.json in a real image; no current-head E2E evidence for that assertion was provided.

Security review

pass — Category 1: Secrets and Credentials: No hardcoded real secrets were introduced. WeChat account token remains the placeholder openshell:resolve:env:WECHAT_BOT_TOKEN, and tests assert placeholder usage rather than live credentials.
warning — Category 2: Input Validation and Data Sanitization: NEMOCLAW_WECHAT_CONFIG_B64 is decoded with safe JSON parsing, but accountId is used directly in a filename without an allowlist. The new installPath restoration accepts any non-empty string from existing config before appending it to plugin load paths.
pass — Category 3: Authentication and Authorization: No new endpoints or authorization decisions are introduced. Helper changes exercise sandbox-scoped CLI commands but do not change auth logic.
pass — Category 4: Dependencies and Third-Party Libraries: No new dependencies are added. The WeChat plugin spec remains pinned as @tencent-weixin/openclaw-weixin@2.4.2.
pass — Category 5: Error Handling and Logging: Malformed base64/JSON and missing/corrupt openclaw.json handling remains bounded. The baseline logs failure message includes only a truncated 200-character command output snippet and does not intentionally log known secrets.
pass — Category 6: Cryptography and Data Protection: Not applicable — no cryptographic operations are added or modified. Files containing placeholder account metadata continue to be written with mode 0600.
warning — Category 7: Configuration and Security Headers: The PR modifies OpenClaw plugin configuration and load paths. Restoring plugins.load.paths is functionally relevant, but appending an existing installPath without path allowlisting could amplify config tampering into plugin loading from an unexpected path. Dockerfile patching also touches SSRF/proxy and symlink install-path behavior and needs current-head E2E evidence.
warning — Category 8: Security Testing: Positive unit/helper coverage was added for command ordering and WeChat load path restoration. Missing negative tests include malicious accountId/path traversal and malicious installPath/load path tampering. Current-head required E2E evidence is missing.
warning — Category 9: Holistic Security Posture: The change improves recovery from OpenClaw config rewrites and keeps secrets as placeholders, but it touches sandbox/onboarding/plugin lifecycle code. Until CI/E2E complete and installPath/accountId validation risks are addressed or consciously accepted, overall posture remains a warning.
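
One way to close the installPath gap flagged under Categories 2 and 7 is a containment check before a restored path is appended to plugins.load.paths. The sketch below is an assumption-laden illustration: `is_trusted_install_path` is a hypothetical helper, and the set of trusted base directories would have to be chosen by the maintainers.

```python
from pathlib import Path
from typing import Iterable


def is_trusted_install_path(candidate: str, trusted_bases: Iterable[Path]) -> bool:
    """Return True only if candidate resolves to, or inside, a trusted base.

    Path.resolve() follows existing symlinks and collapses ".." segments,
    so traversal and symlink escapes are compared by their real location.
    """
    if not candidate or not Path(candidate).is_absolute():
        return False  # reject empty and relative paths outright
    resolved = Path(candidate).resolve()
    for base in trusted_bases:
        base_resolved = Path(base).resolve()
        if resolved == base_resolved or base_resolved in resolved.parents:
            return True
    return False
```

With a trusted base such as the state directory's `extensions` folder, a value like `/already/installed/openclaw-weixin` would be rejected unless that location is added to the trusted list deliberately.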

Test / E2E status

Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: Dockerfile, scripts/seed-wechat-accounts.py, sandbox status/log helper command shape, OpenClaw plugin loading, and messaging provider startup cannot be fully proven by local unit mocks.
E2E Advisor: missing
Required E2E jobs: cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, sandbox-operations-e2e
Missing for analyzed SHA: cloud-onboard-e2e, messaging-providers-e2e, channels-stop-start-e2e, sandbox-operations-e2e

✅ What looks good

The PR patches files that still exist on the branch and aligns with the stated nightly-failure scope, though same-file active PR overlap increases drift risk.
Commit 9060ac8 fixes the CodeRabbit sed-delimiter issue by switching the Patch 5 sed expression to a delimiter that does not conflict with regex alternation.
Dockerfile patch matching is more idempotent by accepting already-patched/newer OpenClaw shapes while still validating patched output.
Sandbox status/log helper changes now assert the sandbox-scoped nemoclaw <sandbox> status/logs command contract with mocks that fail on wrong argument ordering.
WeChat seeding continues to write placeholder tokens rather than live secrets and preserves stopped-channel behavior by avoiding openclaw.json activation when wechat is not active.
New tests cover preserving existing plugin load paths while appending the WeChat extension path, directly targeting the plugin-load regression.
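
The idempotent patch-then-validate pattern credited to the Dockerfile above can be illustrated in Python. This is an analogue for discussion, not the Dockerfile's sed pipeline: `patch_handshake_timeout` is a hypothetical function, and only the constant name and the values 1e4, 15e3, and 6e4 come from the review comments in this thread.

```python
import re

# Matches the unpatched shapes of the constant; the already-patched shape is left alone.
TIMEOUT_RE = re.compile(r"(DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = )(?:1e4|15e3)\b")
PATCHED = "DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4"


def patch_handshake_timeout(source: str) -> str:
    """Rewrite the timeout constant to 6e4 and verify the result.

    Safe to run repeatedly: patched input passes through unchanged, and
    input without the constant fails loudly instead of silently succeeding.
    """
    patched = TIMEOUT_RE.sub(r"\g<1>6e4", source)
    if PATCHED not in patched:
        raise RuntimeError("handshake timeout constant not found or not patched")
    return patched
```

Because Python's `re.sub` takes the pattern and replacement as separate arguments, the delimiter conflict that affected the sed expression cannot occur in this form.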

Review completeness

Review used provided trusted metadata and the supplied diff; no tests, package-manager commands, PR scripts, workflow dispatches, or network actions were executed.
Current-head CI and required E2E results were not complete in trusted metadata.
No linked issues were present in trusted metadata, so acceptance coverage is based on PR body clauses and trusted bot comments only.
PR title/body/comments and bot comments were treated as untrusted evidence except where included in deterministic trusted context.
Line numbers are approximate for findings derived from the supplied diff.
Active same-file PR overlaps indicate possible drift after rebase or merge-order changes.
Human maintainer review required: yes

github-actions · 2026-05-20T20:52:18Z

Selective E2E Results — ❌ Some jobs failed

Run: 26189141308
Target ref: refs/pull/3926/merge
Workflow ref: main
Requested jobs: onboard-negative-paths-e2e,sandbox-operations-e2e,messaging-providers-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job	Result
messaging-providers-e2e	❌ failure
onboard-negative-paths-e2e	❌ failure
sandbox-operations-e2e	❌ failure

Failed jobs: messaging-providers-e2e, onboard-negative-paths-e2e, sandbox-operations-e2e. Check run artifacts for logs.

github-actions · 2026-05-20T21:01:04Z

Selective E2E Results — ⚠️ All requested jobs were cancelled

Run: 26189185990
Target ref: 3aa7d86ed92e3b55e617856da17c1d3b326c6696
Workflow ref: main
Requested jobs: messaging-providers-e2e,channels-stop-start-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job	Result
channels-stop-start-e2e	⚠️ cancelled
messaging-providers-e2e	⚠️ cancelled

github-actions · 2026-05-20T21:01:48Z

Selective E2E Results — ❌ Some jobs failed

Run: 26189611348
Target ref: refs/pull/3926/merge
Workflow ref: main
Requested jobs: onboard-negative-paths-e2e,sandbox-operations-e2e,messaging-providers-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job	Result
messaging-providers-e2e	❌ failure
onboard-negative-paths-e2e	❌ failure
sandbox-operations-e2e	❌ failure

Failed jobs: messaging-providers-e2e, onboard-negative-paths-e2e, sandbox-operations-e2e. Check run artifacts for logs.

github-actions · 2026-05-20T21:04:01Z

Selective E2E Results — ❌ Some jobs failed

Run: 26189709547
Target ref: 1787f5470957c2c831e0a3087a2aa7d854a52d00
Workflow ref: main
Requested jobs: cloud-onboard-e2e,messaging-providers-e2e,channels-stop-start-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job	Result
channels-stop-start-e2e	❌ failure
cloud-onboard-e2e	❌ failure
messaging-providers-e2e	❌ failure

Failed jobs: channels-stop-start-e2e, cloud-onboard-e2e, messaging-providers-e2e. Check run artifacts for logs.

coderabbitai

🧹 Nitpick comments (1)

Dockerfile (1)
198-229: Run the Docker image E2E matrix before merge.

These OpenClaw patch assertions and the new health check only get meaningful coverage in a real container build, so I'd queue cloud-e2e, sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e on this branch before merging.

As per coding guidelines, "Dockerfile: This file affects the sandbox container image. Layer ordering, permissions, and baked config changes are only testable with a real container build."

Also applies to: 655-664
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 198 - 229, This change touches Dockerfile patches
around install-safe-path/install-package-dir (look for symbols like baseLstat,
install-safe-path, install-package-dir, assertInstallBaseStable) and the
handshake-timeout constant DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS, so before
merging build the actual container and run the full E2E matrix: queue cloud-e2e,
sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e for this branch,
verify the container build succeeds, then confirm the patched assertions
(lstat→stat and symlink check removal) and that
DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS is updated to 6e4 in the built image; only
merge after these E2E jobs pass.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 3aa7d86 and 1787f54.

📒 Files selected for processing (2)

Dockerfile
test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts

github-actions · 2026-05-20T21:06:56Z

Selective E2E Results — ❌ Some jobs failed

Run: 26189876470
Target ref: refs/pull/3926/merge
Workflow ref: main
Requested jobs: onboard-negative-paths-e2e,sandbox-operations-e2e,messaging-providers-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job	Result
messaging-providers-e2e	❌ failure
onboard-negative-paths-e2e	❌ failure
sandbox-operations-e2e	❌ failure

Failed jobs: messaging-providers-e2e, onboard-negative-paths-e2e, sandbox-operations-e2e. Check run artifacts for logs.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

Dockerfile (1)
177-229: E2E validation recommended for OpenClaw patching changes.

The Patch 3 and Patch 5 modifications alter baked OpenClaw behavior. Once the sed delimiter fix is applied, consider running the recommended E2E suite to validate the patched image works end-to-end.

As per coding guidelines: "This file affects the sandbox container image. Layer ordering, permissions, and baked config changes are only testable with a real container build." Recommended tests: cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 177 - 229, Patch 3 (install-safe-path /
install-package-dir edits) and Patch 5 (DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS
constant change) modify baked OpenClaw behavior and must be validated by running
the full E2E suites after applying the sed delimiter fix; update and test the
Dockerfile changes that touch install-safe-path/install-package-dir (search for
files matching install-safe-path-*.js and install-package-dir-*.js and symbols
assertInstallBaseStable, baseLstat) and the DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS
edits (search for DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS in
client.js/server.impl.js), rebuild the container image, and execute the
recommended E2E tests (cloud-e2e, sandbox-survival-e2e, hermes-e2e,
rebuild-openclaw-e2e) to confirm symlink handling and extended WS handshake
timeout behave correctly in real containers.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Dockerfile`:
- Line 227: The sed substitution in `printf ... | xargs sed -i -E
's|DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = (1e4|15e3)|DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4|g'`
is broken because the chosen delimiter `|` conflicts with the alternation
operator in the regex. Update the sed expression used with the hto_files
pipeline to use a delimiter that does not appear in the pattern (e.g., `#` or
`@`) so that the regex (DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = (1e4|15e3)) and
the replacement (DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4) are parsed
correctly. Leave the rest of the command (printf '%s\n' "$hto_files" | xargs
sed -i -E ...) unchanged.
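
The delimiter collision described above can be reproduced and fixed outside the Dockerfile. This is a sketch only: the sample input lines are hypothetical stand-ins for the patched OpenClaw files, and the substitution reads from stdin rather than using `xargs sed -i` so that it runs the same way on GNU and BSD sed.

```shell
#!/bin/sh
# Hypothetical sample content; the real files are not shown in this thread.
sample='const DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4;
const DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 15e3;'

# '#' never appears in the pattern or the replacement, so every '|' inside
# the parentheses is read as regex alternation instead of ending a field.
patched=$(printf '%s\n' "$sample" |
  sed -E 's#DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = (1e4|15e3)#DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4#g')

# Both lines now assign 6e4.
printf '%s\n' "$patched"
```

With `|` as the delimiter, sed would instead end the pattern at `(1e4` and treat `15e3)` as the replacement, which is why the original expression could not be parsed as intended.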

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2c9a6826-7ace-4c69-b500-0e443e69ab97

📥 Commits

Reviewing files that changed from the base of the PR and between 1787f54 and 439c7bc.

📒 Files selected for processing (1)

Dockerfile

coderabbitai

🧹 Nitpick comments (1)

Dockerfile (1)
225-229: Run the Dockerfile E2E matrix on this branch.

This sed fix looks good, but this patch still mutates baked OpenClaw assets inside the image. Please run the recommended cloud-e2e, sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e jobs before merge.

As per coding guidelines, Dockerfile: "This file affects the sandbox container image. Layer ordering, permissions, and baked config changes are only testable with a real container build." E2E test recommendation: cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 225 - 229, This change mutates baked OpenClaw assets
in the Dockerfile (the sed patch that updates
DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS), so before merging run the full
container-level E2E matrix to validate layer ordering and baked config: execute
cloud-e2e, sandbox-survival-e2e, hermes-e2e, and rebuild-openclaw-e2e against
this branch and confirm they pass; if any fail, revert or adjust the Dockerfile
patch and re-test until all four jobs succeed.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9be00d8e-4a64-41e7-806c-197a6b8b7b6b

📥 Commits

Reviewing files that changed from the base of the PR and between 439c7bc and 9060ac8.

📒 Files selected for processing (1)

Dockerfile

github-actions · 2026-05-20T21:11:31Z

Selective E2E Results — ❌ Some jobs failed

Run: 26189982508
Target ref: 439c7bc2c224877efcacdf2634c70085df2f56a6
Workflow ref: main
Requested jobs: cloud-onboard-e2e,messaging-providers-e2e,channels-stop-start-e2e,sandbox-operations-e2e
Summary: 0 passed, 3 failed, 0 skipped

Job	Result
channels-stop-start-e2e	❌ failure
cloud-onboard-e2e	⚠️ cancelled
messaging-providers-e2e	❌ failure
sandbox-operations-e2e	❌ failure

Failed jobs: channels-stop-start-e2e, messaging-providers-e2e, sandbox-operations-e2e. Check run artifacts for logs.

github-actions · 2026-05-20T21:33:43Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26190113439
Target ref: refs/pull/3926/merge
Workflow ref: main
Requested jobs: onboard-negative-paths-e2e,sandbox-operations-e2e,messaging-providers-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job	Result
messaging-providers-e2e	✅ success
onboard-negative-paths-e2e	✅ success
sandbox-operations-e2e	✅ success

github-actions · 2026-05-20T22:08:47Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26190220722
Target ref: 9060ac8a480aca2e269fac20de7b654c4248151b
Workflow ref: main
Requested jobs: cloud-onboard-e2e,channels-stop-start-e2e,sandbox-operations-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job	Result
channels-stop-start-e2e	✅ success
cloud-onboard-e2e	✅ success
sandbox-operations-e2e	✅ success

ericksoa added 3 commits May 20, 2026 12:49

test(e2e): use sandbox status command in scenario helpers

8e237ca

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

test(e2e): restore WeChat plugin load path

44faed1

test(e2e): use sandbox logs command in scenario helpers

3aa7d86

github-actions Bot mentioned this pull request May 20, 2026

fix(e2e): use sandbox subcommands in scenario suites #3927

Merged

ericksoa added 2 commits May 20, 2026 13:55

Merge remote-tracking branch 'origin/main' into fix/e2e-green-0520

d8e6965

fix(docker): make OpenClaw patch layer idempotent

1787f54

fix(docker): accept newer OpenClaw patch shapes

439c7bc

coderabbitai Bot reviewed May 20, 2026

Comment thread Dockerfile Outdated

fix(docker): avoid sed delimiter collision

9060ac8

coderabbitai Bot reviewed May 20, 2026

jyaunches approved these changes May 20, 2026

github-actions Bot mentioned this pull request May 20, 2026

Revert "ci(images): extend base image build timeout (#3881)" #3928

Closed

github-actions Bot mentioned this pull request May 20, 2026

refactor(cli): emit onboard session machine events #3849

Merged

12 tasks

ericksoa merged commit 379b501 into main May 20, 2026
28 checks passed

ericksoa deleted the fix/e2e-green-0520 branch May 20, 2026 21:34

coderabbitai Bot mentioned this pull request May 20, 2026

chore: upgrade agent runtime dependencies #3925

Open
