test(e2e): migrate security policy credential suites by jyaunches · Pull Request #3905 · NVIDIA/NemoClaw

jyaunches · 2026-05-20T12:51:46Z

Summary

Migrates the security policy and credential E2E checks into focused validation suites. Adds shared security assertion helpers, suite wiring, parity-map coverage, and framework tests for the new security credential, policy, shields, and injection suites.

Related Issue

Fixes #3815

Changes

Add test/e2e/validation_suites/lib/security_policy_credentials.sh for shared credential redaction, context, and policy assertion primitives.
Add focused security suite steps under test/e2e/validation_suites/security/{credentials,policy,shields,injection}/.
Register security-credentials, security-policy, security-shields, and security-injection in test/e2e/validation_suites/suites.yaml.
Update test/e2e/docs/parity-map.yaml to map migrated legacy security assertions to the new suites.
Extend E2E framework tests for stable assertion IDs and suite-runner behavior.
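
To make the "shared credential redaction" primitive concrete, here is a minimal sketch of a stdin-to-stdout redaction filter. The function name `example_redact_secret_text` and both patterns are illustrative assumptions; they are not taken from `security_policy_credentials.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical redaction filter: reads text on stdin and writes it to stdout
# with secret-looking values masked. The patterns are assumptions.

example_redact_secret_text() {
  # Bearer tokens are masked first so a "token: Bearer <value>" line cannot
  # leave the value behind after the key/value rule has run.
  sed -E \
    -e 's/(Bearer[[:space:]]+)[A-Za-z0-9._-]+/\1[REDACTED]/g' \
    -e 's/((api[_-]?key|API[_-]?KEY|token|TOKEN|password|PASSWORD|secret|SECRET)[[:space:]]*[=:][[:space:]]*)[^[:space:]]+/\1[REDACTED]/g'
}
```

A caller would pipe command output through the filter before logging it, for example `some_command 2>&1 | example_redact_secret_text`.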

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Verification run:

E2E_CONTEXT_DIR=<tmp> E2E_DRY_RUN=1 HOME=<tmp> test/e2e/runtime/run-suites.sh security-credentials security-policy security-shields security-injection passes with seeded context.
npm test -- test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts passes, 26 tests.
npm test -- test/e2e/scenario-framework-tests/e2e-suite-runner.test.ts test/e2e/scenario-framework-tests/e2e-parity-map.test.ts test/e2e/scenario-framework-tests/e2e-coverage-report.test.ts test/e2e/scenario-framework-tests/e2e-convention-lint.test.ts test/e2e/scenario-framework-tests/e2e-metadata-final-hygiene.test.ts passes, 32 tests.
npx prek run --all-files was attempted after the hook-compliance fix but failed on unrelated repo/environment issues: a missing nemoclaw/node_modules/json5 / plugin build artifact and multiple unrelated CLI timeout failures.
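
The dry-run verification above can be reproduced with a small wrapper that creates throwaway directories and seeds a context file first. This is a sketch under assumptions: the `KEY=value` layout of `context.env` and the seeded key name are hypothetical, and the suites themselves decide which keys they actually require.

```bash
#!/usr/bin/env bash
# Sketch: prepare an isolated, seeded context and run the security suites in
# dry-run mode. The context.env layout and the seeded key are assumptions.

example_seed_context() {
  local ctx_dir
  ctx_dir="$(mktemp -d)"
  # Hypothetical key, used only to show the seeding step.
  printf 'E2E_EXAMPLE_SANDBOX_NAME=%s\n' "dry-run-sandbox" > "${ctx_dir}/context.env"
  printf '%s\n' "${ctx_dir}"
}

example_run_security_suites_dry() {
  local ctx_dir="$1" fake_home
  local runner="test/e2e/runtime/run-suites.sh"
  if [ ! -x "${runner}" ]; then
    echo "runner not found at ${runner}; run from the repository root" >&2
    return 1
  fi
  # A throwaway HOME keeps the run away from real host credential files.
  fake_home="$(mktemp -d)"
  E2E_CONTEXT_DIR="${ctx_dir}" E2E_DRY_RUN=1 HOME="${fake_home}" \
    "${runner}" security-credentials security-policy security-shields security-injection
}
```

From the repository root, `example_run_security_suites_dry "$(example_seed_context)"` would mirror the command quoted in the verification run.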

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

Summary by CodeRabbit

Tests
- Added and extended e2e tests for credential handling, redaction, dry-run behavior, and credential-leak prevention.
- Added checks ensuring no plaintext host credential store and that gateway credential listings redact values.
- Introduced new security validation scenarios for injection, policy presets, shields, and openshell compatibility.
- Refactored credential verification to use a shared validation library and updated suite configurations to the post-onboard credential assertions.

coderabbitai · 2026-05-20T12:51:59Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3c55ebef-bb2f-4e3c-a081-1583a4c7d9e7

📥 Commits

Reviewing files that changed from the base of the PR and between c5fc179 and e114b68.

📒 Files selected for processing (5)

test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts
test/e2e/validation_suites/lib/security_policy_credentials.sh
test/e2e/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh
test/e2e/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh
test/e2e/validation_suites/security/shields/00-config-consistent.sh

📝 Walkthrough


Adds a reusable Bash validation library, new e2e validation scripts (credentials, injection, policy, shields) that use it, updates suites.yaml to wire those steps, refactors credentials-present to call the library, adds Vitest tests, and updates parity-map assertion IDs for credential checks.

Changes

E2E Security Policy & Credential Coverage Migration

| Layer / File(s) | Summary |
|---|---|
| Core validation library `test/e2e/validation_suites/lib/security_policy_credentials.sh` | New Bash library with idempotent sourcing and helpers: `spc_assertion_id()`, `spc_require_context()`, `spc_context_get()`, `spc_redact_secret_text()`, `spc_log_provider_metadata()`, `spc_assert_credentials_expected()`, `spc_assert_no_plaintext_host_store()`, `spc_assert_policy_preset_present()`, `spc_assert_openshell_credential_rewrite_supported()`, shields and injection helpers; includes dry-run handling and secret redaction. |
| Validation script implementations `test/e2e/validation_suites/security/credentials/00-credentials-present.sh`, `test/e2e/validation_suites/security/credentials/01-no-plaintext-host-store.sh`, `test/e2e/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh`, `test/e2e/validation_suites/security/policy/00-telegram-preset-applied.sh`, `test/e2e/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh`, `test/e2e/validation_suites/security/shields/00-config-consistent.sh` | Scripts source the shared library and invoke its assertions: credentials-present delegates to `spc_assert_credentials_expected`; adds `no-plaintext-host-store` check; injection script validates Telegram payload handling without shell evaluation; policy and shields scripts emit assertion IDs and call library helpers. |
| Suite orchestration and refactoring `test/e2e/validation_suites/suites.yaml` | Inlines previously shared anchors for `security-credentials`, `security-shields`, `security-policy`, and `security-injection`; adds `no-plaintext-host-store` step to the credentials suite and wires the new validation scripts into suites. |
| Tests and parity mapping `test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts`, `test/e2e/scenario-framework-tests/e2e-suite-runner.test.ts`, `test/e2e/docs/parity-map.yaml` | Adds Vitest tests for helper success/failure, redaction, and openshell marker detection; adds a suite-runner test for `security-credentials` dry-run output asserting redaction and absence of plaintext host-store messages; updates parity-map.yaml replacing legacy credential assertions with `post-onboard.credentials.gateway-list-redacts-values` and `post-onboard.credentials.no-plaintext-host-store` and metadata. |
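
The "idempotent sourcing" noted for the core library is usually achieved with a load guard. The sketch below shows one way to do that alongside an assertion-ID emitter; the guard variable, the function names, and the `ASSERTION_ID:` output prefix are assumptions for illustration and do not describe the library's real internals.

```bash
#!/usr/bin/env bash
# Sketch of an idempotent library header. All names here are hypothetical.

example_spc_lib_init() {
  if [ -n "${EXAMPLE_SPC_LIB_LOADED:-}" ]; then
    return 0 # already initialised; loading the library again changes nothing
  fi
  EXAMPLE_SPC_LIB_LOADED=1
  EXAMPLE_SPC_INIT_COUNT=$(( ${EXAMPLE_SPC_INIT_COUNT:-0} + 1 ))
}

# Emits a stable assertion ID so the runner and parity map can correlate output.
example_assertion_id() {
  printf 'ASSERTION_ID: %s\n' "$1"
}

example_spc_lib_init
```

Wrapping the guard in a function avoids a top-level `return`, which only works when the file is sourced rather than executed.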

Sequence Diagram

sequenceDiagram
  participant ValidationScript as validation script
  participant SharedLib as lib/security_policy_credentials.sh
  participant HostFS as Host FS (~/.nemoclaw/)
  participant Stdout as Test Output

  ValidationScript->>SharedLib: source library (guard check)
  ValidationScript->>Stdout: emit assertion ID
  ValidationScript->>SharedLib: call assertion (e.g., spc_assert_no_plaintext_host_store)
  SharedLib->>HostFS: check ~/.nemoclaw/credentials.json (exists/read)
  SharedLib->>Stdout: emit redacted assertion result

  participant InjectionScript as injection script
  participant Context as $E2E_CONTEXT_DIR/context.env

  InjectionScript->>SharedLib: source library
  InjectionScript->>SharedLib: spc_context_get E2E_TELEGRAM_PAYLOAD_FIXTURE
  SharedLib->>Context: read context variable
  InjectionScript->>Stdout: log payload size and dry-run notice
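
The host-store check in the first half of the diagram could look roughly like the following. This is a sketch, not the library's code: the function name is hypothetical, and treating the mere existence of `credentials.json` as a failure is an assumption.

```bash
#!/usr/bin/env bash
# Sketch: fail when a plaintext credential file exists under ~/.nemoclaw/.
# Treating existence alone as a failure is an assumption of this sketch.

example_assert_no_plaintext_host_store() {
  local home_dir="${1:-$HOME}"
  local store="${home_dir}/.nemoclaw/credentials.json"
  if [ -e "${store}" ]; then
    # Report the path only; never print the file contents.
    echo "FAIL: plaintext credential store present at ${store}" >&2
    return 1
  fi
  echo "PASS: no plaintext credential store under ${home_dir}/.nemoclaw"
}
```

Passing the home directory as an argument keeps the check easy to exercise against a temporary directory in tests.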

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

NVIDIA/NemoClaw#3800: Related edits to parity bookkeeping and CI/lint tooling around parity inventory and coverage.

Suggested labels

Integration: OpenClaw

Suggested reviewers

cv

Poem

🐰 I hopped through scripts at break of dawn,

Redacting secrets till they're gone.
Suites now call the library neat,
Assertions stable, tests complete.
A carrot-coded CI treat.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly summarizes the main change: migrating security policy credential E2E suites into the layered scenario framework. |
| Linked Issues check | ✅ Passed | The PR fulfills all primary objectives from `#3815`: domain primitives added, focused suite steps created, suites registered, parity-map updated with stable IDs, and framework tests extended. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the migration goal. Test files, shared library, suite configurations, and framework tests are all scoped to the credential/policy/shields/injection E2E migration. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


github-actions · 2026-05-20T12:52:20Z

PR Review Advisor

Recommendation: blocked
Confidence: high
Analyzed HEAD: e114b68d95c2c2f154ee387f92d1428caef45561
Findings: 1 blocker(s), 4 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Review used the provided trusted metadata and diff; no scripts, package-manager commands, or tests were executed. The parity-map diff is truncated, so full classification of every legacy assertion could not be independently verified. Issue #3815 has no comments in the provided context; acceptance coverage is based on the issue body only. E2E Advisor content was found, but the latest current-head E2E recommendation check is still in progress, so current-head E2E advisor status is ambiguous. CI, mergeability, and review-thread state are point-in-time from the provided GitHub context and may change after this review.

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: e114b68d95c2c2f154ee387f92d1428caef45561
Recommendation: blocked
Confidence: high

Blocked by current hard gates: CI is still pending, GitHub mergeStateStatus is BLOCKED, and 2 review threads remain unresolved; the current patch improves several security suite assertions but still needs human review of migrated E2E semantics.

Gate status

CI: pending — Trusted GitHub rollup for e114b68 shows 12 pending/in-progress/queued contexts, including E2E recommendation, wsl-e2e, macos-e2e, PR review advisor, CodeQL, unit-vitest-linux, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, checks, and CodeRabbit.
Mergeability: fail — GitHub GraphQL reports mergeStateStatus=BLOCKED for PR #3905 (test(e2e): migrate security policy credential suites) at head e114b68.
Review threads: fail — Trusted GraphQL reports 2 unresolved review thread(s): injection check enforcement and shields consistency check enforcement remain unresolved despite follow-up comments.
Risky code tested: warning — Risky areas detected: credentials/inference/network. The PR changes security E2E assertions and adds targeted framework tests, but current-head CI has not completed and semantic live coverage still depends on maintainer review.

🔴 Blockers

Current head is not merge-ready under hard gates: The latest SHA is blocked by pending CI, GitHub mergeStateStatus=BLOCKED, and unresolved review threads. These are hard gates independent of code quality.
- Recommendation: Wait for required checks on e114b68 to complete successfully, confirm branch protection is green, and resolve or consciously disposition the remaining review threads before treating the PR as merge-ready.
- Evidence: Trusted gateStatus: ci=pending with 12 pending contexts; mergeability=fail with mergeStateStatus=BLOCKED; reviewThreads=fail with 2 unresolved review thread(s).

🟡 Warnings

Telegram injection assertion does not exercise the real Telegram or assistant path (test/e2e/validation_suites/lib/security_policy_credentials.sh:220): The new helper now performs a non-dry-run assertion, but it submits a command-substitution payload to a sandbox shell that echoes stdin and checks for a marker file. That proves this helper path treats stdin literally, but it does not submit a Telegram message through the messaging provider, bridge, assistant, or OpenClaw path that the stable ID post-onboard.security-injection.telegram-message-not-shell-executed appears to represent.
- Recommendation: Either rename/classify the assertion to match the lower-level sandbox literal-payload check, or add a real messaging/Telegram-path test that submits the fixture through the standard provider bridge and verifies no shell side effects or unsafe response occur. Keep dry-run behavior separate from non-dry-run enforcement.
- Evidence: spc_assert_telegram_payload_not_shell_executed constructs payload='$(touch ... && echo INJECTED)', runs openshell sandbox exec ... sh -c 'MSG=$(cat); printf "%s\n" "$MSG"', and checks a marker file. The changed security/injection/00-telegram-message-not-shell-executed.sh delegates directly to this helper. GraphQL still shows the related injection review thread unresolved.
Credential redaction assertion validates redacted output but not raw-output non-disclosure (test/e2e/validation_suites/lib/security_policy_credentials.sh:63): spc_assert_credentials_expected captures nemoclaw credentials list output after piping it through spc_redact_secret_text. This is good for log safety, but it means the assertion cannot detect whether the CLI emitted a raw API key or token before redaction. The E2E Advisor also recommended a protected raw-output check for this case.
- Recommendation: For non-dry-run security validation, capture raw CLI output to a protected temp file, assert that it contains expected provider metadata and no secret-looking values, then print only redacted output to logs. Ensure temp files are permission-restricted and removed or collected only as safe artifacts.
- Evidence: Line 63 captures listed="$(nemoclaw credentials list 2>&1 | spc_redact_secret_text)"; E2E Advisor suggested a credentials-security-validation check that verifies raw command output does not contain API key/token/secret patterns before redaction.
OpenShell credential-rewrite support check relies on binary string markers (test/e2e/validation_suites/lib/security_policy_credentials.sh:113): The OpenShell capability assertion now enforces something in non-dry-run mode, but it checks for literal strings in the local openshell binary. That is brittle and may not correspond to the actual gateway/OpenShell version, negotiated capability, or runtime support used by the sandbox.
- Recommendation: Prefer a canonical capability/version query from OpenShell or NemoClaw gateway metadata. If binary string inspection is only a fallback, document that limitation and add a test for the canonical path.
- Evidence: spc_assert_openshell_credential_rewrite_supported runs strings "${openshell_bin}" and searches for request-body-credential-rewrite and websocket-credential-rewrite; issue #3815 (test(e2e): migrate security policy and credential coverage) listed test-openshell-version-pin.sh coverage to absorb.
Shared E2E metadata has active overlap and drift risk (test/e2e/docs/parity-map.yaml:1092): The patch applies to existing files, but it modifies shared E2E metadata and framework test files with multiple active overlapping PRs. Merge order could change parity classifications or suite wiring.
- Recommendation: Before final merge, refresh against main and compare overlapping PRs that touch parity-map.yaml, suites.yaml, and e2e-lib-helpers.test.ts so the migrated security coverage remains consistent.
- Evidence: Trusted overlap data lists PR #3903 (test(e2e): migrate inference routing provider coverage) touching parity-map.yaml, e2e-lib-helpers.test.ts, and suites.yaml; PRs #3819 (test(e2e): remove parity report workflow), #3830 (chore: bump OpenShell pin to 0.0.44), #3880 (fix(e2e): treat ERR_PROXY_TUNNEL as proxy wiring success in M12 test), and #3925 (chore: upgrade agent runtime dependencies) touching parity-map.yaml; and PR #3927 (fix(e2e): use sandbox subcommands in scenario suites) touching e2e-lib-helpers.test.ts.
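
The lower-level check described in the first warning can be reproduced locally to see exactly what it does and does not prove. The sketch below runs in a plain local `sh` rather than through `openshell sandbox exec`, and the function name is hypothetical. Like the helper under review, it only shows that a shell reading stdin treats the payload as data; it says nothing about the Telegram, bridge, or assistant path.

```bash
#!/usr/bin/env bash
# Sketch: confirm that a command-substitution payload read from stdin is
# treated as data. Runs in a local sh, not in an OpenShell sandbox.

example_assert_payload_not_executed() {
  local workdir marker payload echoed
  workdir="$(mktemp -d)"
  marker="${workdir}/injected.marker"
  # Single quotes keep the $(...) literal while the payload is assembled.
  payload='$(touch '"${marker}"' && echo INJECTED)'

  # The child shell reads stdin into a variable and prints it back.
  # Expanding "$MSG" does not re-evaluate the $(...) it contains.
  echoed="$(printf '%s' "${payload}" | sh -c 'MSG=$(cat); printf "%s\n" "$MSG"')"

  if [ -e "${marker}" ]; then
    echo "FAIL: payload was executed (marker file created)" >&2
    rm -rf "${workdir}"
    return 1
  fi
  if [ "${echoed}" != "${payload}" ]; then
    echo "FAIL: payload was altered in transit" >&2
    rm -rf "${workdir}"
    return 1
  fi
  rm -rf "${workdir}"
  echo "PASS: payload handled as literal text"
}
```

Checking both the marker file and the echoed text catches two failure modes: execution of the payload, and partial expansion that changes the text without creating the marker.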

🔵 Suggestions

None.

Acceptance coverage

unknown — Parent epic #3588 (Implement layered E2E scenario model): The PR links "Fixes #3815" (test(e2e): migrate security policy and credential coverage). No direct evidence was provided for the closure or status of parent epic #3588.
partial — Migrate the security-policy-credentials E2E coverage area into the layered scenario framework without porting legacy scripts line-for-line. Add the missing primitive layer first, then move assertions into scenario plans/suites with stable IDs.: Adds test/e2e/validation_suites/lib/security_policy_credentials.sh, security suite scripts, stable IDs, and suite entries. Coverage is partially satisfied, but live semantics for Telegram injection and raw credential leak detection still need maintainer review.
unknown — test-network-policy.sh: Listed as legacy/current coverage to absorb. The provided diff excerpt does not show explicit migrated, deferred, or retired mapping for this script.
partial — test-shields-config.sh: Adds security-shields and post-onboard.security-shields.config-consistent; helper now queries nemoclaw <sandbox> shields status and checks config owner/mode, but the related review thread is still unresolved and current-head CI is pending.
partial — test-credential-migration.sh: Parity-map remaps credential list and no-plaintext host-store assertions to post-onboard.credentials.*; helper asserts gateway header and no plaintext host credentials.json. Some migration assertions remain deferred/retired in the visible diff.
partial — test-credential-sanitization.sh: Adds redaction helper and tests that synthetic secrets are not logged. However raw nemoclaw credentials list output is redacted before inspection, so raw CLI leak detection is not fully proven.
partial — test-telegram-injection.sh: Adds security-injection/00-telegram-message-not-shell-executed.sh and stable ID. The current helper checks a sandbox stdin shell path, not a real Telegram/messaging/assistant path.
unknown — test-gateway-drift-preflight.sh: No explicit migrated, deferred, or retired evidence for this script is visible in the provided diff excerpt.
unknown — test-gateway-health-honest.sh: No explicit migrated, deferred, or retired evidence for this script is visible in the provided diff excerpt.
partial — test-openshell-version-pin.sh: Adds post-onboard.gateway.openshell-version-supports-credential-rewrite and non-dry-run marker checks, but the implementation uses binary string inspection rather than canonical version/capability metadata.
met — Add or extend the domain primitive library: test/e2e/validation_suites/lib/security_policy_credentials.sh.: New file test/e2e/validation_suites/lib/security_policy_credentials.sh added with context, redaction, credential, policy, OpenShell capability, shields, and injection helper functions.
met — Helpers must consume $E2E_CONTEXT_DIR/context.env; suites must not reinstall, onboard, or rediscover setup state.: Helpers wrap e2e_context_require/e2e_context_get, tests seed E2E_CONTEXT_DIR/context.env, and suite scripts source the helper library rather than reinstalling/onboarding.
met — Add/extend suite family entries in test/e2e/validation_suites/suites.yaml.: suites.yaml adds or updates security-credentials, security-policy, security-shields, and security-injection entries.
met — Add onboarding profiles/test plans/onboarding assertions only when the behavior belongs before expected-state validation.: No onboarding profiles or plans are changed; this PR confines changes to validation suites, helper tests, and parity metadata.
partial — Emit stable assertion IDs using <layer>.<domain>.<behavior>.: Scripts emit stable IDs such as post-onboard.credentials.gateway-list-redacts-values, post-onboard.security-policy.telegram-preset-applied, and post-onboard.security-injection.telegram-message-not-shell-executed. post-onboard.gateway.openshell-version-supports-credential-rewrite should be confirmed against the intended domain taxonomy.
partial — Update test/e2e/docs/parity-map.yaml metadata with layer, gap_domain, owner, and runner/secret requirements where applicable.: Visible changes add layer, gap_domain, and owner to remapped credential assertions. The truncated excerpt does not prove all affected legacy assertions have complete metadata.
unknown — Preserve compatibility with existing run-scenario.sh <id> --plan-only behavior.: The diff does not modify run-scenario.sh. Provided tests cover suite runner dry-run behavior but do not show a plan-only test specific to the new security suites.
met — Domain primitive helpers exist and are used by migrated suite steps.: The new suite scripts under security/credentials, security/policy, security/shields, and security/injection source security_policy_credentials.sh and delegate to its helper functions.
partial — At least the highest-value assertions from the listed legacy coverage are mapped to stable scenario assertion IDs.: High-value credentials, policy, shields, OpenShell capability, and injection IDs are present. Some mappings are semantically partial, especially Telegram injection and raw credential leak detection.
partial — Remaining legacy assertions are explicitly classified as deferred or retired with layer/domain metadata.: The visible parity-map excerpt includes many deferred and retired entries, but the diff is truncated and not all visible entries include layer/domain metadata.
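
The `<layer>.<domain>.<behavior>` convention referenced above lends itself to a mechanical check. The following is a sketch only: the function name is hypothetical, and the allowed character set (lowercase letters, digits, and hyphens in each segment) is an assumption inferred from the IDs quoted in this review.

```bash
#!/usr/bin/env bash
# Sketch: validate that an assertion ID has exactly three dot-separated
# segments. The permitted characters are an assumption.

example_is_stable_assertion_id() {
  local segment='[a-z0-9]+(-[a-z0-9]+)*'
  printf '%s\n' "$1" | grep -Eq "^${segment}\.${segment}\.${segment}\$"
}
```

A lint step could loop over the IDs in the parity map and report any that fail this shape check.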

Security review

pass — Secrets and Credentials: No committed real secrets were identified in the diff; new test strings are synthetic, and helper output redacts common token/API-key/password forms. Credential security coverage still has a testing-depth warning captured separately.
warning — Input Validation and Data Sanitization: The PR adds redaction and an injection payload assertion, but the Telegram injection assertion currently exercises a sandbox stdin echo path rather than the true Telegram/messaging path, limiting confidence against command-injection regressions.
pass — Authentication and Authorization: Not applicable — the change modifies E2E validation scripts, tests, and metadata, not runtime authn/authz endpoints or token validation.
pass — Dependencies and Third-Party Libraries: No new dependencies, package-manager files, registries, or version pins are introduced.
warning — Error Handling and Logging: Redaction before logging is positive, but nemoclaw credentials list output is inspected after redaction, so the new assertion cannot prove the raw CLI did not leak a secret before redaction.
pass — Cryptography and Data Protection: Not applicable — no cryptographic operations, key generation, hashing, encryption, TLS, or data-protection primitives are changed.
warning — Configuration and Security Headers: No HTTP headers or container configs are changed. The new shields/config assertion is useful but relies on status-string parsing and owner/mode heuristics, and the related review thread remains unresolved.
fail — Security Testing: The PR is specifically migrating high-risk security E2E coverage, but current-head CI is pending, two review threads remain unresolved, and the Telegram injection assertion does not exercise the real Telegram/messaging path represented by its stable ID.
warning — Holistic Security Posture: Production runtime code is not modified, which limits direct exploit risk. However, migrated parity metadata can create a false sense of complete security coverage if semantically partial tests are recorded as covering legacy security assertions.
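
The logging finding above turns on the order of operations: inspect the raw output first, then redact for display. A minimal sketch of that ordering follows. The function name and the secret patterns (common token prefixes and Bearer values) are assumptions, not the advisor's or the library's definitions.

```bash
#!/usr/bin/env bash
# Sketch: inspect raw command output for secret-looking values before any
# redaction, without ever printing the raw text. Patterns are assumptions.

example_assert_raw_output_has_no_secrets() {
  local raw status=0
  # Owner-only permissions for the temp file that briefly holds raw output.
  raw="$(umask 077 && mktemp)"
  "$@" > "${raw}" 2>&1 || status=$?

  if [ "${status}" -ne 0 ]; then
    echo "FAIL: command exited with status ${status}" >&2
  elif grep -Eq '(^|[^A-Za-z0-9])(nvapi-|sk-|ghp_)[A-Za-z0-9_-]{8,}|Bearer[[:space:]]+[A-Za-z0-9._-]{8,}' "${raw}"; then
    echo "FAIL: raw output contains a secret-looking value" >&2
    status=1
  else
    echo "PASS: raw output contains no secret-looking values"
  fi
  # Remove the raw capture on every path so it never becomes an artifact.
  rm -f "${raw}"
  return "${status}"
}
```

In a live run this would wrap the command named in the review, for example `example_assert_raw_output_has_no_secrets nemoclaw credentials list`, with redacted output printed separately for the logs.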

Test / E2E status

Test depth: e2e_required — Although this is test/metadata-only code, it changes security E2E coverage for credentials, policy, shields, and injection. Unit/dry-run tests and fakes cover helper behavior, but they cannot fully prove that the migrated live assertions catch regressions in a real OpenShell/NemoClaw sandbox. Current-head CI is also still pending.
E2E Advisor: ambiguous
Missing for analyzed SHA: E2E recommendation check for e114b68d95c2c2f154ee387f92d1428caef45561 is still IN_PROGRESS

✅ What looks good

The patch is scoped to E2E validation suites, scenario framework tests, and parity metadata; no production sandbox/runtime code is modified.
The shared security helper now centralizes assertion IDs, context access, redaction, credential, policy, OpenShell capability, shields, and injection primitives.
Several earlier placeholder/no-op assertions were improved in the current head with non-dry-run checks and targeted framework tests.
New shell scripts include SPDX headers and use set -euo pipefail.
Suite wiring separates security-credentials, security-policy, security-shields, and security-injection into focused validation suites.
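
For orientation, a suite step with the properties listed above (SPDX header, `set -euo pipefail`, and an enforcing path outside dry-run mode) might be laid out as follows. This is a sketch: the function names are hypothetical, the SPDX lines mirror the header quoted elsewhere in this thread, and reading `E2E_DRY_RUN=1` as the dry-run switch is an assumption based on the verification command in the PR description.

```bash
#!/usr/bin/env bash
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Sketch of a suite step layout; function names are hypothetical.
set -euo pipefail

# Assumption: dry-run mode is signalled by E2E_DRY_RUN=1.
example_is_dry_run() {
  [ "${E2E_DRY_RUN:-0}" = "1" ]
}

# Prints the assertion ID, then runs the remaining arguments as the enforcing
# check unless dry-run mode is active.
example_step() {
  local id="$1"
  shift
  echo "assertion: ${id}"
  if example_is_dry_run; then
    echo "dry-run: skipping live check"
    return 0
  fi
  "$@"
}
```

Keeping the dry-run branch and the enforcing branch in one place makes it harder for a step to become a silent no-op in live runs.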

Review completeness

Review used the provided trusted metadata and diff; no scripts, package-manager commands, or tests were executed.
The parity-map diff is truncated, so full classification of every legacy assertion could not be independently verified.
Issue #3815 (test(e2e): migrate security policy and credential coverage) has no comments in the provided context; acceptance coverage is based on the issue body only.
E2E Advisor content was found, but the latest current-head E2E recommendation check is still in progress, so current-head E2E advisor status is ambiguous.
CI, mergeability, and review-thread state are point-in-time from the provided GitHub context and may change after this review.
Human maintainer review required: yes

github-actions · 2026-05-20T12:53:38Z

E2E Advisor Recommendation

Required E2E: None
Optional E2E: scenario-runner-security-suites-ubuntu-repo-cloud-openclaw, scenario-runner-messaging-telegram-ubuntu-repo-cloud-openclaw

Dispatch hint: scenario=ubuntu-repo-cloud-openclaw suite_filter=security-credentials,security-policy,security-shields,security-injection,messaging-telegram

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

None. No merge-blocking E2E is required because this PR touches E2E tests, validation-suite scripts, and E2E parity metadata only; it does not modify installer, onboarding, sandbox lifecycle, credentials implementation, policy enforcement, inference routing, deployment, or assistant runtime/user-flow code.

Optional E2E

scenario-runner-security-suites-ubuntu-repo-cloud-openclaw (medium): Optional confidence check for the changed validation suite wiring and new security scripts in a real scenario context. Runs the existing Scenario Runner against the suites directly touched by this PR.
scenario-runner-messaging-telegram-ubuntu-repo-cloud-openclaw (medium): Optional adjacent check because the new security-injection suite overlaps Telegram message injection behavior already covered by the existing messaging-telegram suite.

New E2E recommendations

None.

Dispatch hint

Workflow: .github/workflows/e2e-scenarios.yaml
jobs input: scenario=ubuntu-repo-cloud-openclaw suite_filter=security-credentials,security-policy,security-shields,security-injection,messaging-telegram

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

test/e2e/validation_suites/suites.yaml (1)
1-1: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add SPDX license header to this YAML file.

This file is missing the required SPDX copyright/license header.
Suggested patch
+ # SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ # SPDX-License-Identifier: Apache-2.0
+
  suites:
As per coding guidelines, `Every source file must include an SPDX license header for copyright and Apache-2.0 license`.
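
A header requirement like this one can be checked mechanically. The sketch below reports whether a file carries both SPDX tags near the top; the function name is hypothetical, and limiting the search to the first five lines is an assumption.

```bash
#!/usr/bin/env bash
# Sketch: succeed only if both SPDX tags appear in the first lines of a file.
# The five-line window is an assumption of this sketch.

example_has_spdx_header() {
  local file="$1" head_text
  head_text="$(head -n 5 "${file}")"
  printf '%s\n' "${head_text}" | grep -q 'SPDX-FileCopyrightText:' &&
    printf '%s\n' "${head_text}" | grep -q 'SPDX-License-Identifier: Apache-2.0'
}
```

Running `example_has_spdx_header test/e2e/validation_suites/suites.yaml` before and after the suggested patch would show the finding clearing.
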
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/validation_suites/suites.yaml` at line 1, This YAML file is missing
the required SPDX license header; add the standard SPDX header comment at the
top of the file (before the top-level key "suites:") containing the copyright
owner and the Apache-2.0 license identifier (e.g., "SPDX-FileCopyrightText:
<copyright holder>" and "SPDX-License-Identifier: Apache-2.0") so the header
precedes the existing "suites:" entry.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/validation_suites/lib/security_policy_credentials.sh`:
- Around line 42-63: spc_assert_credentials_expected currently only runs
"nemoclaw credentials list" and succeeds on exit code rather than asserting that
at least one gateway credential is present; modify
spc_assert_credentials_expected to capture the command output (after piping
through spc_redact_secret_text), and if the expected state is "present" verify
the output contains at least one credential entry (e.g., non-empty output or
matches the gateway credential entry pattern) and return non-zero with a clear
error message if none found; keep the dry-run path and existing context checks
(spc_require_context, spc_log_provider_metadata) unchanged and use the existing
assertion id "post-onboard.credentials.gateway-list-redacts-values" to maintain
semantics.
- Around line 76-84: spc_assert_policy_preset_present is currently a no-op in
non-dry runs; update it so after calling spc_assertion_id and
spc_require_context it actually verifies the preset and fails the test if
missing: when e2e_env_is_dry_run is true keep the current echo, otherwise query
the system for the applied policy preset (using the same CLI/API the suite uses
for other assertions), compare the result to the expected "${preset}", and if
they differ call the test failure helper (or exit non‑zero) so the suite fails;
keep function name spc_assert_policy_preset_present and existing calls to
spc_assertion_id and e2e_env_is_dry_run.

In
`@test/e2e/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh`:
- Around line 8-12: The test currently only logs the payload and exits; update
the script so that when e2e_env_is_dry_run returns false it actually submits the
payload (use the existing payload variable) to the system under test (e.g. via
your standard test helper or an HTTP POST) and then assert expected safe
behavior instead of just printing length: call the appropriate assertion helper
(e.g. spc_assert_* or a new assertion) to fail if the response or side-effect
shows shell-evaluated output or unexpected command execution; keep
spc_assertion_id and spc_require_context intact and ensure failures produce a
non-zero exit so regressions are enforced.

In
`@test/e2e/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh`:
- Around line 8-10: The script currently only echoes a message when
e2e_env_is_dry_run is true and never verifies OpenShell's capability in real
runs. Update the block after spc_require_context to perform an actual capability
check when e2e_env_is_dry_run is false: call the appropriate gateway-capability
lookup/validation routine (e.g., the project helper that queries gateway
metadata or a function you add such as verify_openshell_capability_support) to
assert that the OpenShell gateway advertises "credential-rewrite" support, and
exit non-zero (failing the test) if the capability or required version is
absent. Keep the dry-run echo for e2e_env_is_dry_run, but replace the no-op path
with the real check described above.
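
If the capability is gated on a minimum version, the check reduces to a version comparison. The sketch below illustrates that case only. Both version strings are caller-supplied placeholders rather than real OpenShell release numbers, the function names are invented for illustration, and `version_at_least` relies on a `sort` that supports `-V` (GNU coreutils does).

```shell
#!/usr/bin/env bash
# Sketch only: function names and versions are illustrative placeholders.
e2e_env_is_dry_run() { [ "${E2E_DRY_RUN:-0}" = "1" ]; }  # assumed behavior

# True when $1 >= $2 in version order. Relies on `sort -V`.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

spc_assert_openshell_min_version() {
  local have="$1" min="$2"
  if e2e_env_is_dry_run; then
    echo "dry-run: would require OpenShell ${min} or newer"
    return 0
  fi
  # An empty detected version counts as a failure, not a skip.
  if [ -z "${have}" ] || ! version_at_least "${have}" "${min}"; then
    echo "FAIL: OpenShell '${have:-<unknown>}' does not meet minimum ${min}" >&2
    return 1
  fi
  echo "PASS: OpenShell ${have} meets minimum ${min}"
}
```

Treating an undetectable version as a failure keeps the assertion from silently passing when the lookup itself breaks.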

In `@test/e2e/validation_suites/security/shields/00-config-consistent.sh`:
- Around line 8-10: The test is non-enforcing: it only prints a dry-run message
and never compares shield configuration. Update the script to perform an actual
consistency check after the existing spc_assertion_id and spc_require_context
calls. Use the runtime helper e2e_env_is_dry_run to skip real verification only
in dry-run mode; otherwise load the expected shields config (e.g. from the test
fixture or canonical source), fetch the currently deployed shields config (via
the appropriate CLI/helper used elsewhere in the suite), perform a deterministic
comparison, and fail the script (non-zero exit and/or the suite's assert helper)
when differences are detected so regressions are caught.
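
A deterministic comparison can be sketched as a diff over normalized inputs. The example below assumes both configs are already available as flat, line-oriented files, so that sorting lines is a safe normalization. How the deployed config is fetched is left out, and `spc_assert_shields_config_consistent` is an invented name.

```shell
#!/usr/bin/env bash
# Sketch only: assumes flat, line-oriented config files (e.g. key=value).
e2e_env_is_dry_run() { [ "${E2E_DRY_RUN:-0}" = "1" ]; }  # assumed behavior

spc_assert_shields_config_consistent() {
  local expected_file="$1" actual_file="$2"
  if e2e_env_is_dry_run; then
    echo "dry-run: would compare shields config against expected"
    return 0
  fi
  if [ ! -r "${expected_file}" ] || [ ! -r "${actual_file}" ]; then
    echo "FAIL: expected or actual shields config is not readable" >&2
    return 1
  fi
  # Sorting under a fixed locale makes the result independent of line order.
  if diff -u <(LC_ALL=C sort "${expected_file}") \
             <(LC_ALL=C sort "${actual_file}"); then
    echo "PASS: shields config matches expected"
  else
    echo "FAIL: shields config differs from expected (diff above)" >&2
    return 1
  fi
}
```

For nested formats such as YAML or JSON, line sorting is not a valid normalization and a structure-aware tool would be needed instead.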

---

Outside diff comments:
In `@test/e2e/validation_suites/suites.yaml`:
- Line 1: This YAML file is missing the required SPDX license header; add the
standard SPDX header comment at the top of the file (before the top-level key
"suites:") containing the copyright owner and the Apache-2.0 license identifier
(e.g., "SPDX-FileCopyrightText: <copyright holder>" and
"SPDX-License-Identifier: Apache-2.0") so the header precedes the existing
"suites:" entry.
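
A header requirement like this can be checked mechanically. The sketch below is one possible check, not the project's actual hook: `has_spdx_header` is an invented name, and it only inspects the leading run of `#` comment lines in the file.

```shell
#!/usr/bin/env bash
# Sketch only: succeeds when both SPDX tags appear in the comment lines that
# precede the first non-comment line of the file.
has_spdx_header() {
  local file="$1" header
  # Collect the leading comment lines and stop at the first other line.
  header="$(awk '/^#/ { print; next } { exit }' "${file}")"
  [[ "${header}" == *"SPDX-FileCopyrightText:"* &&
     "${header}" == *"SPDX-License-Identifier: Apache-2.0"* ]]
}
```

Usage would look like `has_spdx_header test/e2e/validation_suites/suites.yaml || echo "missing SPDX header"`.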

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d438b12c-d741-4656-a877-5690f9003a87

📥 Commits

Reviewing files that changed from the base of the PR and between ca045a9 and 8e7b6e7.

📒 Files selected for processing (11)

test/e2e/docs/parity-map.yaml
test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts
test/e2e/scenario-framework-tests/e2e-suite-runner.test.ts
test/e2e/validation_suites/lib/security_policy_credentials.sh
test/e2e/validation_suites/security/credentials/00-credentials-present.sh
test/e2e/validation_suites/security/credentials/01-no-plaintext-host-store.sh
test/e2e/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh
test/e2e/validation_suites/security/policy/00-telegram-preset-applied.sh
test/e2e/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh
test/e2e/validation_suites/security/shields/00-config-consistent.sh
test/e2e/validation_suites/suites.yaml

wscurran · 2026-05-20T16:02:44Z

✨ Related open issues:

#3815 test(e2e): migrate security policy and credential coverage

jyaunches added 20 commits May 20, 2026 07:34

Simplify security policy credential E2E spec

829610e

Add test specification for security policy credential E2E migration

70046da

Add validation plan for security policy credential E2E migration

ef8d57e

Apply spec review recommendation from section 1

9228f62

Apply spec review recommendation from section 5

b42e88c

feat: Implement Phase 1 - security primitives

c16ab8a

Mark Phase 1 as completed [c16ab8a]

1fd9983

feat: Implement Phase 2 - credential suites

a6e06b3

Mark Phase 2 as completed [a6e06b3]

9fa66a9

feat: Implement Phase 3 - policy shields suites

dbe5707

Mark Phase 3 as completed [dbe5707]

3ba3d39

feat: Implement Phase 4 - injection version suites

04d6c80

Mark Phase 4 as completed [04d6c80]

7bab895

feat: Implement Phase 5 - parity review gate

e23ea30

Mark Phase 5 as completed [e23ea30]

52e27d5

chore: Implement Phase 6 - final hygiene

9fcba5d

Mark Phase 6 as completed [9fcba5d]

1e4fa4a

test(e2e): validate security migration spec

e5190a5

chore: remove vd workflow artifacts

fcf612c

test(e2e): fix security helper hook compliance

8e7b6e7

jyaunches self-assigned this May 20, 2026

coderabbitai Bot reviewed May 20, 2026

wscurran added the E2E, enhancement: testing, and fix labels May 20, 2026

jyaunches added the v0.0.47 Release target label May 20, 2026

cv approved these changes May 20, 2026

This was referenced May 20, 2026

test(e2e): fix current nightly failures #3926

Merged

fix(e2e): use sandbox subcommands in scenario suites #3927

Open

merge: resolve main conflicts in e2e credential suites

c5fc179

cv enabled auto-merge (squash) May 21, 2026 02:45

cv disabled auto-merge May 21, 2026 02:46

test(e2e): enforce security credential suite assertions

e114b68

cv merged commit 18c7265 into main May 21, 2026
28 checks passed
