
test(e2e): migrate security policy credential suites #3905

Merged
cv merged 22 commits into main from issue-3815-migrate-security-policy-credential-e2e
May 21, 2026

Conversation

@jyaunches
Contributor

@jyaunches jyaunches commented May 20, 2026

Summary

Migrates the security policy and credential E2E checks into focused validation suites. Adds shared security assertion helpers, suite wiring, parity-map coverage, and framework tests for the new security credential, policy, shields, and injection suites.

Related Issue

Fixes #3815

Changes

  • Add test/e2e/validation_suites/lib/security_policy_credentials.sh for shared credential redaction, context, and policy assertion primitives.
  • Add focused security suite steps under test/e2e/validation_suites/security/{credentials,policy,shields,injection}/.
  • Register security-credentials, security-policy, security-shields, and security-injection in test/e2e/validation_suites/suites.yaml.
  • Update test/e2e/docs/parity-map.yaml to map migrated legacy security assertions to the new suites.
  • Extend E2E framework tests for stable assertion IDs and suite-runner behavior.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Verification run:

  • E2E_CONTEXT_DIR=<tmp> E2E_DRY_RUN=1 HOME=<tmp> test/e2e/runtime/run-suites.sh security-credentials security-policy security-shields security-injection passes with seeded context.
  • npm test -- test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts passes, 26 tests.
  • npm test -- test/e2e/scenario-framework-tests/e2e-suite-runner.test.ts test/e2e/scenario-framework-tests/e2e-parity-map.test.ts test/e2e/scenario-framework-tests/e2e-coverage-report.test.ts test/e2e/scenario-framework-tests/e2e-convention-lint.test.ts test/e2e/scenario-framework-tests/e2e-metadata-final-hygiene.test.ts passes, 32 tests.
  • npx prek run --all-files was attempted after hook-compliance fix but failed on unrelated repo/environment issues: missing nemoclaw/node_modules/json5 / plugin build artifact and multiple unrelated CLI timeout failures.
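
The dry-run invocation above can be reproduced against throwaway directories. The following is a minimal sketch, not the project's tooling: the function name `run_security_suites_dry_run` is hypothetical, and the empty `context.env` is only a placeholder because the keys the suites expect in a seeded context are not shown in this PR.

```shell
# Sketch: run the four security suites in dry-run mode with isolated HOME and context dirs.
# The runner path and suite names come from the PR description; everything else is assumed.
run_security_suites_dry_run() {
  runner="${1:-test/e2e/runtime/run-suites.sh}"
  tmp_root="$(mktemp -d)"
  mkdir -p "$tmp_root/context" "$tmp_root/home"

  # Placeholder seed: replace with the real context keys the suites require.
  : > "$tmp_root/context/context.env"

  status=0
  E2E_CONTEXT_DIR="$tmp_root/context" E2E_DRY_RUN=1 HOME="$tmp_root/home" \
    "$runner" security-credentials security-policy security-shields security-injection \
    || status=$?

  # Clean up the throwaway directories regardless of the runner's result.
  rm -rf "$tmp_root"
  return "$status"
}
```

Calling `run_security_suites_dry_run` with no argument uses the runner path quoted in the verification notes; passing a different path makes it easy to point the same wrapper at a stub.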

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

Summary by CodeRabbit

  • Tests
    • Added and extended e2e tests for credential handling, redaction, dry-run behavior, and credential-leak prevention.
    • Added checks ensuring no plaintext host credential store and that gateway credential listings redact values.
    • Introduced new security validation scenarios for injection, policy presets, shields, and openshell compatibility.
    • Refactored credential verification to use a shared validation library and updated suite configurations to the post-onboard credential assertions.


@jyaunches jyaunches self-assigned this May 20, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3c55ebef-bb2f-4e3c-a081-1583a4c7d9e7

📥 Commits

Reviewing files that changed from the base of the PR and between c5fc179 and e114b68.

📒 Files selected for processing (5)
  • test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts
  • test/e2e/validation_suites/lib/security_policy_credentials.sh
  • test/e2e/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh
  • test/e2e/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh
  • test/e2e/validation_suites/security/shields/00-config-consistent.sh

📝 Walkthrough


Adds a reusable Bash validation library, new e2e validation scripts (credentials, injection, policy, shields) that use it, updates suites.yaml to wire those steps, refactors credentials-present to call the library, adds Vitest tests, and updates parity-map assertion IDs for credential checks.

Changes

E2E Security Policy & Credential Coverage Migration

| Layer / File(s) | Summary |
|---|---|
| Core validation library: test/e2e/validation_suites/lib/security_policy_credentials.sh | New Bash library with idempotent sourcing and helpers: spc_assertion_id(), spc_require_context(), spc_context_get(), spc_redact_secret_text(), spc_log_provider_metadata(), spc_assert_credentials_expected(), spc_assert_no_plaintext_host_store(), spc_assert_policy_preset_present(), spc_assert_openshell_credential_rewrite_supported(), shields and injection helpers; includes dry-run handling and secret redaction. |
| Validation script implementations: test/e2e/validation_suites/security/credentials/00-credentials-present.sh, test/e2e/validation_suites/security/credentials/01-no-plaintext-host-store.sh, test/e2e/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh, test/e2e/validation_suites/security/policy/00-telegram-preset-applied.sh, test/e2e/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh, test/e2e/validation_suites/security/shields/00-config-consistent.sh | Scripts source the shared library and invoke its assertions: credentials-present delegates to spc_assert_credentials_expected; adds no-plaintext-host-store check; injection script validates Telegram payload handling without shell evaluation; policy and shields scripts emit assertion IDs and call library helpers. |
| Suite orchestration and refactoring: test/e2e/validation_suites/suites.yaml | Inlines previously shared anchors for security-credentials, security-shields, security-policy, and security-injection; adds no-plaintext-host-store step to the credentials suite and wires the new validation scripts into suites. |
| Tests and parity mapping: test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts, test/e2e/scenario-framework-tests/e2e-suite-runner.test.ts, test/e2e/docs/parity-map.yaml | Adds Vitest tests for helper success/failure, redaction, and openshell marker detection; adds a suite-runner test for security-credentials dry-run output asserting redaction and absence of plaintext host-store messages; updates parity-map.yaml replacing legacy credential assertions with post-onboard.credentials.gateway-list-redacts-values and post-onboard.credentials.no-plaintext-host-store and metadata. |
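
To make the "idempotent sourcing" and "secret redaction" ideas in the table concrete, here is a small sketch. It is not the contents of security_policy_credentials.sh, which this page does not show: the `demo_` function names, the `SPC_SKETCH_LOADED` guard variable, the `[assert]` output prefix, and the redaction pattern are all assumptions made for illustration.

```shell
# Sourcing guard: defining everything inside this block makes a second
# `.`/`source` of the file a no-op.
if [ -z "${SPC_SKETCH_LOADED:-}" ]; then
  SPC_SKETCH_LOADED=1

  # Emit a stable assertion ID so logs can be matched to parity-map entries.
  # The "[assert]" prefix is an assumed format.
  demo_assertion_id() {
    printf '[assert] %s\n' "$1"
  }

  # Redact values that follow common secret-looking keys (key=value or key: value).
  # Reads stdin, writes redacted text to stdout. The key list is illustrative only.
  demo_redact_secret_text() {
    sed -E 's/((api[_-]?key|API[_-]?KEY|token|TOKEN|password|PASSWORD|secret|SECRET)[[:space:]]*[=:][[:space:]]*)[^[:space:]]+/\1[REDACTED]/g'
  }
fi
```

A suite step would source the library once, emit its ID, and pipe any command output through the redaction filter before it reaches the log.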

Sequence Diagram

sequenceDiagram
  participant ValidationScript as validation script
  participant SharedLib as lib/security_policy_credentials.sh
  participant HostFS as Host FS (~/.nemoclaw/)
  participant Stdout as Test Output

  ValidationScript->>SharedLib: source library (guard check)
  ValidationScript->>Stdout: emit assertion ID
  ValidationScript->>SharedLib: call assertion (e.g., spc_assert_no_plaintext_host_store)
  SharedLib->>HostFS: check ~/.nemoclaw/credentials.json (exists/read)
  SharedLib->>Stdout: emit redacted assertion result

  participant InjectionScript as injection script
  participant Context as $E2E_CONTEXT_DIR/context.env

  InjectionScript->>SharedLib: source library
  InjectionScript->>SharedLib: spc_context_get E2E_TELEGRAM_PAYLOAD_FIXTURE
  SharedLib->>Context: read context variable
  InjectionScript->>Stdout: log payload size and dry-run notice
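
The first flow in the diagram (library checks the host filesystem, then reports a result) could look roughly like the sketch below. The path ~/.nemoclaw/credentials.json is taken from the diagram; the function name `demo_assert_no_plaintext_host_store` and the PASS/FAIL message wording are hypothetical and may differ from the real helper.

```shell
# Fail if a plaintext credential file exists under the current $HOME.
# Only the path is reported; the file contents are never printed.
demo_assert_no_plaintext_host_store() {
  store="${HOME}/.nemoclaw/credentials.json"
  if [ -e "$store" ]; then
    echo "FAIL: plaintext credential store found at $store" >&2
    return 1
  fi
  echo "PASS: no plaintext host credential store"
}
```

Because the check reads `$HOME`, running it with `HOME` pointed at a temporary directory (as the dry-run verification does) keeps it from touching a developer's real home directory.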

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#3800: Related edits to parity bookkeeping and CI/lint tooling around parity inventory and coverage.

Suggested labels

Integration: OpenClaw

Suggested reviewers

  • cv

Poem

🐰 I hopped through scripts at break of dawn,

Redacting secrets till they're gone.
Suites now call the library neat,
Assertions stable, tests complete.
A carrot-coded CI treat.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly summarizes the main change: migrating security policy credential E2E suites into the layered scenario framework. |
| Linked Issues check | ✅ Passed | The PR fulfills all primary objectives from #3815: domain primitives added, focused suite steps created, suites registered, parity-map updated with stable IDs, and framework tests extended. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the migration goal. Test files, shared library, suite configurations, and framework tests are all scoped to the credential/policy/shields/injection E2E migration. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue-3815-migrate-security-policy-credential-e2e

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Contributor

github-actions Bot commented May 20, 2026

PR Review Advisor

Recommendation: blocked
Confidence: high
Analyzed HEAD: e114b68d95c2c2f154ee387f92d1428caef45561
Findings: 1 blocker(s), 4 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Review used the provided trusted metadata and diff; no scripts, package-manager commands, or tests were executed. The parity-map diff is truncated, so full classification of every legacy assertion could not be independently verified. Issue #3815 has no comments in the provided context; acceptance coverage is based on the issue body only. E2E Advisor content was found, but the latest current-head E2E recommendation check is still in progress, so current-head E2E advisor status is ambiguous. CI, mergeability, and review-thread state are point-in-time from the provided GitHub context and may change after this review.


Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: e114b68d95c2c2f154ee387f92d1428caef45561
Recommendation: blocked
Confidence: high

Blocked by current hard gates: CI is still pending, GitHub mergeStateStatus is BLOCKED, and 2 review threads remain unresolved; the current patch improves several security suite assertions but still needs human review of migrated E2E semantics.

Gate status

  • CI: pending — Trusted GitHub rollup for e114b68 shows 12 pending/in-progress/queued contexts, including E2E recommendation, wsl-e2e, macos-e2e, PR review advisor, CodeQL, unit-vitest-linux, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, checks, and CodeRabbit.
  • Mergeability: fail — GitHub GraphQL reports mergeStateStatus=BLOCKED for PR #3905 at head e114b68.
  • Review threads: fail — Trusted GraphQL reports 2 unresolved review thread(s): injection check enforcement and shields consistency check enforcement remain unresolved despite follow-up comments.
  • Risky code tested: warning — Risky areas detected: credentials/inference/network. The PR changes security E2E assertions and adds targeted framework tests, but current-head CI has not completed and semantic live coverage still depends on maintainer review.

🔴 Blockers

  • Current head is not merge-ready under hard gates: The latest SHA is blocked by pending CI, GitHub mergeStateStatus=BLOCKED, and unresolved review threads. These are hard gates independent of code quality.
    • Recommendation: Wait for required checks on e114b68 to complete successfully, confirm branch protection is green, and resolve or consciously disposition the remaining review threads before treating the PR as merge-ready.
    • Evidence: Trusted gateStatus: ci=pending with 12 pending contexts; mergeability=fail with mergeStateStatus=BLOCKED; reviewThreads=fail with 2 unresolved review thread(s).

🟡 Warnings

  • Telegram injection assertion does not exercise the real Telegram or assistant path (test/e2e/validation_suites/lib/security_policy_credentials.sh:220): The new helper now performs a non-dry-run assertion, but it submits a command-substitution payload to a sandbox shell that echoes stdin and checks for a marker file. That proves this helper path treats stdin literally, but it does not submit a Telegram message through the messaging provider, bridge, assistant, or OpenClaw path that the stable ID post-onboard.security-injection.telegram-message-not-shell-executed appears to represent.
    • Recommendation: Either rename/classify the assertion to match the lower-level sandbox literal-payload check, or add a real messaging/Telegram-path test that submits the fixture through the standard provider bridge and verifies no shell side effects or unsafe response occur. Keep dry-run behavior separate from non-dry-run enforcement.
    • Evidence: spc_assert_telegram_payload_not_shell_executed constructs payload='$(touch ... && echo INJECTED)', runs openshell sandbox exec ... sh -c 'MSG=$(cat); printf "%s\n" "$MSG"', and checks a marker file. The changed security/injection/00-telegram-message-not-shell-executed.sh delegates directly to this helper. GraphQL still shows the related injection review thread unresolved.
  • Credential redaction assertion validates redacted output but not raw-output non-disclosure (test/e2e/validation_suites/lib/security_policy_credentials.sh:63): spc_assert_credentials_expected captures nemoclaw credentials list output after piping it through spc_redact_secret_text. This is good for log safety, but it means the assertion cannot detect whether the CLI emitted a raw API key or token before redaction. The E2E Advisor also recommended a protected raw-output check for this case.
    • Recommendation: For non-dry-run security validation, capture raw CLI output to a protected temp file, assert that it contains expected provider metadata and no secret-looking values, then print only redacted output to logs. Ensure temp files are permission-restricted and removed or collected only as safe artifacts.
    • Evidence: Line 63 captures listed="$(nemoclaw credentials list 2>&1 | spc_redact_secret_text)"; E2E Advisor suggested a credentials-security-validation check that verifies raw command output does not contain API key/token/secret patterns before redaction.
  • OpenShell credential-rewrite support check relies on binary string markers (test/e2e/validation_suites/lib/security_policy_credentials.sh:113): The OpenShell capability assertion now enforces something in non-dry-run mode, but it checks for literal strings in the local openshell binary. That is brittle and may not correspond to the actual gateway/OpenShell version, negotiated capability, or runtime support used by the sandbox.
    • Recommendation: Prefer a canonical capability/version query from OpenShell or NemoClaw gateway metadata. If binary string inspection is only a fallback, document that limitation and add a test for the canonical path.
    • Evidence: spc_assert_openshell_credential_rewrite_supported runs strings "${openshell_bin}" and searches for request-body-credential-rewrite and websocket-credential-rewrite; issue #3815 listed test-openshell-version-pin.sh coverage to absorb.
  • Shared E2E metadata has active overlap and drift risk (test/e2e/docs/parity-map.yaml:1092): The patch applies to existing files, but it modifies shared E2E metadata and framework test files with multiple active overlapping PRs. Merge order could change parity classifications or suite wiring.
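
The lower-level check described in the first warning (feed a command-substitution payload to a shell over stdin, echo it back, and look for a marker file) can be sketched locally as follows. This is an illustration of that pattern only: it runs a plain local `sh` instead of `openshell sandbox exec`, the function name is hypothetical, and, as the warning itself notes, it says nothing about the real Telegram or assistant path.

```shell
# Send a command-substitution payload through stdin and confirm it is treated as text.
demo_assert_payload_not_shell_executed() {
  workdir="$(mktemp -d)"
  marker="$workdir/injected"

  # Literal text "$(touch <marker> && echo INJECTED)"; the backslash keeps it unevaluated here.
  payload="\$(touch $marker && echo INJECTED)"

  # The receiving shell reads stdin into a variable and prints it; it never evals it.
  echoed="$(printf '%s' "$payload" | sh -c 'MSG=$(cat); printf "%s\n" "$MSG"')"

  result=0
  if [ -e "$marker" ]; then
    echo "FAIL: payload was evaluated by a shell" >&2
    result=1
  elif [ "$echoed" != "$payload" ]; then
    echo "FAIL: payload was altered in transit" >&2
    result=1
  else
    echo "PASS: payload treated as literal text"
  fi

  rm -rf "$workdir"
  return "$result"
}
```

The marker file is the important signal: if any stage expanded the payload, `touch` would have created it, whereas comparing the echoed text alone would miss a side effect that leaves the output unchanged.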

🔵 Suggestions

  • None.

Acceptance coverage

  • unknown — Parent epic #3588 (Implement layered E2E scenario model): The PR links Fixes #3815 (test(e2e): migrate security policy and credential coverage). No direct evidence was provided for closure or status of parent epic #3588.
  • partial — Migrate the security-policy-credentials E2E coverage area into the layered scenario framework without porting legacy scripts line-for-line. Add the missing primitive layer first, then move assertions into scenario plans/suites with stable IDs.: Adds test/e2e/validation_suites/lib/security_policy_credentials.sh, security suite scripts, stable IDs, and suite entries. Coverage is partially satisfied, but live semantics for Telegram injection and raw credential leak detection still need maintainer review.
  • unknown — test-network-policy.sh: Listed as legacy/current coverage to absorb. The provided diff excerpt does not show explicit migrated, deferred, or retired mapping for this script.
  • partial — test-shields-config.sh: Adds security-shields and post-onboard.security-shields.config-consistent; helper now queries nemoclaw <sandbox> shields status and checks config owner/mode, but the related review thread is still unresolved and current-head CI is pending.
  • partial — test-credential-migration.sh: Parity-map remaps credential list and no-plaintext host-store assertions to post-onboard.credentials.*; helper asserts gateway header and no plaintext host credentials.json. Some migration assertions remain deferred/retired in the visible diff.
  • partial — test-credential-sanitization.sh: Adds redaction helper and tests that synthetic secrets are not logged. However raw nemoclaw credentials list output is redacted before inspection, so raw CLI leak detection is not fully proven.
  • partial — test-telegram-injection.sh: Adds security-injection/00-telegram-message-not-shell-executed.sh and stable ID. The current helper checks a sandbox stdin shell path, not a real Telegram/messaging/assistant path.
  • unknown — test-gateway-drift-preflight.sh: No explicit migrated, deferred, or retired evidence for this script is visible in the provided diff excerpt.
  • unknown — test-gateway-health-honest.sh: No explicit migrated, deferred, or retired evidence for this script is visible in the provided diff excerpt.
  • partial — test-openshell-version-pin.sh: Adds post-onboard.gateway.openshell-version-supports-credential-rewrite and non-dry-run marker checks, but the implementation uses binary string inspection rather than canonical version/capability metadata.
  • met — Add or extend the domain primitive library: test/e2e/validation_suites/lib/security_policy_credentials.sh.: New file test/e2e/validation_suites/lib/security_policy_credentials.sh added with context, redaction, credential, policy, OpenShell capability, shields, and injection helper functions.
  • met — Helpers must consume $E2E_CONTEXT_DIR/context.env; suites must not reinstall, onboard, or rediscover setup state.: Helpers wrap e2e_context_require/e2e_context_get, tests seed E2E_CONTEXT_DIR/context.env, and suite scripts source the helper library rather than reinstalling/onboarding.
  • met — Add/extend suite family entries in test/e2e/validation_suites/suites.yaml.: suites.yaml adds or updates security-credentials, security-policy, security-shields, and security-injection entries.
  • met — Add onboarding profiles/test plans/onboarding assertions only when the behavior belongs before expected-state validation.: No onboarding profiles or plans are changed; this PR confines changes to validation suites, helper tests, and parity metadata.
  • partial — Emit stable assertion IDs using <layer>.<domain>.<behavior>.: Scripts emit stable IDs such as post-onboard.credentials.gateway-list-redacts-values, post-onboard.security-policy.telegram-preset-applied, and post-onboard.security-injection.telegram-message-not-shell-executed. post-onboard.gateway.openshell-version-supports-credential-rewrite should be confirmed against the intended domain taxonomy.
  • partial — Update test/e2e/docs/parity-map.yaml metadata with layer, gap_domain, owner, and runner/secret requirements where applicable.: Visible changes add layer, gap_domain, and owner to remapped credential assertions. The truncated excerpt does not prove all affected legacy assertions have complete metadata.
  • unknown — Preserve compatibility with existing run-scenario.sh <id> --plan-only behavior.: The diff does not modify run-scenario.sh. Provided tests cover suite runner dry-run behavior but do not show a plan-only test specific to the new security suites.
  • met — Domain primitive helpers exist and are used by migrated suite steps.: The new suite scripts under security/credentials, security/policy, security/shields, and security/injection source security_policy_credentials.sh and delegate to its helper functions.
  • partial — At least the highest-value assertions from the listed legacy coverage are mapped to stable scenario assertion IDs.: High-value credentials, policy, shields, OpenShell capability, and injection IDs are present. Some mappings are semantically partial, especially Telegram injection and raw credential leak detection.
  • partial — Remaining legacy assertions are explicitly classified as deferred or retired with layer/domain metadata.: The visible parity-map excerpt includes many deferred and retired entries, but the diff is truncated and not all visible entries include layer/domain metadata.
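
The `<layer>.<domain>.<behavior>` convention for stable assertion IDs mentioned in the list above lends itself to a simple format check. The sketch below is an assumption-heavy illustration: the function name is hypothetical, and the rule that each segment is lowercase kebab-case is inferred from the IDs quoted in this review (for example post-onboard.credentials.gateway-list-redacts-values), not taken from a documented specification.

```shell
# Return 0 if the argument looks like <layer>.<domain>.<behavior>,
# with each segment made of lowercase letters/digits joined by single hyphens.
demo_is_stable_assertion_id() {
  seg='[a-z0-9]+(-[a-z0-9]+)*'
  printf '%s\n' "$1" | grep -Eq "^${seg}\.${seg}\.${seg}\$"
}
```

A lint step could run this over every ID a suite emits. Note that it checks shape only; whether an ID such as post-onboard.gateway.openshell-version-supports-credential-rewrite sits in the intended domain is still a taxonomy question for maintainers.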

Security review

  • pass — Secrets and Credentials: No committed real secrets were identified in the diff; new test strings are synthetic, and helper output redacts common token/API-key/password forms. Credential security coverage still has a testing-depth warning captured separately.
  • warning — Input Validation and Data Sanitization: The PR adds redaction and an injection payload assertion, but the Telegram injection assertion currently exercises a sandbox stdin echo path rather than the true Telegram/messaging path, limiting confidence against command-injection regressions.
  • pass — Authentication and Authorization: Not applicable — the change modifies E2E validation scripts, tests, and metadata, not runtime authn/authz endpoints or token validation.
  • pass — Dependencies and Third-Party Libraries: No new dependencies, package-manager files, registries, or version pins are introduced.
  • warning — Error Handling and Logging: Redaction before logging is positive, but nemoclaw credentials list output is inspected after redaction, so the new assertion cannot prove the raw CLI did not leak a secret before redaction.
  • pass — Cryptography and Data Protection: Not applicable — no cryptographic operations, key generation, hashing, encryption, TLS, or data-protection primitives are changed.
  • warning — Configuration and Security Headers: No HTTP headers or container configs are changed. The new shields/config assertion is useful but relies on status-string parsing and owner/mode heuristics, and the related review thread remains unresolved.
  • fail — Security Testing: The PR is specifically migrating high-risk security E2E coverage, but current-head CI is pending, two review threads remain unresolved, and the Telegram injection assertion does not exercise the real Telegram/messaging path represented by its stable ID.
  • warning — Holistic Security Posture: Production runtime code is not modified, which limits direct exploit risk. However, migrated parity metadata can create a false sense of complete security coverage if semantically partial tests are recorded as covering legacy security assertions.
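
The raw-output concern raised under Error Handling and Logging suggests inspecting command output before any redaction is applied. One possible shape for that is sketched below. Everything here is illustrative: the function name, the `DEMO_SECRET_PATTERN` variable, and the two key prefixes in the pattern are assumptions, and a real check would need a pattern list agreed with maintainers.

```shell
# Illustrative pattern for secret-looking values; not an exhaustive or official list.
DEMO_SECRET_PATTERN='(sk-|nvapi-)[A-Za-z0-9_-]{8,}'

# Run a command, inspect its RAW output in a private temp file, and only then print.
# Usage: demo_assert_no_raw_secret_output <command> [args...]
demo_assert_no_raw_secret_output() {
  raw_file="$(mktemp)"   # mktemp creates the file with owner-only permissions

  # The command's own exit status is ignored; this check is about its output.
  "$@" > "$raw_file" 2>&1 || true

  status=0
  if grep -Eq "$DEMO_SECRET_PATTERN" "$raw_file"; then
    echo "FAIL: raw output contains a secret-looking value" >&2
    # Show redacted output so the failure can be debugged without leaking the value.
    sed -E "s/$DEMO_SECRET_PATTERN/[REDACTED]/g" "$raw_file"
    status=1
  else
    # No pattern matched, so the raw text is safe to log as-is.
    cat "$raw_file"
  fi

  rm -f "$raw_file"
  return "$status"
}
```

The ordering is the point of the sketch: the assertion runs on unredacted output, and redaction is applied only to what gets logged, which is the reverse of piping through a redaction filter before inspecting.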

Test / E2E status

  • Test depth: e2e_required — Although this is test/metadata-only code, it changes security E2E coverage for credentials, policy, shields, and injection. Unit/dry-run tests and fakes cover helper behavior, but they cannot fully prove that the migrated live assertions catch regressions in a real OpenShell/NemoClaw sandbox. Current-head CI is also still pending.
  • E2E Advisor: ambiguous
  • Missing for analyzed SHA: E2E recommendation check for e114b68d95c2c2f154ee387f92d1428caef45561 is still IN_PROGRESS

✅ What looks good

  • The patch is scoped to E2E validation suites, scenario framework tests, and parity metadata; no production sandbox/runtime code is modified.
  • The shared security helper now centralizes assertion IDs, context access, redaction, credential, policy, OpenShell capability, shields, and injection primitives.
  • Several earlier placeholder/no-op assertions were improved in the current head with non-dry-run checks and targeted framework tests.
  • New shell scripts include SPDX headers and use set -euo pipefail.
  • Suite wiring separates security-credentials, security-policy, security-shields, and security-injection into focused validation suites.

Review completeness

  • Review used the provided trusted metadata and diff; no scripts, package-manager commands, or tests were executed.
  • The parity-map diff is truncated, so full classification of every legacy assertion could not be independently verified.
  • Issue #3815 has no comments in the provided context; acceptance coverage is based on the issue body only.
  • E2E Advisor content was found, but the latest current-head E2E recommendation check is still in progress, so current-head E2E advisor status is ambiguous.
  • CI, mergeability, and review-thread state are point-in-time from the provided GitHub context and may change after this review.
  • Human maintainer review required: yes

@github-actions
Contributor

github-actions Bot commented May 20, 2026

E2E Advisor Recommendation

Required E2E: None
Optional E2E: scenario-runner-security-suites-ubuntu-repo-cloud-openclaw, scenario-runner-messaging-telegram-ubuntu-repo-cloud-openclaw

Dispatch hint: scenario=ubuntu-repo-cloud-openclaw suite_filter=security-credentials,security-policy,security-shields,security-injection,messaging-telegram


Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None. No merge-blocking E2E is required because this PR touches E2E tests, validation-suite scripts, and E2E parity metadata only; it does not modify installer, onboarding, sandbox lifecycle, credentials implementation, policy enforcement, inference routing, deployment, or assistant runtime/user-flow code.

Optional E2E

  • scenario-runner-security-suites-ubuntu-repo-cloud-openclaw (medium): Optional confidence check for the changed validation suite wiring and new security scripts in a real scenario context. Runs the existing Scenario Runner against the suites directly touched by this PR.
  • scenario-runner-messaging-telegram-ubuntu-repo-cloud-openclaw (medium): Optional adjacent check because the new security-injection suite overlaps Telegram message injection behavior already covered by the existing messaging-telegram suite.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-scenarios.yaml
  • jobs input: scenario=ubuntu-repo-cloud-openclaw suite_filter=security-credentials,security-policy,security-shields,security-injection,messaging-telegram

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/e2e/validation_suites/suites.yaml (1)

1-1: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add SPDX license header to this YAML file.

This file is missing the required SPDX copyright/license header.

Suggested patch
+ # SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ # SPDX-License-Identifier: Apache-2.0
+
  suites:
As per coding guidelines, `Every source file must include an SPDX license header for copyright and Apache-2.0 license`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/validation_suites/suites.yaml` at line 1, This YAML file is missing
the required SPDX license header; add the standard SPDX header comment at the
top of the file (before the top-level key "suites:") containing the copyright
owner and the Apache-2.0 license identifier (e.g., "SPDX-FileCopyrightText:
<copyright holder>" and "SPDX-License-Identifier: Apache-2.0") so the header
precedes the existing "suites:" entry.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/validation_suites/lib/security_policy_credentials.sh`:
- Around line 42-63: spc_assert_credentials_expected currently only runs
"nemoclaw credentials list" and succeeds on exit code rather than asserting that
at least one gateway credential is present; modify
spc_assert_credentials_expected to capture the command output (after piping
through spc_redact_secret_text), and if the expected state is "present" verify
the output contains at least one credential entry (e.g., non-empty output or
matches the gateway credential entry pattern) and return non-zero with a clear
error message if none found; keep the dry-run path and existing context checks
(spc_require_context, spc_log_provider_metadata) unchanged and use the existing
assertion id "post-onboard.credentials.gateway-list-redacts-values" to maintain
semantics.
- Around line 76-84: spc_assert_policy_preset_present is currently a no-op in
non-dry runs; update it so after calling spc_assertion_id and
spc_require_context it actually verifies the preset and fails the test if
missing: when e2e_env_is_dry_run is true keep the current echo, otherwise query
the system for the applied policy preset (using the same CLI/API the suite uses
for other assertions), compare the result to the expected "${preset}", and if
they differ call the test failure helper (or exit non‑zero) so the suite fails;
keep function name spc_assert_policy_preset_present and existing calls to
spc_assertion_id and e2e_env_is_dry_run.

In
`@test/e2e/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh`:
- Around lines 8-12: The test currently only logs the payload and exits. Update
the script so that, when e2e_env_is_dry_run returns false, it submits the
payload (use the existing payload variable) to the system under test (e.g. via
your standard test helper or an HTTP POST) and then asserts the expected safe
behavior instead of just printing its length. Call the appropriate assertion
helper (e.g. spc_assert_* or a new assertion) to fail if the response or a side
effect shows shell-evaluated output or unexpected command execution. Keep
spc_assertion_id and spc_require_context intact, and ensure failures produce a
non-zero exit so regressions are caught.
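
One way to make this check enforcing is a marker-file probe: the payload embeds a command substitution that would create a file if any layer shell-evaluated the message, and the assertion fails when that file appears. The sketch below assumes this approach. The helper names are hypothetical, and the submit command is passed in by the caller because the review does not say how the suite delivers a Telegram message.

```shell
# Hypothetical helper: build a payload that would create the file "$1"
# if the message text were ever evaluated by a shell.
spc_injection_payload() {
  printf 'hello $(touch %s)' "$1"
}

# Hypothetical helper: run the caller's submit command ("$@" after the marker
# path) with the payload as its last argument, then fail if the marker exists.
spc_assert_payload_not_executed() {
  local marker="$1"
  shift
  local payload
  payload="$(spc_injection_payload "$marker")"
  rm -f "$marker"
  # Delivery failures are ignored here; this sketch only checks for execution.
  "$@" "$payload" >/dev/null 2>&1 || true
  if [ -e "$marker" ]; then
    echo "FAIL: payload was shell-executed (marker ${marker} exists)" >&2
    return 1
  fi
  echo "PASS: payload treated as data"
}
```

A real step would call this only when e2e_env_is_dry_run is false and would pass whatever command the suite uses to send the message.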

In
`@test/e2e/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh`:
- Around lines 8-10: The script currently only echoes a message when
e2e_env_is_dry_run is true and never verifies OpenShell's capability in real
runs. Update the block after spc_require_context to perform an actual capability
check when e2e_env_is_dry_run is false: call the appropriate gateway-capability
lookup/validation routine (e.g., the project helper that queries gateway
metadata, or a function you add such as verify_openshell_capability_support) to
assert that the OpenShell gateway advertises "credential-rewrite" support, and
exit non-zero (failing the test) if the capability or required version is
absent. Keep the dry-run echo for e2e_env_is_dry_run, but replace the no-op path
with the real check described above.
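
If the check ends up being a version floor rather than a capability lookup, it could look like the sketch below. Both helper names are hypothetical, the minimum version is a caller-supplied placeholder (the review does not state one), and the comparison relies on `sort -V`, which assumes GNU coreutils or another sort that supports version ordering.

```shell
# Hypothetical helper: succeed if version "$1" is at least version "$2".
# Both versions should use the same number of dotted components.
spc_version_at_least() {
  local actual="$1" minimum="$2" lowest
  lowest="$(printf '%s\n%s\n' "$actual" "$minimum" | sort -V | head -n 1)"
  [ "$lowest" = "$minimum" ]
}

# Hypothetical assertion: fail when the detected OpenShell version is missing
# or older than the minimum that supports credential rewrite.
spc_assert_openshell_version() {
  local actual="$1" minimum="$2"
  if [ -z "$actual" ]; then
    echo "FAIL: could not determine OpenShell version" >&2
    return 1
  fi
  if ! spc_version_at_least "$actual" "$minimum"; then
    echo "FAIL: OpenShell ${actual} is older than required ${minimum}" >&2
    return 1
  fi
  echo "PASS: OpenShell ${actual} meets minimum ${minimum}"
}
```

The caller would obtain the actual version from whatever command or gateway metadata the suite already uses, and keep the dry-run echo in front of this check.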

In `@test/e2e/validation_suites/security/shields/00-config-consistent.sh`:
- Around lines 8-10: The test is non-enforcing: it only prints a dry-run message
and never compares shield configuration. Update the script to perform an actual
consistency check after spc_assertion_id and spc_require_context. Use the
runtime helper e2e_env_is_dry_run to skip real verification only in dry-run
mode; otherwise load the expected shields config (e.g. from the test fixture or
canonical source), fetch the current deployed shields config (via the
appropriate CLI/helper used elsewhere in the suite), perform a deterministic
comparison, and fail the script (non-zero exit and/or the suite's assert
helper) when differences are detected so regressions are caught.
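
A deterministic comparison can be as simple as diffing two files once both configs are on disk. The sketch below assumes that shape. The helper name is hypothetical, and how the expected and deployed configs are obtained is not specified in this review, so both arrive as file paths.

```shell
# Hypothetical helper: compare an expected config file with the deployed one.
# Returns 0 when identical, 1 on drift, 2 when either file is unreadable.
spc_assert_files_consistent() {
  local expected="$1" actual="$2"
  if [ ! -r "$expected" ] || [ ! -r "$actual" ]; then
    echo "FAIL: cannot read '${expected}' or '${actual}'" >&2
    return 2
  fi
  # Send the unified diff to stderr so the drift is visible in the test log.
  if ! diff -u "$expected" "$actual" >&2; then
    echo "FAIL: shields config drift detected" >&2
    return 1
  fi
  echo "PASS: shields config is consistent"
}
```

If the config format allows reordered keys, both files would need normalizing before the diff; a byte-level comparison like this one treats any reordering as drift.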

---

Outside diff comments:
In `@test/e2e/validation_suites/suites.yaml`:
- Line 1: This YAML file is missing the required SPDX license header; add the
standard SPDX header comment at the top of the file (before the top-level key
"suites:") containing the copyright owner and the Apache-2.0 license identifier
(e.g., "SPDX-FileCopyrightText: <copyright holder>" and
"SPDX-License-Identifier: Apache-2.0") so the header precedes the existing
"suites:" entry.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d438b12c-d741-4656-a877-5690f9003a87

📥 Commits

Reviewing files that changed from the base of the PR and between ca045a9 and 8e7b6e7.

📒 Files selected for processing (11)
  • test/e2e/docs/parity-map.yaml
  • test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts
  • test/e2e/scenario-framework-tests/e2e-suite-runner.test.ts
  • test/e2e/validation_suites/lib/security_policy_credentials.sh
  • test/e2e/validation_suites/security/credentials/00-credentials-present.sh
  • test/e2e/validation_suites/security/credentials/01-no-plaintext-host-store.sh
  • test/e2e/validation_suites/security/injection/00-telegram-message-not-shell-executed.sh
  • test/e2e/validation_suites/security/policy/00-telegram-preset-applied.sh
  • test/e2e/validation_suites/security/policy/01-openshell-version-supports-credential-rewrite.sh
  • test/e2e/validation_suites/security/shields/00-config-consistent.sh
  • test/e2e/validation_suites/suites.yaml

@wscurran wscurran added the E2E, enhancement: testing, and fix labels May 20, 2026

@jyaunches jyaunches added the v0.0.47 Release target label May 20, 2026
@cv cv enabled auto-merge (squash) May 21, 2026 02:45
@cv cv disabled auto-merge May 21, 2026 02:46
@cv cv merged commit 18c7265 into main May 21, 2026
28 checks passed

Labels

E2E, enhancement: testing, fix, v0.0.47


Development

Successfully merging this pull request may close these issues.

test(e2e): migrate security policy and credential coverage

3 participants