Skip to content

fix(policy): scope Hermes messaging policies#3984

Merged
ericksoa merged 2 commits into
mainfrom
u/sdang/fix-hermes-discord-signed
May 21, 2026
Merged

fix(policy): scope Hermes messaging policies#3984
ericksoa merged 2 commits into
mainfrom
u/sdang/fix-hermes-discord-signed

Conversation

@sandl99
Copy link
Copy Markdown
Contributor

@sandl99 sandl99 commented May 21, 2026

Summary

This PR scopes Hermes messaging policy so selected channels are applied without pre-enabling every Hermes messaging provider. It also makes dynamic preset application use Hermes-specific policy content, preventing Discord from falling back to generic Node-oriented allowlists on Hermes sandboxes.
Screenshot 2026-05-21 at 16 27 44

Related Issue

Fixes #3981

Changes

  • Filter inactive Hermes messaging policy entries from the create-time sandbox policy.
  • Resolve built-in policy presets against the sandbox agent so Hermes can use Hermes-specific messaging entries.
  • Record selected channel policies already present in the prepared base policy as applied presets.
  • Clarify Hermes messaging entries as policy templates and add regression coverage for Discord-only Hermes policy behavior.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Additional checks run:

  • npm run build:cli
  • npm run typecheck:cli
  • npm run source-shape:check
  • npx vitest run src/lib/onboard/initial-policy.test.ts test/policies.test.ts
  • Git pre-commit hooks passed during git commit
  • Git pre-push hooks passed during git push (including TypeScript CLI, CLI tests, and source-shape budget)

Signed-off-by: San Dang sdang@nvidia.com

Summary by CodeRabbit

  • New Features

    • Support for agent-specific policy variants so presets can resolve to agent-provided policy variants.
    • Messaging channels are filtered so sandboxes include only active/selected channels.
  • Refactor

    • Preset application/loading is sandbox-aware and deduplicates applied preset names for clearer results.
  • Documentation

    • Clarified messaging policy header to note channel-template behavior and sandbox filtering.
  • Tests

    • Added tests covering agent-specific presets and messaging-channel filtering.

Review Change Stack

Signed-off-by: San Dang <sdang@nvidia.com>
@sandl99 sandl99 added bug Something isn't working OpenShell Support for OpenShell, a safe, private runtime for autonomous AI agents Sandbox Use this label to identify issues related to the NemoClaw isolated environment based on OpenShell. Integration: Hermes labels May 21, 2026
@sandl99 sandl99 self-assigned this May 21, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

📝 Walkthrough

Walkthrough

Adds Hermes messaging-channel filtering during initial sandbox policy preparation (producing a temporary filtered policy when creating Hermes sandboxes) and introduces sandbox-aware preset loading so preset application/removal uses agent-provided preset variants when available.

Changes

Hermes messaging policy filtering and agent-aware presets

Layer / File(s) Summary
Hermes policy filtering helpers and detection
src/lib/onboard/initial-policy.ts
Adds node:path import, HERMES_MESSAGING_POLICY_KEYS, YAML/object-shape helpers, Hermes policy-path detection, and filtering logic that writes a temporary filtered policy excluding inactive messaging channels.
Initial policy refactor, dedupe, and cleanup handling
src/lib/onboard/initial-policy.ts
Refactors prepareInitialSandboxCreatePolicy to accept agentName?, introduces buildCleanup() (nullable), reloads base policy after filtering, and deduplicates appliedPresets across existing channel and create-time presets.
Sandbox-aware preset loading and selection
src/lib/policy/index.ts
Adds loadPresetForSandbox() to parse a preset's network_policies, compute preset keys, load policyAdditionsPath from the sandbox agent when present, select matching agent-provided policies (full-keyset, direct name, alias, or value.name), synthesize a preset YAML with selected network_policies, or fall back to the built-in preset.
Use sandbox-aware loader in preset operations
src/lib/policy/index.ts
Updates removePreset, applyPreset, applyPresets, and getGatewayPresets to call loadPresetForSandbox(sandboxName, presetName) so preset operations reflect agent-selected variants.
Tests and Hermes policy additions comment
agents/hermes/policy-additions.yaml, src/lib/onboard/initial-policy.test.ts, test/policies.test.ts
Replaces the messaging comment in Hermes policy-additions.yaml to document channel templates; adds unit tests verifying Hermes filtering (temporary filtered policy creation and cleanup, appliedPresets composition) and an applyPresets test asserting Hermes-specific Discord Python binaries and scoped PATCH rules.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#2415: Related changes enforcing opt-in messaging egress and adjusting base policies / regression checks.
  • NVIDIA/NemoClaw#3672: Touches channels/preset removal logic which will interact with the sandbox-aware preset loading in this PR.
  • NVIDIA/NemoClaw#3512: Adds Hermes WeChat preset content that this PR's Hermes filtering could include or exclude during sandbox creation.

Suggested labels

fix, enhancement: policy

Suggested reviewers

  • ericksoa
  • cv

Poem

A rabbit trims the policies with care,
Hermes channels kept only where active are,
Discord runs Python, scoped and neat,
Telegram and Slack now quietly retreat,
Hooray — the sandbox hops light and spare! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 31.25% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(policy): scope Hermes messaging policies' is concise, clear, and directly summarizes the main change of restricting Hermes messaging policy entries to only selected channels.
Linked Issues check ✅ Passed The PR comprehensively addresses all coding requirements from issue #3981: filters inactive Hermes messaging policies, resolves agent-aware preset variants, prevents pre-enabling inactive channels, ensures Python binary allowlists for Hermes Discord, and includes regression test coverage.
Out of Scope Changes check ✅ Passed All changes are directly related to scoping Hermes messaging policies: YAML header clarification, Hermes-specific policy filtering logic, agent-aware preset resolution, preset application updates, and regression tests for Discord-only behavior.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch u/sdang/fix-hermes-discord-signed

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

E2E Advisor Recommendation

Required E2E: hermes-discord-e2e, hermes-slack-e2e, network-policy-e2e
Optional E2E: channels-stop-start-e2e, messaging-providers-e2e, hermes-e2e

Dispatch hint: hermes-discord-e2e,hermes-slack-e2e,network-policy-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • hermes-discord-e2e (~60 min; requires NVIDIA_API_KEY and Docker/OpenShell sandbox): Validates the main changed user flow: Hermes onboarding with Discord enabled, OpenShell placeholder/token isolation, native Discord gateway path, and policy behavior for the Hermes Discord sandbox.
  • hermes-slack-e2e (~60 min; requires NVIDIA_API_KEY and Docker/OpenShell sandbox): Validates agent-specific Hermes Slack policy content and Python/OpenShell credential rewrite path, which are directly affected by loading agent policy additions instead of only built-in presets.
  • network-policy-e2e (~45 min; requires NVIDIA_API_KEY and Docker/OpenShell sandbox): Covers live policy-add/remove, hot reload, deny-by-default, allowlist, inference exemption, and SSRF validation; needed because the PR changes generic preset loading/removal and gateway preset detection paths.

Optional E2E

  • channels-stop-start-e2e (~120 min; broad lifecycle coverage): Broad confidence for messaging channel lifecycle across OpenClaw and Hermes, including telegram, discord, wechat, slack, and whatsapp; useful because the PR changes active channel policy selection and the Hermes WeChat alias path, but it is relatively expensive.
  • messaging-providers-e2e (~75 min): Additional confidence that generic messaging provider/placeholder/L7-proxy behavior for Telegram, Discord, and Slack still works after policy preset routing changes.
  • hermes-e2e (~60 min): Baseline Hermes install/onboard/health/live inference smoke to catch unintended regressions in Hermes policy-additions or onboarding policy creation outside messaging-specific flows.

New E2E recommendations

  • Hermes inactive messaging policy filtering (high): Existing Hermes Discord/Slack E2Es validate selected channel behavior but do not appear to explicitly assert that a Discord-only Hermes sandbox lacks telegram, slack, and wechat_bridge network_policies in the loaded OpenShell policy.
    • Suggested test: Add an E2E assertion to Hermes Discord onboarding that openshell policy get --full contains discord: and does not contain inactive Hermes messaging policy keys (telegram:, slack:, wechat_bridge:).
  • Hermes WeChat agent policy alias (medium): The PR adds agent-specific alias resolution from the wechat preset to Hermes wechat_bridge; current unit coverage is good, but a focused live policy-add/remove E2E would better protect the OpenShell gateway policy mutation path.
    • Suggested test: Add a Hermes WeChat policy-add/remove E2E that applies wechat, verifies wechat_bridge is loaded in the gateway policy and wechat is not, then removes it and verifies wechat_bridge is removed.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: hermes-discord-e2e,hermes-slack-e2e,network-policy-e2e

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

PR Review Advisor

Recommendation: blocked
Confidence: high
Analyzed HEAD: 6024262a5148bd5ef7b0d3b56b99ac09e70ef108
Findings: 4 blocker(s), 4 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: No commands, package-manager operations, tests, or PR scripts were executed by this advisory review.; The supplied diff is truncated; conclusions rely on trusted deterministic context plus the visible diff excerpts.; CI and E2E status is not final for head SHA 6024262.; E2E Advisor recommendations were found, but required job pass/fail results for the current head SHA were not supplied.; Linked issue #3981 has zero comments in the supplied trusted context; acceptance mapping uses the issue body clauses only.; Human maintainer review is required for final merge decisions, especially because this PR changes sandbox policy and network egress behavior.

Workflow run

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 6024262a5148bd5ef7b0d3b56b99ac09e70ef108
Recommendation: blocked
Confidence: high

Do not merge yet: GitHub mergeability is BLOCKED, CI/E2E are still pending for 6024262, required Hermes/network-policy E2E evidence is missing, and src/lib/policy/index.ts grew as a current monolith hotspot.

Gate status

  • CI: pending — Trusted GraphQL/status context for head SHA 6024262 shows pending/in-progress/queued contexts including cli-parity, E2E recommendation, wsl-e2e, macos-e2e, PR review advisor, CodeQL, unit-vitest-linux, checks, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, and CodeRabbit; trusted summary reports 13 status context(s) appear pending.
  • Mergeability: fail — GitHub reports mergeStateStatus=BLOCKED and mergeable_state=blocked for PR fix(policy): scope Hermes messaging policies #3984 at head SHA 6024262.
  • Review threads: pass — Trusted context reports 2 review thread(s), all resolved; both CodeRabbit findings are marked addressed in commit 6024262.
  • Risky code tested: fail — Although path heuristics reported no risky areas, the changed files modify sandbox create-time policy filtering and runtime policy application/removal/listing in agents/hermes/policy-additions.yaml, src/lib/onboard/initial-policy.ts, and src/lib/policy/index.ts. E2E Advisor requires hermes-e2e, hermes-discord-e2e, hermes-slack-e2e, and network-policy-e2e, and no pass evidence for those jobs at this head SHA is present.

🔴 Blockers

  • Required CI is still pending for the latest head SHA: The PR cannot be considered merge-ready while required checks are still in progress or queued for 6024262.
    • Recommendation: Wait for all required CI, CodeQL, build, unit, CodeRabbit, and platform/E2E recommendation checks to complete successfully for the exact head SHA before considering merge.
    • Evidence: Trusted GraphQL statusCheckRollup lists pending/in-progress/queued contexts including cli-parity, E2E recommendation, wsl-e2e, macos-e2e, CodeQL, unit-vitest-linux, checks, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, PR review advisor, and CodeRabbit.
  • PR is blocked by GitHub mergeability state: GitHub currently reports the PR as blocked, so branch protection or other required merge gates have not been satisfied.
    • Recommendation: Resolve branch protection, required checks, review decision, and any other GitHub merge blockers before merge.
    • Evidence: Trusted context: mergeStateStatus=BLOCKED; mergeable_state=blocked; reviewDecision=REVIEW_REQUIRED.
  • Required Hermes and network-policy E2E evidence is missing for this head SHA: This PR changes sandbox policy boundaries and runtime policy preset resolution. Unit tests with synthetic YAML and fake openshell scripts are useful but cannot prove live OpenShell gateway behavior, credential rewrite, rebuild persistence, or inactive-policy absence.
    • Recommendation: Require hermes-e2e, hermes-discord-e2e, hermes-slack-e2e, and network-policy-e2e to pass for 6024262. Prefer adding live assertions that Discord-only Hermes policy lacks Telegram, Slack, and WeChat/wechat_bridge egress.
    • Evidence: E2E Advisor comment requires hermes-e2e, hermes-discord-e2e, hermes-slack-e2e, and network-policy-e2e. GraphQL shows E2E recommendation in progress; no required E2E pass evidence for the current head SHA is supplied.
  • Policy module monolith grew beyond the source-shape budget (src/lib/policy/index.ts:1): The current large hotspot src/lib/policy/index.ts grew by 96 lines. The trusted monolith delta marks this as a blocker because current monolith growth by 20 or more lines should be extracted or offset before merge.
    • Recommendation: Extract the new agent-specific preset loading/resolution helpers into a focused policy/agent module or offset the growth with equivalent reductions in src/lib/policy/index.ts.
    • Evidence: Trusted monolithDeltas: src/lib/policy/index.ts baseLines=1114, headLines=1210, delta=96, severity=blocker.

🟡 Warnings

  • Hermes WeChat create-time policy may not be recorded as an applied preset (src/lib/onboard/initial-policy.ts:268): The PR maps Hermes WeChat runtime preset application to the wechat_bridge policy key, but initial create-time applied-preset detection still records active channels only when basePolicyNames.has(channel). For active channel wechat, a Hermes base policy contains wechat_bridge rather than wechat, so a selected WeChat channel can be present in the create policy but omitted from appliedPresets.
    • Recommendation: Use HERMES_MESSAGING_POLICY_KEYS when deriving existingChannelPresets, so active channel wechat records as applied when wechat_bridge exists. Add a regression test for Hermes WeChat create-time appliedPresets.
    • Evidence: HERMES_MESSAGING_POLICY_KEYS maps wechat to ["wechat_bridge"], while existingChannelPresets is computed from activeMessagingChannels.filter((channel) => basePolicyNames.has(channel)). Added tests cover Discord filtering and runtime WeChat alias application, but not create-time WeChat applied-preset recording.
  • Agent-specific resolver coverage does not exercise all affected callers (test/policies.test.ts:536): The new loadPresetForSandbox path is used by applyPreset, applyPresets, removePreset, and getGatewayPresets. Added tests exercise applyPresets for Hermes Discord and Hermes WeChat with a fake openshell, but direct policy-add, remove, and gateway-list matching paths remain uncovered.
    • Recommendation: Add focused tests for applyPreset("hermes", "discord"), removePreset on Hermes Discord/WeChat alias content, getGatewayPresets matching Hermes-specific keys, and default/OpenClaw fallback to the generic Discord preset.
    • Evidence: Diff changes loadPreset calls in applyPreset, applyPresets, removePreset, and getGatewayPresets to loadPresetForSandbox. Added tests in test/policies.test.ts exercise applyPresets only.
  • Agent-specific policy resolution silently falls back to generic presets (src/lib/policy/index.ts:166): loadAgentPresetContent catches all errors and returns null, after which loadPresetForSandbox falls back to built-in preset content. This preserves compatibility, but for Hermes it can mask security-relevant failures in agent-specific policy resolution and reintroduce broader generic policy behavior.
    • Recommendation: Emit a debug or diagnostic warning when an agent sandbox has a policyAdditionsPath but agent-specific preset resolution fails, while still avoiding secret leakage. Consider tests for malformed agent policy fallback behavior.
    • Evidence: loadAgentPresetContent wraps registry lookup, agent loading, policy file reads, YAML parsing, and key selection in catch { return null; }; loadPresetForSandbox then returns builtinPresetContent when agent content is null.
  • Active overlapping PRs touch the same policy paths: The PR patches code that still exists and has recent history, with no rename drift detected, but there is active overlapping work touching src/lib/policy/index.ts, test/policies.test.ts, and src/lib/onboard/initial-policy.ts. This increases review and merge-drift risk for sandbox policy behavior.

🔵 Suggestions

  • None.

Acceptance coverage

  • partial — A Hermes sandbox configured for Discord can show Telegram, Slack, and WeChat policy entries in the live OpenShell policy.: src/lib/onboard/initial-policy.ts adds filterHermesInactiveMessagingPolicies; src/lib/onboard/initial-policy.test.ts asserts synthetic Discord-only Hermes policy removes telegram, slack, and wechat_bridge. No live OpenShell policy inspection has passed for this head SHA.
  • partial — That is not the expected least-privilege behavior: selecting one Hermes messaging channel should only enable that channel egress.: The filter removes inactive Hermes messaging keys and tests cover Discord-only filtering. Coverage is not complete for all channels at create/rebuild time, and the WeChat create-time appliedPresets mapping still appears incomplete.
  • partial — There is also a related Discord failure mode where applying the generic discord preset to a Hermes sandbox can leave the live policy with Node-oriented binary allowlists instead of Hermes/Python allowlists.: src/lib/policy/index.ts adds loadPresetForSandbox and loadAgentPresetContent; test/policies.test.ts asserts Hermes Discord applyPresets includes /usr/bin/python3* and /opt/hermes/.venv/bin/python and avoids generic PATCH /**. Direct applyPreset and live gateway behavior are not yet proven.
  • partial — Hermes Discord runs through Python, so Discord gateway traffic from /usr/bin/python3* or /opt/hermes/.venv/bin/python can be denied.: Hermes Discord test validates Python binary allowlists in generated policy. No completed hermes-discord-e2e evidence confirms live gateway traffic succeeds.
  • metagents/hermes/policy-additions.yaml is used as the Hermes create-time policy, but it currently contains multiple messaging provider entries.: agents/hermes/policy-additions.yaml still contains telegram, discord, slack, and wechat_bridge entries, now documented as messaging policy templates.
  • partial — Those entries are intended as Hermes-specific templates, but sandbox creation applies the file wholesale, so inactive providers become live policy.: The YAML comment now states these entries are templates and prepareInitialSandboxCreatePolicy filters inactive Hermes messaging policies. Live sandbox creation/rebuild evidence is still missing.
  • met — Separately, built-in preset application loads nemoclaw-blueprint/policies/presets/<preset>.yaml without considering the sandbox agent.: The previous loadPreset calls in applyPreset, applyPresets, removePreset, and getGatewayPresets are replaced with loadPresetForSandbox for built-in presets.
  • partial — For Hermes, this can apply generic OpenClaw/Node messaging policy instead of Hermes-specific policy.: loadPresetForSandbox can synthesize preset YAML from the registered sandbox agent's policyAdditionsPath, and tests cover Hermes Discord and WeChat applyPresets. Direct apply/remove/list and live behavior are not fully covered.
  • partial — OpenClaw/default sandboxes use the default OpenClaw base policy plus selected/default presets.: loadPresetForSandbox falls back to built-in preset content when the registry sandbox has no agent or no matching agent policy. No explicit new default/OpenClaw onboarding regression test is shown in the diff.
  • partial — Hermes sandboxes use the Hermes base policy for Hermes filesystem, binaries, inference, PyPI, and Nous endpoints.: filterHermesInactiveMessagingPolicies only deletes inactive messaging keys and should preserve pypi/nous/inference entries; initial-policy test preserves pypi while removing inactive messaging keys. No live Hermes create/rebuild E2E evidence is present.
  • partial — Hermes messaging policies are only applied for selected channels.: Discord-only unit coverage removes telegram, slack, and wechat_bridge from a Hermes policy. Runtime WeChat alias application is tested. Full channel matrix and live gateway policy assertions are missing.
  • partialpolicy-add discord on a Hermes sandbox applies Hermes-specific Discord policy, including Python binary allowlists.: The resolver used by applyPreset is changed to loadPresetForSandbox, and applyPresets has a Hermes Discord test proving Python allowlists in generated policy. A direct applyPreset/policy-add test and completed live E2E are still missing.
  • partial — A Discord-only Hermes sandbox must not retain Telegram, Slack, or WeChat egress.: src/lib/onboard/initial-policy.test.ts asserts the filtered synthetic policy contains only pypi and discord. E2E Advisor specifically recommends live openshell policy get --full assertions; no pass evidence is supplied.
  • partial — Creating or rebuilding a Hermes sandbox with only Discord enabled results in live policy containing Discord messaging egress but not Telegram, Slack, or WeChat messaging egress.: Unit tests cover create-time filtering logic, but creating/rebuilding a live Hermes sandbox is not proven. Required hermes-e2e and hermes-discord-e2e have not been shown passing for this head SHA.
  • partial — Applying discord to a Hermes sandbox uses Hermes-specific Discord rules and Python binary allowlists.: test/policies.test.ts verifies applyPresets for a registry sandbox with agent hermes includes Hermes-specific Discord policy and Python binaries. Direct applyPreset and live policy-set validation are missing.
  • partial — Applying discord to a default/OpenClaw sandbox still uses the generic Discord preset.: Code fallback should use builtinPresetContent for sandboxes without an agent or without agent policy matches, but no explicit default/OpenClaw Discord regression test is included in the shown diff.
  • partial — Existing policy preset registry behavior remains unchanged.: Existing batch apply test still asserts registry.policies equals ["npm", "pypi"], and new Hermes tests assert registry.policies records ["discord"] and ["wechat"]. removePreset and getGatewayPresets registry/list behavior are changed but lack targeted new tests.

Security review

  • pass — 1. Secrets and Credentials: No hardcoded tokens, passwords, API keys, PEMs, or credential files are added. The changed Hermes policy references credential rewrite controls but does not commit credential material.
  • pass — 2. Input Validation and Data Sanitization: Built-in preset names still resolve under PRESETS_DIR with path traversal checks. Sandbox names remain RFC1123-validated before policy set/remove paths. YAML parsing uses YAML.parse without eval-like deserialization. No new user-controlled shell-string construction was introduced in the changed logic.
  • pass — 3. Authentication and Authorization: No new endpoints, auth flows, or token-validation paths are introduced. The change selects sandbox-agent-specific policy content from the local registry/agent definitions and is intended to narrow Hermes messaging egress.
  • pass — 4. Dependencies and Third-Party Libraries: No new production or test dependencies are introduced in the changed files.
  • warning — 5. Error Handling and Logging: loadAgentPresetContent catches all errors and silently falls back to generic built-in preset content. That may mask Hermes agent-policy resolution failures that affect least-privilege security behavior. Add non-sensitive diagnostics for agent sandbox fallback cases.
  • pass — 6. Cryptography and Data Protection: Not applicable — no cryptographic operations, custom crypto, hashing, or data-at-rest/in-transit primitives are changed.
  • warning — 7. Configuration and Security Headers: The PR modifies security-sensitive network policy configuration and runtime preset resolution. The intended direction is least privilege, but live OpenShell policy behavior and required Hermes/network-policy E2E checks have not completed for the current head SHA.
  • warning — 8. Security Testing: Regression tests were added for Hermes inactive policy filtering, relative Hermes policy paths, Hermes Discord policy content, and Hermes WeChat alias application. However, sandbox/network-policy changes require live E2E validation; required hermes-e2e, hermes-discord-e2e, hermes-slack-e2e, and network-policy-e2e pass evidence is missing.
  • warning — 9. Holistic Security Posture: The change should improve least-privilege posture for Hermes messaging sandboxes, but merge should wait for CI/E2E completion. Remaining concerns include silent fallback to generic presets, incomplete create-time WeChat applied-preset recording, and untested remove/list caller paths.

Test / E2E status

  • Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: agents/hermes/policy-additions.yaml, src/lib/onboard/initial-policy.ts, and src/lib/policy/index.ts affect create-time sandbox policy, live policy-add/remove/list behavior, network egress boundaries, and messaging credential rewrite behavior. Unit tests with YAML and fake openshell are helpful but insufficient.
  • E2E Advisor: missing
  • Required E2E jobs: hermes-e2e, hermes-discord-e2e, hermes-slack-e2e, network-policy-e2e
  • Missing for analyzed SHA: hermes-e2e, hermes-discord-e2e, hermes-slack-e2e, network-policy-e2e

✅ What looks good

  • The PR patches files that still exist on the branch; no rename drift was detected in the trusted drift evidence.
  • The Hermes policy YAML comment now correctly documents messaging entries as templates rather than unconditional create-time allowlists.
  • The initial-policy tests include negative assertions that inactive Telegram, Slack, and WeChat/wechat_bridge policies are removed from a Discord-only Hermes policy.
  • The relative Hermes policy path regression test addresses the resolved CodeRabbit path-detection finding.
  • The policy tests verify Hermes Discord uses Python binary allowlists and avoids falling back to overly broad generic mutation rules.
  • The policy tests add Hermes WeChat alias coverage for runtime applyPresets, addressing the resolved CodeRabbit alias finding.
  • No new dependencies or committed secrets were introduced.

Review completeness

  • No commands, package-manager operations, tests, or PR scripts were executed by this advisory review.
  • The supplied diff is truncated; conclusions rely on trusted deterministic context plus the visible diff excerpts.
  • CI and E2E status is not final for head SHA 6024262.
  • E2E Advisor recommendations were found, but required job pass/fail results for the current head SHA were not supplied.
  • Linked issue Hermes sandbox pre-enables inactive messaging policies #3981 has zero comments in the supplied trusted context; acceptance mapping uses the issue body clauses only.
  • Human maintainer review is required for final merge decisions, especially because this PR changes sandbox policy and network egress behavior.
  • Human maintainer review required: yes

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard/initial-policy.ts`:
- Around line 203-206: isHermesPolicyPath currently only matches when the
normalized path ends with "/agents/hermes/policy-additions.yaml", so relative
variants like "agents/hermes/policy-additions.yaml" are missed; update
isHermesPolicyPath to normalize separators and then compare against a target
fragment (e.g., "agents/hermes/policy-additions.yaml") using either equality or
endsWith on the normalized string (for example check normalized === target ||
normalized.endsWith("/" + target) or simply normalized.endsWith(target)) to
correctly detect both relative and absolute paths; refer to the function name
isHermesPolicyPath and the parameter policyPath when making the change.

In `@src/lib/policy/index.ts`:
- Around line 108-128: selectAgentPolicyKeys fails to resolve Hermes presets
when built-in presetName (e.g., "wechat") doesn't match the Hermes policy key
(e.g., "wechat_bridge"); add an alias mapping lookup before the final fallback
so known mismatches map to the correct agent policy key(s). In
selectAgentPolicyKeys, after checking builtinKeys and direct presetName key, add
logic to consult a small alias map (e.g., { "wechat": ["wechat_bridge"] }) and
return any matching keys present in agentPolicies; keep the existing fallback
that matches value.name === presetName for other cases. Ensure you reference
selectAgentPolicyKeys, agentPolicies, presetName and builtinPresetContent when
implementing the alias lookup.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5f790ed3-a5a8-4ca5-bdb9-1ed7d55c4264

📥 Commits

Reviewing files that changed from the base of the PR and between 18c7265 and 6b9aa3a.

📒 Files selected for processing (5)
  • agents/hermes/policy-additions.yaml
  • src/lib/onboard/initial-policy.test.ts
  • src/lib/onboard/initial-policy.ts
  • src/lib/policy/index.ts
  • test/policies.test.ts

Comment thread src/lib/onboard/initial-policy.ts
Comment thread src/lib/policy/index.ts
Signed-off-by: San Dang <sdang@nvidia.com>
@sandl99 sandl99 requested a review from ericksoa May 21, 2026 09:29
@github-actions
Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26218614241
Target ref: 6024262a5148bd5ef7b0d3b56b99ac09e70ef108
Workflow ref: main
Requested jobs: hermes-discord-e2e,hermes-slack-e2e,network-policy-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job Result
hermes-discord-e2e ✅ success
hermes-slack-e2e ✅ success
network-policy-e2e ✅ success

Copy link
Copy Markdown
Contributor

@ericksoa ericksoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed current head 6024262. This unblocks the Hermes Discord/Slack policy path, selective E2E is green, and the known WeChat alias bookkeeping risk is acceptable for this merge.

@ericksoa ericksoa merged commit 2e90f0f into main May 21, 2026
32 checks passed
@ericksoa ericksoa deleted the u/sdang/fix-hermes-discord-signed branch May 21, 2026 15:18
@sandl99 sandl99 added the VRDC Issues and PRs submitted by NVIDIA VRDC test team. label May 21, 2026
@senthilr-nv
Copy link
Copy Markdown
Contributor

On-device verification — DGX Spark (aarch64, GB10)

Validated u/sdang/fix-hermes-discord-signed (v0.0.48-4-g6024262a5) on DGX Spark with a fresh Hermes onboard (Discord + Telegram). Discord now survives the post-onboard policy reload.

Setup

  • Destroyed previous broken hermes-spark sandbox (v0.0.48, Discord permanently DENIED after policy reload)
  • Checked out this branch, rebuilt dist/ from source
  • Fresh nemoclaw onboard with NEMOCLAW_AGENT=hermes, DISCORD_BOT_TOKEN, DISCORD_SERVER_ID, TELEGRAM_BOT_TOKEN in env
  • Onboard completed in 46s, no manual intervention

Before (v0.0.48 — broken)

05:32:48 ALLOWED python3.13(147) -> discord.com:443 [policy:discord]      # initial connect
05:32:57 DENIED  discord.com:443 [policy:discord engine:l7]                # policy reload kills L7 tunnel
05:33:16 DENIED  python3.13(147) -> discord.com:443 [policy:- engine:opa]  # never recovers

After (this PR — fixed)

15:10:13 ALLOWED python3.13(148) -> discord.com:443 [policy:discord]
15:10:14 NET:UPGRADE gateway.discord.gg:443                                # WebSocket connected
  ... policy reload ...
15:10:33 ALLOWED python3.13(148) -> discord.com:443 [policy:discord]       # survives reload
15:10:55 ALLOWED python3.13(148) -> gateway-us-east1-c.discord.gg:443      # regional fallback works
15:15:24 ALLOWED POST .../channels/.../messages [policy:discord]            # bot active 5min later

Verification checklist

  • Discord connects and stays connected after policy reload (v1 to v3)
  • Regional Discord gateway fallback (gateway-us-east1-c.discord.gg) now ALLOWED (was DENIED on v0.0.48)
  • Discord bot sending typing + messages to channels 5+ min after boot
  • Telegram: reconnects after policy reload, getUpdates polling active, sendMessage confirmed
  • Inference: /v1/models returns hermes-agent, streaming chat completion working
  • Process tree clean: nemoclaw-start (PID 60) -> hermes gateway run (PID 148)
  • All 26 CI checks pass on the PR

LGTM — the Hermes-specific policy resolution fixes the Discord breakage we reported. Ready to merge.

miyoungc added a commit that referenced this pull request May 21, 2026
## Summary
Refreshes NemoClaw release notes for v0.0.47 and v0.0.48, then
regenerates the corresponding user-skill references so agent-facing docs
match the source pages.

Preview:
https://nvidia-preview-docs-release-notes-47-48.docs.buildwithfern.com/nemoclaw/about/release-notes

## Changes
- Adds explicit v0.0.47 and v0.0.48 sections to
`docs/about/release-notes.mdx`.
- Documents follow-up WSL Ollama, sandbox image, share mount, and
troubleshooting updates from recent release changes.
- Regenerates `nemoclaw-user-*` skill references from the Fern MDX
source docs.

## Source Summary
- #4003 -> `docs/about/release-notes.mdx`: Notes the messaging manifest
registry work as part of v0.0.48 release coverage.
- #3984 -> `docs/about/release-notes.mdx`: Captures Hermes messaging
policy scoping in the v0.0.48 release notes.
- #3963 -> `docs/about/release-notes.mdx`: Captures DGX Spark Hermes GPU
recreation startup recovery in the v0.0.48 release notes.
- #3961 -> `docs/about/release-notes.mdx`: Captures Discord loopback
proxy routing in the v0.0.48 release notes.
- #3940 -> `docs/about/release-notes.mdx`: Captures installer prompt
clarification and express-install behavior in the v0.0.48 release notes.
- #3946 -> `docs/about/release-notes.mdx`: Carries forward the Homebrew
preinstall clarification in release coverage.
- #3937 -> `docs/about/release-notes.mdx`: Carries forward the dashboard
URL command and post-install next steps coverage.
- #3921 -> `docs/about/release-notes.mdx`: Carries forward managed vLLM
default behavior for DGX Spark and DGX Station.
- #3931 -> `docs/about/release-notes.mdx`,
`docs/reference/architecture.mdx`: Documents the sandbox `python` to
`python3` compatibility symlink.
- #1485 -> `docs/about/release-notes.mdx`,
`docs/reference/architecture.mdx`: Documents the sandbox image Docker
health check.
- #3784 -> `docs/about/release-notes.mdx`: Captures VM-driver snapshot
health-check reliability in release notes.
- #3917 -> `docs/about/release-notes.mdx`: Captures package-based
workspace template resolution in release notes.
- #3170 -> `docs/about/release-notes.mdx`: Captures installer checksum
compatibility from preferring `sha256sum`.
- #3898 -> `docs/about/release-notes.mdx`: Adds v0.0.47 release coverage
for messaging provider scenario validation.
- #3897 -> `docs/about/release-notes.mdx`: Adds v0.0.47 release coverage
for baseline onboarding scenario validation.
- #3834 -> `docs/about/release-notes.mdx`: Adds v0.0.47 release coverage
for PR review advisor automation.
- #3838 -> `docs/about/release-notes.mdx`: Adds v0.0.47 release coverage
for CLI display registry refactoring.

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification
- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

`make docs` was attempted but could not complete because `npx fern-api`
failed with `403 Forbidden` from `https://registry.npmjs.org/fern-api`
in this environment. Pre-commit and pre-push hooks passed after
refreshing the local CLI build output with `npm run build:cli`; no build
artifacts were committed.

---
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Added WSL onboarding notes for Windows-host Ollama detection, restart
guidance, and PowerShell checks.
* Clarified express-install behavior (non-interactive, sudo prompts) and
default sandbox policy selection.
* Added Windows preparation guidance when installer tooling is missing
(winget/App Installer or Docker Desktop).
* Expanded sandbox docs with Docker health checks, Homebrew/python
compatibility helpers, share-mount path validation, Discord
troubleshooting, and new v0.0.48/v0.0.47 release notes.
* **Chores**
  * Improved docs preview workflow error handling.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/4007?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Integration: Hermes OpenShell Support for OpenShell, a safe, private runtime for autonomous AI agents Sandbox Use this label to identify issues related to the NemoClaw isolated environment based on OpenShell. VRDC Issues and PRs submitted by NVIDIA VRDC test team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Hermes sandbox pre-enables inactive messaging policies

3 participants