fix(shields): union live policy into permissive before applying#3976
fix(shields): union live policy into permissive before applying#3976laitingsheng wants to merge 3 commits into
Conversation
OpenShell refuses to remove a `filesystem_policy.read_only` or `filesystem_policy.read_write` entry on a live sandbox. The static `openclaw-sandbox-permissive.yaml` baseline never sees the runtime enrichment that NemoClaw adds at create time, so applying it on top of a live sandbox can hit "filesystem read_write path '<X>' cannot be removed on a live sandbox". This has reappeared with each new runtime-injected path: - #3168 — Hermes `/opt/hermes` (fixed by per-agent permissive YAML) - #3957 — OpenClaw `/home/linuxbrew` (fixed by patching the YAML) - #3942 — GPU sandboxes `/proc` (the current bug) Close the loop with a generic helper. Before `shields down` applies the permissive policy, fetch the live sandbox's policy via the snapshot we already capture and union its `filesystem_policy.read_write` and `filesystem_policy.read_only` into the base permissive YAML. The result is a strict superset of the live filesystem section, so OpenShell never sees a removal on transition. Resolution rules on overlap: a path that is `read_write` on the live side wins and is removed from `read_only` in the output — we never emit the same path in both lists, and we never downgrade a writable path to read-only as part of an unrelated transition. The helper takes its live-policy fetcher and base-policy reader as injected deps so it stays a pure function with no runtime cycle into the policy module, and so the regression test can drive it without mocking node_modules. When the live policy is empty, parses as an error, or omits the filesystem section, the helper degrades to the existing static path — matches prior behaviour rather than failing closed. Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds a runtime permissive policy builder that merges live sandbox filesystem_policy lists into a base permissive YAML, uses it during shieldsDown when policyName is "permissive", and ensures any temporary merged policy file is cleaned up after applying the policy. ChangesRuntime permissive policy merging
Sequence DiagramsequenceDiagram
participant shieldsDown
participant buildRuntimePermissivePolicy
participant readBasePolicy
participant secureTempFile
participant run_apply
participant cleanupTempDir
shieldsDown->>buildRuntimePermissivePolicy: call(basePermissivePath, deps)
buildRuntimePermissivePolicy->>readBasePolicy: read base YAML
readBasePolicy-->>buildRuntimePermissivePolicy: base YAML
buildRuntimePermissivePolicy->>secureTempFile: write merged YAML -> temp path
secureTempFile-->>buildRuntimePermissivePolicy: temp path
buildRuntimePermissivePolicy-->>shieldsDown: return policyFile
shieldsDown->>run_apply: run(buildPolicySetCommand(policyFile, sandboxName))
run_apply-->>shieldsDown: apply result
shieldsDown->>cleanupTempDir: (finally) cleanupTempDir(policyFile) if temp
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Auto-dispatched E2E: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
PR Review AdvisorRecommendation: blocked This is an automated advisory review. A human maintainer must make the final merge decision. Limitations: Review used provided deterministic GitHub context and the supplied diff; tests, builds, package-manager commands, PR scripts, and live sandboxes were not executed by this advisory review.; CI was still pending for the current head SHA at review time, so final check outcomes are unknown.; Literal #3942 issue body/comments were not included in the trusted linkedIssues payload; #3942 acceptance mapping is limited to the PR body's untrusted statement.; The diff in the prompt is truncated if large, though all three declared changed files are represented.; Open PR overlaps on src/lib/shields/index.ts were identified from provided metadata but not resolved here. Full advisor summaryPR Review AdvisorBase: Runtime permissive-policy union looks directionally correct and the required shields-config E2E passed for this head SHA, but merge remains blocked by pending CI, GitHub mergeStateStatus=BLOCKED/review required, and the shields monolith growth blocker. Gate status
🔴 Blockers
🟡 Warnings
🔵 Suggestions
Acceptance coverage
Security review
Test / E2E status
✅ What looks good
Review completeness
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/shields/permissive-runtime.ts`:
- Around line 59-60: Wrap the real filesystem boundaries in try/catch so any I/O
failure degrades to the static permissive policy instead of aborting:
specifically, guard calls to readBasePolicy(), secureTempFile(), and
writeFileSync() (and the block that uses safeYamlObject(baseYaml)) with error
handling that logs the exception and returns/uses the existing static permissive
policy path; do not catch internal helper logic—only catch around those I/O
calls and ensure the function returns the static fallback when an I/O error
occurs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 213077a6-95e7-4a3b-9dc1-ac8221c26fc8
📒 Files selected for processing (3)
src/lib/shields/index.tssrc/lib/shields/permissive-runtime.tstest/permissive-runtime.test.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26210119841
|
* Drop the local parsePolicyBlock clone: callers already invoke parseCurrentPolicy on the raw `openshell policy get` output, so the helper now takes the pre-parsed YAML body via `livePolicyYaml` instead of duplicating the header-stripping regex. shields/index.ts passes the `policyYaml` it already captured for the snapshot. * Wrap the boundary I/O — base-policy read, temp-file create, write — in try/catch. On any I/O failure the helper returns the static base path so shields-down keeps degrading to the existing apply path rather than aborting before the policy set ever runs. * Reword the doc comment: the helper unions filesystem path lists (read_only + read_write), not the entire filesystem_policy block. Other `filesystem_policy` fields such as `include_workdir` are preserved verbatim from the static base. Added a regression test to lock that behaviour in. Also: extra regression tests for the throwing-readBasePolicy and unparseable-base-YAML paths, and the include_workdir passthrough. Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/permissive-runtime.test.ts`:
- Around line 44-48: Guard the cleanup enqueue against fallback base-paths:
after calling buildRuntimePermissivePolicy (the function named
buildRuntimePermissivePolicy) check that the returned out value is not equal to
the sentinel base path (the literal "/unused-base.yaml" used as the unused base
or whatever basePath constant is used) before pushing it onto tempFilesToClean;
if it equals the sentinel, throw or fail the test immediately so afterEach
doesn't attempt to rmSync(path.dirname(out)) on root. Apply the same
check/early-fail to the other success-path tests that push into tempFilesToClean
(the other buildRuntimePermissivePolicy and similar calls around the other
success cases).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 705c5155-a8dd-4473-92c8-6904ab911436
📒 Files selected for processing (3)
src/lib/shields/index.tssrc/lib/shields/permissive-runtime.tstest/permissive-runtime.test.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26211195274
|
* Track the mkdtemp directory and call cleanupTempDir in the
writeFileSync catch branch so a partial temp-file failure no longer
leaks a 0700 directory under /tmp.
* Expose an injectable writeTempPolicy dep so the write-failure
fallback can be driven from tests without monkey-patching node:fs.
Default behaviour (omit dep) keeps using secureTempFile +
fs.writeFileSync.
* Add a regression test that drives the write-failure branch and
asserts the helper returns the static base path.
* Tighten the test cleanup helper so it refuses to enqueue any
output whose dirname is not under os.tmpdir(). Earlier the tests
could have done `rm -rf path.dirname("/unused-base.yaml")` (i.e.
"/") if the helper ever degraded to the static path on a code
regression. The guarded helper now also asserts out !== basePath
in the success cases for an explicit early-warning signal.
* Reword the dep doc comment: "stays a pure transform" was
misleading because the helper does in fact do I/O (base read,
temp file write). Say "live-policy acquisition stays outside this
helper" instead.
Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
Selective E2E Results — ✅ All requested jobs passedRun: 26212840050
|
Summary
nemoclaw <name> shields downfails on GPU sandboxes with:OpenShell refuses to remove a
filesystem_policy.read_onlyorfilesystem_policy.read_writeentry on a live sandbox. The staticopenclaw-sandbox-permissive.yamlbaseline never sees the runtime enrichment that NemoClaw adds at create time, so applying it on top of a live sandbox triggers the rejection whenever a runtime-injected path is missing from the static YAML.This has reappeared with each new runtime-injected path:
/opt/hermes(closed by per-agent permissive YAML)/home/linuxbrew(closed by patching the static YAML in 60d3e5e)/proc(this PR)Close the loop with a generic helper. Before
shields downapplies the permissive policy, fetch the live sandbox's policy via the snapshot we already capture and union itsfilesystem_policy.read_writeandfilesystem_policy.read_onlyinto the base permissive YAML. The result has filesystem path lists that are a superset of the live filesystem section, so OpenShell never sees a removal on theshields downtransition. Future runtime-injected paths are absorbed automatically on theshields downpath. Other permissive-apply surfaces (applyPermissivePolicyinsrc/lib/policy/index.ts— currently unused by any in-source caller — and the E2E scripts that read the static YAML directly) are unchanged and continue to rely on the static file being correct; the earlier whack-a-mole fixes for #3168 and #3957 are kept for those paths.Related Issue
Partially addresses #3942 - Part 2
Closes #3957
Changes
buildRuntimePermissivePolicy(basePath, deps). Takes the already-parsed live policy YAML body (from the caller'sparseCurrentPolicy(rawPolicy)) and a lazy base-policy reader as injected deps. Unionsfilesystem_policy.read_only+filesystem_policy.read_writeinto the static base. Otherfilesystem_policyfields (e.g.include_workdir) are preserved verbatim. RW wins on path overlap. Returns the static base path verbatim when (a) the live YAML has no filesystem path lists, (b) the base YAML cannot be parsed, or (c) base-read / temp-file I/O throws — so an I/O fault degrades to the existing static apply path rather than aborting shields-down. IfsecureTempFilealready created its mkdtemp directory before the subsequentwriteFileSyncthrows, the directory is cleaned up viacleanupTempDirso the failure path never leaks 0700 dirs under/tmp. AwriteTempPolicydep is exposed so tests can drive the write-failure branch without monkey-patchingnode:fs.buildRuntimePermissivePolicyin. Reuse the already-capturedpolicyYaml(parsed viaparseCurrentPolicy(rawPolicy)) so there is no duplicate parser. Wrap theopenshell policy setcall intry / finallyso the temp file's mkdtemp directory is cleaned up viacleanupTempDir./procpreservation on GPU sandboxes,include_workdirpassthrough, dedup across lists, the RW-wins rule on conflicts, and five degraded paths (empty live YAML, no filesystem section,readBasePolicythrows, base YAML unparseable,writeTempPolicythrows) returning the static base path. Cleanup helper rejects any output that is not underos.tmpdir()to avoid an accidentalrm -rfon the user's checkout when the helper degrades to the static base path.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Signed-off-by: Tinson Lai tinsonl@nvidia.com
Summary by CodeRabbit