
fix: remove proxy hooks from sandbox rc files#3842

Open
chengjiew wants to merge 4 commits into main from fix/3835_no_proxy_rc_entries

Conversation

@chengjiew
Contributor

@chengjiew chengjiew commented May 20, 2026

Summary

  • Keep sandbox .bashrc and .profile free of NemoClaw proxy source stanzas.
  • Clean legacy runtime proxy shims from sandbox user rc files at startup while preserving other rc content.
  • Update e2e/docs/tests so proxy env is sourced only through root-owned system hooks.

Test Plan

  • npm test -- test/service-env.test.ts test/sandbox-provisioning.test.ts test/repro-2376.test.ts
  • bash -n scripts/nemoclaw-start.sh scripts/lib/sandbox-init.sh test/e2e-gateway-isolation.sh
  • git diff --check
  • Linux e2e on aits-log-worker-6:
    • fixed image: NEMOCLAW_TEST_IMAGE=nemoclaw-production-3835 bash test/e2e-gateway-isolation.sh -> 34 passed, 0 failed
    • pre-fix image with stricter e2e: 32 passed, 2 failed at .bashrc/.profile proxy checks

Notes

  • Local pre-commit/pre-push full CLI coverage was not used for delivery because the hook environment causes unrelated git fixture failures while running tests that create temporary repositories. The focused tests above and Linux e2e passed.

Summary by CodeRabbit

  • Documentation

    • Clarified sandbox shell initialization: per-user rc files are pre-created and locked, with no embedded proxy entries; interactive sessions receive gateway auth via system-wide shell hooks.
  • Refactor

    • Moved runtime proxy sourcing out of per-user rc files and added startup cleanup to remove legacy proxy shims from user rc files.
  • Tests

    • Updated tests and end-to-end checks to validate the new RC initialization and proxy-cleanup behavior.


Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>

Signed-off-by: Test User <test@example.com>
@copy-pr-bot

copy-pr-bot Bot commented May 20, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 57c8cc1a-a692-4a68-b280-1b320a4d5b0d

📥 Commits

Reviewing files that changed from the base of the PR and between 5734e76 and 40b32a0.

📒 Files selected for processing (2)
  • docs/security/best-practices.mdx
  • scripts/nemoclaw-start.sh
✅ Files skipped from review due to trivial changes (1)
  • docs/security/best-practices.mdx

📝 Walkthrough

Walkthrough

The PR moves runtime proxy env sourcing out of per-user rc files: base images pre-create clean rc files, startup removes legacy proxy shims, and system-wide shell hooks read /tmp/nemoclaw-proxy-env.sh to populate interactive shells.
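
As a rough illustration of what a system-wide hook of this kind might do, the sketch below sources an env file only when it exists and is readable. The hook contents are not shown in this thread, so the function name `source_proxy_env`, the variable `DEMO_PROXY_URL`, and the demo file are all hypothetical; the demo uses a throwaway directory rather than `/tmp/nemoclaw-proxy-env.sh`.

```shell
# Hypothetical hook body: source the runtime env file if it is present.
# A real hook would pass /tmp/nemoclaw-proxy-env.sh; the demo uses a temp file.
source_proxy_env() {
  env_file="$1"
  # A missing or unreadable file is treated as "nothing to source", not an error.
  if [ -f "$env_file" ] && [ -r "$env_file" ]; then
    . "$env_file"
  fi
}

demo_dir="$(mktemp -d)"
# Assumed variable name and value, for illustration only.
printf 'export DEMO_PROXY_URL="http://proxy.invalid:3128"\n' >"$demo_dir/proxy-env.sh"

source_proxy_env "$demo_dir/proxy-env.sh"   # sets DEMO_PROXY_URL
source_proxy_env "$demo_dir/missing.sh"     # silently does nothing

echo "DEMO_PROXY_URL=$DEMO_PROXY_URL"
rm -rf "$demo_dir"
```

Keeping this logic in a root-owned file under /etc means the sandbox user's own rc files never need to mention the env file.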

Changes

Proxy sourcing architecture refactor: rc files to system-wide hooks

| Layer | File(s) | Summary |
|---|---|---|
| Base image and runtime shim cleanup | Dockerfile.base, scripts/nemoclaw-start.sh | Dockerfile.base stops embedding proxy sourcing lines in pre-created .bashrc and .profile. The ensure_runtime_shell_env_shim() function in scripts/nemoclaw-start.sh shifts from backfilling proxy shims to removing legacy proxy-env sourcing lines via awk filtering, with ownership/permission normalization and verification that the shim is removed. |
| Documentation and inline comments | docs/deployment/sandbox-hardening.mdx, docs/security/best-practices.mdx, scripts/lib/sandbox-init.sh | Documentation and comments now describe system-wide shell hooks (from /etc) sourcing /tmp/nemoclaw-proxy-env.sh for interactive sandbox sessions instead of per-user rc-file sourcing. |
| Tests updated for new behavior | test/e2e-gateway-isolation.sh, test/repro-2376.test.ts, test/sandbox-provisioning.test.ts, test/service-env.test.ts | E2E and unit tests now assert that user rc files contain no proxy sourcing, confirm legacy proxy shims are removed during startup, and verify non-proxy initialization lines (PATH, umask) are preserved. |
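
The cleanup behavior described in the table can be tried in isolation. The sketch below runs the awk filter quoted in the review thread on scripts/nemoclaw-start.sh against a throwaway rc file and checks that the PATH and umask lines survive. The value of `shim` is an assumption, since the real `_RUNTIME_SHELL_ENV_SHIM` string is not shown here, and the rc content is invented for the demo.

```shell
# Assumed shim line; the real _RUNTIME_SHELL_ENV_SHIM value is not shown in the PR thread.
shim='[ -f /tmp/nemoclaw-proxy-env.sh ] && . /tmp/nemoclaw-proxy-env.sh'

work_dir="$(mktemp -d)"
rc_file="$work_dir/bashrc"

# A legacy rc file: two ordinary lines around a comment + shim pair.
printf '%s\n' \
  'export PATH="$HOME/bin:$PATH"' \
  '# Source runtime proxy config' \
  "$shim" \
  'umask 027' >"$rc_file"

# Drop the marker comment together with the shim line that follows it,
# plus any stray line that references the proxy env file.
cleaned="$(awk -v shim="$shim" '
  $0 == "# Source runtime proxy config" {
    if ((getline next_line) > 0) {
      if (next_line == shim || next_line ~ /\/tmp\/nemoclaw-proxy-env\.sh/) {
        next
      }
      print $0
      print next_line
      next
    }
  }
  $0 == shim { next }
  $0 ~ /\/tmp\/nemoclaw-proxy-env\.sh/ { next }
  { print }
' "$rc_file")"

printf '%s\n' "$cleaned"
rm -rf "$work_dir"
```

Only the comment and shim pair is removed; every other line passes through the final `{ print }` rule unchanged.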

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Suggested labels

Docker, OpenShell, E2E, status: rfr

Suggested reviewers

  • jyaunches

Poem

🐰 I tidy rc files with a careful paw,
System hooks now hum what users saw.
Old shims tucked out, /tmp holds the key,
Sandbox shells wake clean and free.
Hop — the session greets the world with glee.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: remove proxy hooks from sandbox rc files' directly and specifically describes the main change: removing proxy configuration hooks from sandbox user rc files. It accurately reflects the primary objective of the PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.



@github-actions
Contributor

Comment thread test/service-env.test.ts Fixed
Comment thread test/service-env.test.ts Fixed
@github-actions
Contributor

github-actions Bot commented May 20, 2026

E2E Advisor Recommendation

Required E2E: test-e2e-gateway-isolation, test-e2e-sandbox, test-non-root-sandbox-smoke, openclaw-onboard-security-posture-e2e
Optional E2E: sandbox-operations-e2e, network-policy-e2e, hermes-onboard-security-posture-e2e

Dispatch hint: openclaw-onboard-security-posture-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • test-e2e-gateway-isolation (medium): Directly covers the changed security boundary: root/gateway/sandbox user separation, locked rc files, /tmp/nemoclaw-proxy-env.sh permissions, /etc/profile.d and /etc/bash.bashrc proxy hooks, and bash -ic/-lc env propagation.
  • test-e2e-sandbox (medium): Builds the updated sandbox image and runs the core in-container sandbox smoke suite, which is required for Dockerfile.base changes that can affect installed OpenClaw, blueprint assets, and baseline sandbox layout.
  • test-non-root-sandbox-smoke (low): The entrypoint now cleans legacy rc-file proxy shims before locking files. This job catches no-new-privileges/non-root startup regressions in the setup chain, a known failure mode for rc-file handling.
  • openclaw-onboard-security-posture-e2e (high): Full OpenClaw onboard path with security posture assertions for trusted rc files, runtime proxy-env permissions, configure guards, startup logs, gateway health, and live inference. This is the closest real-user-flow E2E for the changed shell-hook and token propagation behavior.

Optional E2E

  • sandbox-operations-e2e (high): Useful confidence for real sandbox connect/logs/exec and gateway recovery flows after changing how connect shells receive proxy and gateway environment. Not merge-blocking if the focused security posture E2E passes.
  • network-policy-e2e (high): Adjacent confidence that proxy-env and shell-hook changes do not alter sandbox egress through the OpenShell gateway or policy enforcement. The PR does not change policy YAML, so this is optional rather than required.
  • hermes-onboard-security-posture-e2e (high): Shared sandbox-init trust-boundary helpers and related Hermes regression context make this a useful cross-agent check, though the primary runtime code change is in the OpenClaw start script and Dockerfile.base.

New E2E recommendations

  • live-connect-shell-env (high): Existing image-level gateway isolation tests validate bash -ic/-lc sourcing, and the security posture E2E manually sources /tmp/nemoclaw-proxy-env.sh. There is still a gap for a live OpenShell connect/SSH session proving OPENCLAW_GATEWAY_TOKEN, proxy vars, and command guards are automatically present via system-wide hooks while /sandbox/.bashrc and /sandbox/.profile remain clean.
    • Suggested test: Add a live OpenClaw connect-shell security E2E that onboards a sandbox, runs an actual openshell sandbox connect or SSH interactive shell probe, asserts proxy/gateway env auto-sourcing, verifies openclaw configure is blocked without manual sourcing, and confirms per-user rc files contain no proxy-env stanza.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: openclaw-onboard-security-posture-e2e

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
scripts/nemoclaw-start.sh (1)

1776-1855: Add the recommended entrypoint E2E matrix for this refactor.

This function runs on every sandbox boot and affects runtime trust-boundary files; please run the recommended sandbox-survival/sandbox-operations/cloud/OpenClaw pairing jobs before merge.

As per coding guidelines: "scripts/nemoclaw-start.sh: changes affect every sandbox boot and are invisible to unit tests."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/nemoclaw-start.sh` around lines 1776 - 1855, The change in
ensure_runtime_shell_env_shim touches sandbox startup and trusted user rc files;
before merging, add and run the recommended entrypoint E2E matrix
(sandbox-survival, sandbox-operations, cloud/OpenClaw pairing) to validate this
refactor—ensure the test matrix covers scenarios around _RUNTIME_SHELL_ENV_SHIM
removal, symlink/non-regular rc_file cases, and ownership/permission adjustments
done by ensure_runtime_shell_env_shim so these integration jobs pass prior to
merge.
docs/deployment/sandbox-hardening.mdx (1)

109-109: ⚡ Quick win

Use active voice in this sentence.

Consider rewriting to active voice (for example: “System-wide shell hooks read /tmp/nemoclaw-proxy-env.sh to source runtime proxy configuration.”).

As per coding guidelines: "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deployment/sandbox-hardening.mdx` at line 109, The sentence "Runtime
proxy configuration is sourced from system-wide shell hooks that read
`/tmp/nemoclaw-proxy-env.sh`." uses passive voice; rewrite it in active voice
such as "System-wide shell hooks read `/tmp/nemoclaw-proxy-env.sh` to source
runtime proxy configuration." Update the text in the docs line containing that
exact sentence so it uses the active construction (refer to the string
`/tmp/nemoclaw-proxy-env.sh` and the phrase "runtime proxy configuration") and
preserve meaning and punctuation.
Dockerfile.base (1)

129-140: Run the broader image-level E2E suite before merge.

This change affects base-image shell init behavior and should also be validated with the recommended nightly sandbox/cloud jobs, not only focused unit/e2e checks.

As per coding guidelines: "Dockerfile.base: Layer ordering, permissions, and baked config changes are only testable with a real container build."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.base` around lines 129 - 140, This change to Dockerfile.base
modifies the base image shell init files (/sandbox/.bashrc and
/sandbox/.profile) and their ownership/permissions (the RUN block creating files
and the chown/chmod commands), so before merging run the full image-level
E2E/nightly sandbox and cloud jobs (the recommended broader suite) to validate
layer ordering, permissions, and baked config behavior in a real container
build; ensure those pipeline jobs pass and attach their results to the PR, or
revert/adjust the Dockerfile.base RUN block if tests reveal issues with
initialization, ownership, or permission semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 1819-1843: The temp file for rc cleanup (tmp_file) is created
using a predictable path "${rc_file}.nemoclaw-clean.$$", which allows symlink
attacks when running as root; change the flow in the rc cleanup around the
tmp_file variable and the following awk + cat replacement to use a
safely-created temporary file from mktemp (e.g., call mktemp to allocate a
unique, non-symlinkable file before running awk, write awk output into that
file, ensure the atomic replace via mv or install instead of direct cat
>"$rc_file", and keep cleanup (rm) on error/exit); update references to tmp_file
used in the awk redirect and subsequent cat/mv so they point to the
mktemp-created file and ensure any early exits remove the temp file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7585cf58-a5a3-46bd-b900-3063a7aa030f

📥 Commits

Reviewing files that changed from the base of the PR and between 6b8bb5e and 31cfbd0.

📒 Files selected for processing (9)
  • Dockerfile.base
  • docs/deployment/sandbox-hardening.mdx
  • docs/security/best-practices.mdx
  • scripts/lib/sandbox-init.sh
  • scripts/nemoclaw-start.sh
  • test/e2e-gateway-isolation.sh
  • test/repro-2376.test.ts
  • test/sandbox-provisioning.test.ts
  • test/service-env.test.ts

Comment thread scripts/nemoclaw-start.sh
Comment on lines +1819 to +1843
    local tmp_file
    tmp_file="${rc_file}.nemoclaw-clean.$$"
    if ! awk -v shim="$_RUNTIME_SHELL_ENV_SHIM" '
      $0 == "# Source runtime proxy config" {
        if ((getline next_line) > 0) {
          if (next_line == shim || next_line ~ /\/tmp\/nemoclaw-proxy-env\.sh/) {
            next
          }
          print $0
          print next_line
          next
        }
      }
      $0 == shim { next }
      $0 ~ /\/tmp\/nemoclaw-proxy-env\.sh/ { next }
      { print }
    ' "$rc_file" >"$tmp_file"; then
      rm -f "$tmp_file"
      echo "[SECURITY] could not clean runtime env shim from $rc_file" >&2
      failed=1
      continue
    fi
    if ! cat "$tmp_file" >"$rc_file"; then
      rm -f "$tmp_file"
      echo "[SECURITY] could not replace cleaned rc file: $rc_file" >&2

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Use mktemp for rc cleanup temp files to prevent symlink clobber as root.

The temp path is predictable and created under /sandbox (sandbox-writable). A pre-planted symlink can redirect root writes during startup.

🔒 Suggested hardening
-    local tmp_file
-    tmp_file="${rc_file}.nemoclaw-clean.$$"
+    local tmp_file
+    tmp_file="$(mktemp "${rc_file}.nemoclaw-clean.XXXXXX")" || {
+      echo "[SECURITY] could not allocate temp file for rc cleanup: $rc_file" >&2
+      failed=1
+      continue
+    }
@@
-    if ! cat "$tmp_file" >"$rc_file"; then
+    if ! mv -f "$tmp_file" "$rc_file"; then
       rm -f "$tmp_file"
       echo "[SECURITY] could not replace cleaned rc file: $rc_file" >&2
       failed=1
       continue
     fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/nemoclaw-start.sh` around lines 1819 - 1843, The temp file for rc
cleanup (tmp_file) is created using a predictable path
"${rc_file}.nemoclaw-clean.$$", which allows symlink attacks when running as
root; change the flow in the rc cleanup around the tmp_file variable and the
following awk + cat replacement to use a safely-created temporary file from
mktemp (e.g., call mktemp to allocate a unique, non-symlinkable file before
running awk, write awk output into that file, ensure the atomic replace via mv
or install instead of direct cat >"$rc_file", and keep cleanup (rm) on
error/exit); update references to tmp_file used in the awk redirect and
subsequent cat/mv so they point to the mktemp-created file and ensure any early
exits remove the temp file.
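
A minimal sketch of the suggested flow, assuming a simplified filter in place of the full awk program: allocate the temp file with `mktemp`, write the cleaned content into it, and rename it over the rc file. The function name `clean_rc_file` and the demo file contents are hypothetical.

```shell
# Hypothetical helper: strip proxy-env lines from an rc file via a mktemp file.
clean_rc_file() {
  target_rc="$1"
  # mktemp creates a new file with an unpredictable suffix, so a symlink
  # planted at a guessed name is never opened.
  tmp_rc="$(mktemp "${target_rc}.nemoclaw-clean.XXXXXX")" || return 1
  # Simplified filter: keep every line that does not mention the proxy env file.
  if ! awk '$0 !~ /\/tmp\/nemoclaw-proxy-env\.sh/' "$target_rc" >"$tmp_rc"; then
    rm -f "$tmp_rc"
    return 1
  fi
  # Renaming within the same directory replaces the rc file in one step.
  if ! mv -f "$tmp_rc" "$target_rc"; then
    rm -f "$tmp_rc"
    return 1
  fi
}

work_dir="$(mktemp -d)"
rc_file="$work_dir/bashrc"
printf '%s\n' \
  'export PATH="$HOME/bin:$PATH"' \
  '. /tmp/nemoclaw-proxy-env.sh' \
  'umask 027' >"$rc_file"

clean_rc_file "$rc_file"
result="$(cat "$rc_file")"
printf '%s\n' "$result"
rm -rf "$work_dir"
```

One consequence of `mv` is that the rc file takes on the temp file's owner and mode (mktemp creates it readable and writable only by the caller), so a real implementation would need to reapply the intended ownership and permissions afterwards.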

@chengjiew chengjiew changed the title from "Fix proxy hook leakage into sandbox rc files" to "fix: remove proxy hooks from sandbox rc files" May 20, 2026
Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
@chengjiew chengjiew added the v0.0.47 Release target label May 20, 2026
@wscurran wscurran added the fix and Sandbox labels May 20, 2026
@github-actions
Contributor

github-actions Bot commented May 21, 2026

PR Review Advisor

Recommendation: blocked
Confidence: high
Analyzed HEAD: 40b32a0420f01f13add1f6e5ea9f0b2b291f5b54
Findings: 3 blocker(s), 2 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Review is based on the provided trusted metadata, current read-only file inspection, and the supplied diff; no scripts/tests/package-manager commands were executed. Linked issue text/comments were not available in trusted metadata, so linked-issue acceptance clauses could not be extracted literally. CI and E2E status may change after the captured context; this result reflects the provided head SHA and status rollup. The diff was provided as truncated-if-large; reviewed evidence focuses on the shown changed hunks and trusted context.

Workflow run

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 40b32a0420f01f13add1f6e5ea9f0b2b291f5b54
Recommendation: blocked
Confidence: high

Blocked: the PR improves rc-file trust boundaries in intent, but the startup cleanup still uses a predictable root-written temp path under /sandbox, mergeability is blocked, one review thread remains unresolved, and required E2E pass evidence is missing for the head SHA.

Gate status

  • CI: pass — 5 required status context(s) completed with no failures. Non-required contexts still pending: 0; failed: 0.
  • Mergeability: fail — mergeStateStatus=BLOCKED
  • Review threads: fail — 1 unresolved review thread(s).
  • Risky code tested: warning — Risky areas detected (installer/bootstrap shell, onboarding/host glue); test files changed, but coverage still needs semantic review.

🔴 Blockers

  • Predictable rc cleanup temp path can be symlink-clobbered during root startup (scripts/nemoclaw-start.sh:1892): ensure_runtime_shell_env_shim writes awk output to tmp_file="${rc_file}.nemoclaw-clean.$$" beside /sandbox/.bashrc or /sandbox/.profile, then copies it back with cat "$tmp_file" >"$rc_file". Because /sandbox is sandbox-writable and this startup path can run as root, a sandbox user can pre-create the predictable path as a symlink before restart and redirect root writes or otherwise interfere with cleanup in a security-critical trust-boundary path.
    • Recommendation: Allocate the cleanup file with mktemp in a safe pattern, verify it is a regular file owned as expected, write awk output to that file, and replace the rc file with mv/install rather than cat redirection. Add a regression test that pre-plants a symlink or predictable temp path and verifies startup refuses or safely handles it.
    • Evidence: Current code still contains tmp_file="${rc_file}.nemoclaw-clean.$$" followed by awk output redirection to "$tmp_file" and cat "$tmp_file" >"$rc_file". The unresolved CodeRabbit review thread at scripts/nemoclaw-start.sh line 1916 flags the same symlink-clobber risk.
  • Hard gates are not satisfied: GitHub reports mergeStateStatus=BLOCKED and the PR still has one unresolved review thread. These are hard gates for sandbox startup and host-glue changes.
    • Recommendation: Resolve the outstanding review thread, re-check mergeability for head SHA 40b32a0, and only then re-run advisory review.
    • Evidence: Trusted gate status reports mergeability fail with mergeStateStatus=BLOCKED and reviewThreads fail with 1 unresolved review thread.
  • Required E2E jobs are not evidenced as passed for the head SHA: The PR changes Dockerfile.base and sandbox startup shell trust-boundary behavior, so unit/source-shape tests cannot fully prove runtime behavior in real OpenShell sessions. The E2E Advisor requires four jobs, but the trusted status evidence does not show those required jobs passing for the analyzed head SHA.
    • Recommendation: Confirm the E2E Advisor-required jobs pass for 40b32a0: test-e2e-gateway-isolation, test-e2e-sandbox, test-non-root-sandbox-smoke, and openclaw-onboard-security-posture-e2e. Include live connect-shell coverage if possible for automatic system-hook sourcing while user rc files remain clean.
    • Evidence: E2E Advisor comment lists Required E2E: test-e2e-gateway-isolation, test-e2e-sandbox, test-non-root-sandbox-smoke, openclaw-onboard-security-posture-e2e. Trusted rollup/comment evidence does not show those required jobs passed for the head SHA; a comment only says selected E2E jobs were dispatched.
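
The symlink-clobber blocker and the regression test it asks for can be illustrated with a small sketch. It pre-plants a symlink at a guessed temp name, shows that a plain redirect follows it, and then shows that a `mktemp`-allocated file does not touch the victim. All paths, the guessed suffix, and the file contents are made up for the demo.

```shell
work_dir="$(mktemp -d)"
victim="$work_dir/victim.txt"
printf 'original\n' >"$victim"

# Attacker guesses the temp name and plants a symlink there before startup.
predictable="$work_dir/bashrc.nemoclaw-clean.12345"
ln -s "$victim" "$predictable"

# A redirect to the predictable path follows the symlink and overwrites the victim.
printf 'cleaned rc content\n' >"$predictable"
victim_after_predictable="$(cat "$victim")"

# Reset, then allocate the temp file with mktemp instead.
printf 'original\n' >"$victim"
safe_tmp="$(mktemp "$work_dir/bashrc.nemoclaw-clean.XXXXXX")"
printf 'cleaned rc content\n' >"$safe_tmp"
victim_after_mktemp="$(cat "$victim")"

echo "predictable path: victim now contains '$victim_after_predictable'"
echo "mktemp path:      victim still contains '$victim_after_mktemp'"
rm -rf "$work_dir"
```

A regression test along these lines would assert that the victim file is unchanged after startup cleanup runs.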

🟡 Warnings

🔵 Suggestions

  • None.

Acceptance coverage

  • unknown — No linked issue clauses were available in trusted metadata for PR #3842 (fix: remove proxy hooks from sandbox rc files): github.linkedIssues is an empty array. The branch name suggests a relation to issue #3835 ([Nemoclaw][All Platforms] nemoclaw still injects proxy source stanza into .bashrc/.profile, but proxy test expects no proxy entries in RC files), but linked issue text/comments were not provided as trusted acceptance clauses.
  • partial — Keep sandbox .bashrc and .profile free of NemoClaw proxy source stanzas.: Dockerfile.base now creates locked .bashrc/.profile with NemoClaw init comments and no proxy source lines; e2e and provisioning tests assert no proxy entries. However, legacy cleanup that enforces this at startup has a blocker unsafe temp-file implementation.
  • partial — Clean legacy runtime proxy shims from sandbox user rc files at startup while preserving other rc content.: scripts/nemoclaw-start.sh adds awk filtering in ensure_runtime_shell_env_shim, and test/service-env.test.ts checks shim removal while preserving PATH/umask lines. The implementation uses a predictable temp file under /sandbox and lacks a symlink/pre-planted-temp negative regression test.
  • partial — Update e2e/docs/tests so proxy env is sourced only through root-owned system hooks.: Docs and tests were updated to describe/assert system-wide hooks; Dockerfile.base uses /etc/profile.d/nemoclaw-proxy.sh and prepends /etc/bash.bashrc. Required E2E pass evidence for the current head SHA is still missing.

Security review

  • pass — Category 1: Secrets and Credentials: No new hardcoded secrets, API keys, passwords, or committed credential files were identified. OPENCLAW_GATEWAY_TOKEN propagation remains via /tmp/nemoclaw-proxy-env.sh and is documented as intentionally available to sandbox shells for local gateway access.
  • fail — Category 2: Input Validation and Data Sanitization: The rc cleanup path uses a predictable temp filename under a sandbox-writable path and root redirection/copy behavior. This is a filesystem trust-boundary validation failure that can enable symlink clobber or tampering during startup.
  • pass — Category 3: Authentication and Authorization: No new endpoints or authorization checks are introduced. Gateway token exposure semantics are not expanded beyond the existing model of exporting the token to interactive sandbox shells through the runtime env file.
  • pass — Category 4: Dependencies and Third-Party Libraries: No new dependencies are added. The Docker base image remains pinned by digest, apt packages remain version-pinned in the shown diff, and gosu checksum verification is unchanged.
  • warning — Category 5: Error Handling and Logging: The cleanup function logs security errors for refusal/failure paths and does not appear to leak secrets, but the unsafe temp-file path can be exploited before those errors are detected.
  • pass — Category 6: Cryptography and Data Protection: Not applicable — no cryptographic algorithms, key handling, or data-at-rest/in-transit protection mechanisms are changed.
  • warning — Category 7: Configuration and Security Headers: Moving proxy sourcing from per-user rc files to root-owned system hooks is directionally positive, and the hooks are chmod 444. However, the legacy rc cleanup path is part of configuration hardening and currently uses an unsafe temp-file replacement pattern.
  • warning — Category 8: Security Testing: The PR adds/updates tests for clean rc files, system-wide hooks, and legacy shim removal. Missing coverage remains for the symlink-clobber path, and required real E2E jobs for live OpenShell/session behavior are not evidenced as passed for the current head SHA.
  • fail — Category 9: Holistic Security Posture: The intended posture improvement of removing writable-home rc proxy trust is undermined by adding a root startup file-rewrite path with predictable temp files under /sandbox. This is a sandbox escape/host-glue class risk in a security-critical lifecycle path.

Test / E2E status

  • Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: Dockerfile.base, docs/deployment/sandbox-hardening.mdx, docs/security/best-practices.mdx, scripts/lib/sandbox-init.sh, scripts/nemoclaw-start.sh.
  • E2E Advisor: missing
  • Required E2E jobs: test-e2e-gateway-isolation, test-e2e-sandbox, test-non-root-sandbox-smoke, openclaw-onboard-security-posture-e2e
  • Missing for analyzed SHA: test-e2e-gateway-isolation, test-e2e-sandbox, test-non-root-sandbox-smoke, openclaw-onboard-security-posture-e2e

✅ What looks good

  • The design direction moves proxy sourcing away from user rc files toward root-owned system-wide hooks, reducing reliance on per-user startup files once the cleanup bug is fixed.
  • Dockerfile.base keeps the base image pinned by digest and retains pinned apt package versions in the shown diff.
  • Tests were updated to assert clean rc files, system-wide hook installation, and preservation of non-proxy rc content during legacy shim cleanup.
  • The prior CodeQL temporary-file comments in test/service-env.test.ts appear addressed by switching the rc shim test to mkdtempSync under a dedicated temporary directory.

Review completeness

  • Review is based on the provided trusted metadata, current read-only file inspection, and the supplied diff; no scripts/tests/package-manager commands were executed.
  • Linked issue text/comments were not available in trusted metadata, so linked-issue acceptance clauses could not be extracted literally.
  • CI and E2E status may change after the captured context; this result reflects the provided head SHA and status rollup.
  • The diff was provided as truncated-if-large; reviewed evidence focuses on the shown changed hunks and trusted context.
  • Human maintainer review required: yes

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (5)
Dockerfile.base (2)

218-254: Run the sandbox E2E set for this base-image layer change.

These changes alter the sandbox base image contract, so please run the targeted nightly E2E jobs before merge to validate runtime behavior in a real container build.

As per coding guidelines: Dockerfile.base: “This file affects the sandbox container image... only testable with a real container build” and the listed cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e workflow recommendation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.base` around lines 218 - 254, This change to Dockerfile.base (ARG
HOMEBREW_VERSION and the RUN block that creates /home/linuxbrew, chowns it,
clones Homebrew with gosu sandbox, symlinks brew, and runs gosu sandbox brew
--version) modifies the sandbox base-image contract—before merging, run the
targeted nightly sandbox E2E suites (at minimum cloud-e2e, sandbox-survival-e2e,
hermes-e2e and rebuild-openclaw-e2e) to validate runtime behavior in a real
container build and confirm the installed Homebrew prefix, symlinks, permissions
and brew functionality work as expected.

71-72: ⚡ Quick win

Make python symlink creation idempotent.

Line 72 can fail if /usr/local/bin/python already exists in a future upstream base image refresh.

Proposed patch
-    && ln -s /usr/bin/python3 /usr/local/bin/python
+    && ln -sf /usr/bin/python3 /usr/local/bin/python
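
The difference between the two forms can be sketched outside the image build. This is a minimal illustration, assuming a scratch directory (`bindir`, a hypothetical stand-in for `/usr/local/bin`) rather than the real Dockerfile layer; it is not the project's build step.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory standing in for /usr/local/bin.
bindir="$(mktemp -d)"
target=/usr/bin/python3

# The first run creates the link.
ln -sf "$target" "$bindir/python"

# A plain `ln -s` refuses to overwrite the existing link.
if ln -s "$target" "$bindir/python" 2>/dev/null; then
    echo "plain ln -s unexpectedly succeeded"
else
    echo "plain ln -s fails when the link already exists"
fi

# With -f, a rerun replaces the link instead of failing.
ln -sf "$target" "$bindir/python"

# Alternative: create the link only when nothing is there yet.
# -L also catches a dangling symlink, which -e alone would miss.
alt="$bindir/python-alt"
[ -e "$alt" ] || [ -L "$alt" ] || ln -s "$target" "$alt"
[ -e "$alt" ] || [ -L "$alt" ] || ln -s "$target" "$alt"

readlink "$bindir/python"
```

The `-f` form always repoints the link, while the conditional form leaves whatever an upstream image already provides. If the existing path could be a directory or a symlink to one, `ln` would create the link inside it; adding `-n` addresses that case, which this sketch does not cover.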
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.base` around lines 71 - 72, Make the python symlink creation
idempotent by changing the current unconditional ln invocation (ln -s
/usr/bin/python3 /usr/local/bin/python) so it won't fail if /usr/local/bin/python
already exists; update the Dockerfile command that creates the symlink to either
force-update the link or only create it when absent (e.g., use a force flag or a
conditional test around the ln call) so the build is resilient to upstream base
image changes.
scripts/nemoclaw-start.sh (1)

2187-2491: Please run the sandbox-entrypoint E2E matrix for this change.

This file controls sandbox boot lifecycle paths that unit tests won’t fully cover; run the recommended nightly E2E subset before merge.

As per coding guidelines: scripts/nemoclaw-start.sh: “Changes affect every sandbox boot and are invisible to unit tests” with the recommended sandbox-survival-e2e,sandbox-operations-e2e,cloud-e2e,openclaw-slack-pairing-e2e jobs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/nemoclaw-start.sh` around lines 2187 - 2491, The change touches
sandbox boot lifecycle code (e.g., seed_default_workspace_templates_as_sandbox,
provision_agent_workspaces, fix_openclaw_ownership and the non-root/root startup
paths) and needs integration verification; please run the sandbox-entrypoint E2E
matrix before merging by executing the recommended nightly subset:
sandbox-survival-e2e, sandbox-operations-e2e, cloud-e2e, and
openclaw-slack-pairing-e2e (ensure the tests exercise both non-root and root
paths, gateway respawn, workspace provisioning, and template seeding).
docs/security/best-practices.mdx (2)

464-468: ⚡ Quick win

Use active voice and one sentence per source line here.

Line 464 has multiple sentences on one line, and Lines 464/468 use passive phrasing (“are offered” / “is offered”).

As per coding guidelines: “Active voice required.” and “One sentence per line in source.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/security/best-practices.mdx` around lines 464 - 468, Rewrite the
paragraph to use active voice and place one sentence per source line: split the
current combined sentence into separate lines so each line contains a single
sentence, convert passive constructions like “are offered”/“is offered” into
active voice (e.g., “The menu shows DGX Spark and DGX Station managed vLLM
entries when detected.” and “The menu lists an already-running vLLM on
localhost:8000 when detected.”), and keep the note about NEMOCLAW_EXPERIMENTAL
gating local NVIDIA NIM and generic Linux managed vLLM install/start as a single
active sentence (e.g., “Setting NEMOCLAW_EXPERIMENTAL=1 enables local NVIDIA NIM
and generic Linux managed vLLM install/start.”) while preserving the meaning of
DGX and localhost behavior.

85-85: ⚡ Quick win

Avoid colon punctuation between clauses.

Line 85 uses “Landlock layout: no.” where the colon does not introduce a list.

As per coding guidelines: “Colons should only introduce a list. Flag colons used as general punctuation between clauses.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/security/best-practices.mdx` at line 85, The phrase "Landlock layout:
no." uses a colon as general punctuation; update the text to remove the colon
and rephrase for clarity (e.g., "Landlock layout — no.", "No Landlock layout",
or "Landlock layout — none") so colons are only used to introduce lists; locate
and edit the exact fragment "Landlock layout: no." in the table row to apply the
change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 47199f63-decc-4c4a-9817-a09f4a4f21f4

📥 Commits

Reviewing files that changed from the base of the PR and between a918039 and 5734e76.

📒 Files selected for processing (4)
  • Dockerfile.base
  • docs/security/best-practices.mdx
  • scripts/nemoclaw-start.sh
  • test/sandbox-provisioning.test.ts

@jyaunches
Contributor

Dispatched the selective E2E jobs recommended by the E2E Advisor for updated head 40b32a0420f01f13add1f6e5ea9f0b2b291f5b54 after merging latest main into this branch.

Run: https://github.com/NVIDIA/NemoClaw/actions/runs/26257165731
Jobs: cloud-onboard-e2e, openclaw-onboard-security-posture-e2e

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26257165731
Target ref: 40b32a0420f01f13add1f6e5ea9f0b2b291f5b54
Workflow ref: main
Requested jobs: cloud-onboard-e2e,openclaw-onboard-security-posture-e2e
Summary: 2 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| cloud-onboard-e2e | ✅ success |
| openclaw-onboard-security-posture-e2e | ✅ success |

Labels

  • fix
  • Sandbox (issues related to the NemoClaw isolated environment based on OpenShell)
  • v0.0.49 (release target)

5 participants