fix: remove proxy hooks from sandbox rc files#3842
Conversation
Signed-off-by: Test User <test@example.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughThe PR moves runtime proxy env sourcing out of per-user rc files: base images pre-create clean rc files, startup removes legacy proxy shims, and system-wide shell hooks read /tmp/nemoclaw-proxy-env.sh to populate interactive shells. ChangesProxy sourcing architecture refactor: rc files to system-wide hooks
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
|
🌿 Preview your docs: https://nvidia-preview-pr-3842.docs.buildwithfern.com/nemoclaw |
E2E Advisor RecommendationRequired E2E: Dispatch hint: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
scripts/nemoclaw-start.sh (1)
1776-1855: Add the recommended entrypoint E2E matrix for this refactor.This function runs on every sandbox boot and affects runtime trust-boundary files; please run the recommended sandbox-survival/sandbox-operations/cloud/OpenClaw pairing jobs before merge.
As per coding guidelines: "
scripts/nemoclaw-start.sh: changes affect every sandbox boot and are invisible to unit tests."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/nemoclaw-start.sh` around lines 1776 - 1855, The change in ensure_runtime_shell_env_shim touches sandbox startup and trusted user rc files; before merging, add and run the recommended entrypoint E2E matrix (sandbox-survival, sandbox-operations, cloud/OpenClaw pairing) to validate this refactor—ensure the test matrix covers scenarios around _RUNTIME_SHELL_ENV_SHIM removal, symlink/non-regular rc_file cases, and ownership/permission adjustments done by ensure_runtime_shell_env_shim so these integration jobs pass prior to merge.docs/deployment/sandbox-hardening.mdx (1)
109-109: ⚡ Quick winUse active voice in this sentence.
Consider rewriting to active voice (for example: “System-wide shell hooks read
/tmp/nemoclaw-proxy-env.shto source runtime proxy configuration.”).As per coding guidelines: "Active voice required. Flag passive constructions."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/deployment/sandbox-hardening.mdx` at line 109, The sentence "Runtime proxy configuration is sourced from system-wide shell hooks that read `/tmp/nemoclaw-proxy-env.sh`." uses passive voice; rewrite it in active voice such as "System-wide shell hooks read `/tmp/nemoclaw-proxy-env.sh` to source runtime proxy configuration." Update the text in the docs line containing that exact sentence so it uses the active construction (refer to the string `/tmp/nemoclaw-proxy-env.sh` and the phrase "runtime proxy configuration") and preserve meaning and punctuation.Dockerfile.base (1)
129-140: Run the broader image-level E2E suite before merge.This change affects base-image shell init behavior and should also be validated with the recommended nightly sandbox/cloud jobs, not only focused unit/e2e checks.
As per coding guidelines: "
Dockerfile.base: Layer ordering, permissions, and baked config changes are only testable with a real container build."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Dockerfile.base` around lines 129 - 140, This change to Dockerfile.base modifies the base image shell init files (/sandbox/.bashrc and /sandbox/.profile) and their ownership/permissions (the RUN block creating files and the chown/chmod commands), so before merging run the full image-level E2E/nightly sandbox and cloud jobs (the recommended broader suite) to validate layer ordering, permissions, and baked config behavior in a real container build; ensure those pipeline jobs pass and attach their results to the PR, or revert/adjust the Dockerfile.base RUN block if tests reveal issues with initialization, ownership, or permission semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 1819-1843: The temp file for rc cleanup (tmp_file) is created
using a predictable path "${rc_file}.nemoclaw-clean.$$", which allows symlink
attacks when running as root; change the flow in the rc cleanup around the
tmp_file variable and the following awk + cat replacement to use a
safely-created temporary file from mktemp (e.g., call mktemp to allocate a
unique, non-symlinkable file before running awk, write awk output into that
file, ensure the atomic replace via mv or install instead of direct cat
>"$rc_file", and keep cleanup (rm) on error/exit); update references to tmp_file
used in the awk redirect and subsequent cat/mv so they point to the
mktemp-created file and ensure any early exits remove the temp file.
---
Nitpick comments:
In `@Dockerfile.base`:
- Around line 129-140: This change to Dockerfile.base modifies the base image
shell init files (/sandbox/.bashrc and /sandbox/.profile) and their
ownership/permissions (the RUN block creating files and the chown/chmod
commands), so before merging run the full image-level E2E/nightly sandbox and
cloud jobs (the recommended broader suite) to validate layer ordering,
permissions, and baked config behavior in a real container build; ensure those
pipeline jobs pass and attach their results to the PR, or revert/adjust the
Dockerfile.base RUN block if tests reveal issues with initialization, ownership,
or permission semantics.
In `@docs/deployment/sandbox-hardening.mdx`:
- Line 109: The sentence "Runtime proxy configuration is sourced from
system-wide shell hooks that read `/tmp/nemoclaw-proxy-env.sh`." uses passive
voice; rewrite it in active voice such as "System-wide shell hooks read
`/tmp/nemoclaw-proxy-env.sh` to source runtime proxy configuration." Update the
text in the docs line containing that exact sentence so it uses the active
construction (refer to the string `/tmp/nemoclaw-proxy-env.sh` and the phrase
"runtime proxy configuration") and preserve meaning and punctuation.
In `@scripts/nemoclaw-start.sh`:
- Around line 1776-1855: The change in ensure_runtime_shell_env_shim touches
sandbox startup and trusted user rc files; before merging, add and run the
recommended entrypoint E2E matrix (sandbox-survival, sandbox-operations,
cloud/OpenClaw pairing) to validate this refactor—ensure the test matrix covers
scenarios around _RUNTIME_SHELL_ENV_SHIM removal, symlink/non-regular rc_file
cases, and ownership/permission adjustments done by
ensure_runtime_shell_env_shim so these integration jobs pass prior to merge.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7585cf58-a5a3-46bd-b900-3063a7aa030f
📒 Files selected for processing (9)
Dockerfile.basedocs/deployment/sandbox-hardening.mdxdocs/security/best-practices.mdxscripts/lib/sandbox-init.shscripts/nemoclaw-start.shtest/e2e-gateway-isolation.shtest/repro-2376.test.tstest/sandbox-provisioning.test.tstest/service-env.test.ts
| local tmp_file | ||
| tmp_file="${rc_file}.nemoclaw-clean.$$" | ||
| if ! awk -v shim="$_RUNTIME_SHELL_ENV_SHIM" ' | ||
| $0 == "# Source runtime proxy config" { | ||
| if ((getline next_line) > 0) { | ||
| if (next_line == shim || next_line ~ /\/tmp\/nemoclaw-proxy-env\.sh/) { | ||
| next | ||
| } | ||
| print $0 | ||
| print next_line | ||
| next | ||
| } | ||
| } | ||
| $0 == shim { next } | ||
| $0 ~ /\/tmp\/nemoclaw-proxy-env\.sh/ { next } | ||
| { print } | ||
| ' "$rc_file" >"$tmp_file"; then | ||
| rm -f "$tmp_file" | ||
| echo "[SECURITY] could not clean runtime env shim from $rc_file" >&2 | ||
| failed=1 | ||
| continue | ||
| fi | ||
| if ! cat "$tmp_file" >"$rc_file"; then | ||
| rm -f "$tmp_file" | ||
| echo "[SECURITY] could not replace cleaned rc file: $rc_file" >&2 |
There was a problem hiding this comment.
Use mktemp for rc cleanup temp files to prevent symlink clobber as root.
The temp path is predictable and created under /sandbox (sandbox-writable). A pre-planted symlink can redirect root writes during startup.
🔒 Suggested hardening
- local tmp_file
- tmp_file="${rc_file}.nemoclaw-clean.$$"
+ local tmp_file
+ tmp_file="$(mktemp "${rc_file}.nemoclaw-clean.XXXXXX")" || {
+ echo "[SECURITY] could not allocate temp file for rc cleanup: $rc_file" >&2
+ failed=1
+ continue
+ }
@@
- if ! cat "$tmp_file" >"$rc_file"; then
+ if ! mv -f "$tmp_file" "$rc_file"; then
rm -f "$tmp_file"
echo "[SECURITY] could not replace cleaned rc file: $rc_file" >&2
failed=1
continue
fi🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/nemoclaw-start.sh` around lines 1819 - 1843, The temp file for rc
cleanup (tmp_file) is created using a predictable path
"${rc_file}.nemoclaw-clean.$$", which allows symlink attacks when running as
root; change the flow in the rc cleanup around the tmp_file variable and the
following awk + cat replacement to use a safely-created temporary file from
mktemp (e.g., call mktemp to allocate a unique, non-symlinkable file before
running awk, write awk output into that file, ensure the atomic replace via mv
or install instead of direct cat >"$rc_file", and keep cleanup (rm) on
error/exit); update references to tmp_file used in the awk redirect and
subsequent cat/mv so they point to the mktemp-created file and ensure any early
exits remove the temp file.
Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
PR Review AdvisorRecommendation: blocked This is an automated advisory review. A human maintainer must make the final merge decision. Limitations: Review is based on the provided trusted metadata, current read-only file inspection, and the supplied diff; no scripts/tests/package-manager commands were executed.; Linked issue text/comments were not available in trusted metadata, so linked-issue acceptance clauses could not be extracted literally.; CI and E2E status may change after the captured context; this result reflects the provided head SHA and status rollup.; The diff was provided as truncated-if-large; reviewed evidence focuses on the shown changed hunks and trusted context. Full advisor summaryPR Review AdvisorBase: Blocked: the PR improves rc-file trust boundaries in intent, but the startup cleanup still uses a predictable root-written temp path under /sandbox, mergeability is blocked, one review thread remains unresolved, and required E2E pass evidence is missing for the head SHA. Gate status
🔴 Blockers
🟡 Warnings
🔵 Suggestions
Acceptance coverage
Security review
Test / E2E status
✅ What looks good
Review completeness
|
There was a problem hiding this comment.
🧹 Nitpick comments (5)
Dockerfile.base (2)
218-254: Run the sandbox E2E set for this base-image layer change.These changes alter the sandbox base image contract, so please run the targeted nightly E2E jobs before merge to validate runtime behavior in a real container build.
As per coding guidelines:
Dockerfile.base: “This file affects the sandbox container image... only testable with a real container build” and the listedcloud-e2e,sandbox-survival-e2e,hermes-e2e,rebuild-openclaw-e2eworkflow recommendation.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Dockerfile.base` around lines 218 - 254, This change to Dockerfile.base (ARG HOMEBREW_VERSION and the RUN block that creates /home/linuxbrew, chowns it, clones Homebrew with gosu sandbox, symlinks brew, and runs gosu sandbox brew --version) modifies the sandbox base-image contract—before merging, run the targeted nightly sandbox E2E suites (at minimum cloud-e2e, sandbox-survival-e2e, hermes-e2e and rebuild-openclaw-e2e) to validate runtime behavior in a real container build and confirm the installed Homebrew prefix, symlinks, permissions and brew functionality work as expected.
71-72: ⚡ Quick winMake python symlink creation idempotent.
Line 72 can fail if
/usr/local/bin/pythonalready exists in a future upstream base image refresh.Proposed patch
- && ln -s /usr/bin/python3 /usr/local/bin/python + && ln -sf /usr/bin/python3 /usr/local/bin/python🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Dockerfile.base` around lines 71 - 72, Make the python symlink creation idempotent by changing the current unconditional ln invocation (ln -s /usr/bin/python /usr/local/bin/python) so it won't fail if /usr/local/bin/python already exists; update the Dockerfile command that creates the symlink to either force-update the link or only create it when absent (e.g., use a force flag or a conditional test around the ln call) so the build is resilient to upstream base image changes.scripts/nemoclaw-start.sh (1)
2187-2491: Please run the sandbox-entrypoint E2E matrix for this change.This file controls sandbox boot lifecycle paths that unit tests won’t fully cover; run the recommended nightly E2E subset before merge.
As per coding guidelines:
scripts/nemoclaw-start.sh: “Changes affect every sandbox boot and are invisible to unit tests” with the recommendedsandbox-survival-e2e,sandbox-operations-e2e,cloud-e2e,openclaw-slack-pairing-e2ejobs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/nemoclaw-start.sh` around lines 2187 - 2491, The change touches sandbox boot lifecycle code (e.g., seed_default_workspace_templates_as_sandbox, provision_agent_workspaces, fix_openclaw_ownership and the non-root/root startup paths) and needs integration verification; please run the sandbox-entrypoint E2E matrix before merging by executing the recommended nightly subset: sandbox-survival-e2e, sandbox-operations-e2e, cloud-e2e, and openclaw-slack-pairing-e2e (ensure the tests exercise both non-root and root paths, gateway respawn, workspace provisioning, and template seeding).docs/security/best-practices.mdx (2)
464-468: ⚡ Quick winUse active voice and one sentence per source line here.
Line 464 has multiple sentences on one line, and Lines 464/468 use passive phrasing (“are offered” / “is offered”).
As per coding guidelines: “Active voice required.” and “One sentence per line in source.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/security/best-practices.mdx` around lines 464 - 468, Rewrite the paragraph to use active voice and place one sentence per source line: split the current combined sentence into separate lines so each line contains a single sentence, convert passive constructions like “are offered”/“is offered” into active voice (e.g., “The menu shows DGX Spark and DGX Station managed vLLM entries when detected.” and “The menu lists an already-running vLLM on localhost:8000 when detected.”), and keep the note about NEMOCLAW_EXPERIMENTAL gating local NVIDIA NIM and generic Linux managed vLLM install/start as a single active sentence (e.g., “Setting NEMOCLAW_EXPERIMENTAL=1 enables local NVIDIA NIM and generic Linux managed vLLM install/start.”) while preserving the meaning of DGX and localhost behavior.
85-85: ⚡ Quick winAvoid colon punctuation between clauses.
Line 85 uses “Landlock layout: no.” where the colon does not introduce a list.
As per coding guidelines: “Colons should only introduce a list. Flag colons used as general punctuation between clauses.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/security/best-practices.mdx` at line 85, The phrase "Landlock layout: no." uses a colon as general punctuation; update the text to remove the colon and rephrase for clarity (e.g., "Landlock layout — no.", "No Landlock layout", or "Landlock layout — none") so colons are only used to introduce lists; locate and edit the exact fragment "Landlock layout: no." in the table row to apply the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@Dockerfile.base`:
- Around line 218-254: This change to Dockerfile.base (ARG HOMEBREW_VERSION and
the RUN block that creates /home/linuxbrew, chowns it, clones Homebrew with gosu
sandbox, symlinks brew, and runs gosu sandbox brew --version) modifies the
sandbox base-image contract—before merging, run the targeted nightly sandbox E2E
suites (at minimum cloud-e2e, sandbox-survival-e2e, hermes-e2e and
rebuild-openclaw-e2e) to validate runtime behavior in a real container build and
confirm the installed Homebrew prefix, symlinks, permissions and brew
functionality work as expected.
- Around line 71-72: Make the python symlink creation idempotent by changing the
current unconditional ln invocation (ln -s /usr/bin/python
/usr/local/bin/python) so it won't fail if /usr/local/bin/python already exists;
update the Dockerfile command that creates the symlink to either force-update
the link or only create it when absent (e.g., use a force flag or a conditional
test around the ln call) so the build is resilient to upstream base image
changes.
In `@docs/security/best-practices.mdx`:
- Around line 464-468: Rewrite the paragraph to use active voice and place one
sentence per source line: split the current combined sentence into separate
lines so each line contains a single sentence, convert passive constructions
like “are offered”/“is offered” into active voice (e.g., “The menu shows DGX
Spark and DGX Station managed vLLM entries when detected.” and “The menu lists
an already-running vLLM on localhost:8000 when detected.”), and keep the note
about NEMOCLAW_EXPERIMENTAL gating local NVIDIA NIM and generic Linux managed
vLLM install/start as a single active sentence (e.g., “Setting
NEMOCLAW_EXPERIMENTAL=1 enables local NVIDIA NIM and generic Linux managed vLLM
install/start.”) while preserving the meaning of DGX and localhost behavior.
- Line 85: The phrase "Landlock layout: no." uses a colon as general
punctuation; update the text to remove the colon and rephrase for clarity (e.g.,
"Landlock layout — no.", "No Landlock layout", or "Landlock layout — none") so
colons are only used to introduce lists; locate and edit the exact fragment
"Landlock layout: no." in the table row to apply the change.
In `@scripts/nemoclaw-start.sh`:
- Around line 2187-2491: The change touches sandbox boot lifecycle code (e.g.,
seed_default_workspace_templates_as_sandbox, provision_agent_workspaces,
fix_openclaw_ownership and the non-root/root startup paths) and needs
integration verification; please run the sandbox-entrypoint E2E matrix before
merging by executing the recommended nightly subset: sandbox-survival-e2e,
sandbox-operations-e2e, cloud-e2e, and openclaw-slack-pairing-e2e (ensure the
tests exercise both non-root and root paths, gateway respawn, workspace
provisioning, and template seeding).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 47199f63-decc-4c4a-9817-a09f4a4f21f4
📒 Files selected for processing (4)
Dockerfile.basedocs/security/best-practices.mdxscripts/nemoclaw-start.shtest/sandbox-provisioning.test.ts
|
Dispatched the selective E2E jobs recommended by the E2E Advisor for updated head |
Selective E2E Results — ✅ All requested jobs passedRun: 26257165731
|
Summary
.bashrcand.profilefree of NemoClaw proxy source stanzas.Test Plan
npm test -- test/service-env.test.ts test/sandbox-provisioning.test.ts test/repro-2376.test.tsbash -n scripts/nemoclaw-start.sh scripts/lib/sandbox-init.sh test/e2e-gateway-isolation.shgit diff --checkaits-log-worker-6:NEMOCLAW_TEST_IMAGE=nemoclaw-production-3835 bash test/e2e-gateway-isolation.sh-> 34 passed, 0 failed.bashrc/.profileproxy checksNotes
Summary by CodeRabbit
Documentation
Refactor
Tests
Signed-off-by: Chengjie Wang chengjiew@nvidia.com