fix(onboard): reject Jetson sandbox GPU passthrough by yimoj · Pull Request #3965 · NVIDIA/NemoClaw

yimoj · 2026-05-21T03:12:38Z

Summary

Jetson/Tegra hosts now stay on CPU sandbox mode for default/auto GPU behavior, and explicit sandbox GPU passthrough requests fail early with Jetson-specific guidance before gateway or sandbox creation. This prevents the forced-GPU gateway recovery loop shared by the Jetson/Tegra reports.

Related Issue

Fixes #3710
Fixes #3473

Changes

Keep Jetson/Tegra auto and unset sandbox GPU modes on the CPU path unless GPU passthrough is explicitly forced.
Reject forced Jetson/Tegra sandbox GPU passthrough before CDI, gateway reuse, gateway creation, or sandbox creation.
Replace Jetson CPU-only gateway recovery advice with CPU-mode guidance instead of destroy/recreate instructions.
Add focused tests for Jetson forced/default GPU mode and Jetson gateway recovery messaging.
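
The first two changes can be pictured as a small pure function. This is a minimal sketch, not the code in `src/lib/onboard/sandbox-gpu-mode.ts`: the `Platform` and `ModeResult` types, the `resolveMode` helper, and the message text are assumptions made for illustration. Only the `"0"`, `"1"`, and `"auto"` mode values and the Jetson behavior come from this PR.

```typescript
// Hypothetical types; the real module may shape these differently.
type Platform = "jetson" | "discrete" | "none";
type GpuMode = "0" | "1" | "auto";

interface ModeResult {
  mode: GpuMode;
  errors: string[];
}

// Placeholder text, not the actual JETSON_SANDBOX_GPU_UNSUPPORTED_MESSAGE.
const JETSON_UNSUPPORTED =
  "Sandbox GPU passthrough is not supported on Jetson/Tegra hosts.";

// `requested` is undefined when the user set nothing.
function resolveMode(platform: Platform, requested?: GpuMode): ModeResult {
  const isJetson = platform === "jetson";
  // Unset or "auto" on Jetson stays on the CPU sandbox path.
  if (isJetson && (requested === undefined || requested === "auto")) {
    return { mode: "0", errors: [] };
  }
  // An explicit "1" on Jetson keeps the mode but records an error so the
  // caller can fail before touching the gateway or sandbox.
  if (isJetson && requested === "1") {
    return { mode: "1", errors: [JETSON_UNSUPPORTED] };
  }
  // Everything else passes through unchanged.
  return { mode: requested ?? "auto", errors: [] };
}
```

Returning the errors alongside the mode, rather than throwing, lets the caller decide when to stop, which is what makes the "fail before gateway or sandbox creation" ordering possible.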

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

Targeted verification passed; broad local test hooks have unrelated pre-existing/full-suite failures documented below.

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Passed locally:

npx vitest run src/lib/onboard/sandbox-gpu-mode.test.ts src/lib/onboard/gpu-recovery.test.ts src/lib/onboard/gateway-gpu-passthrough.test.ts
npx vitest run test/onboard.test.ts src/lib/onboard/preflight.test.ts
npm run typecheck:cli
npm run build:cli
npx @biomejs/biome check src/lib/onboard.ts src/lib/onboard/gateway-gpu-passthrough.ts src/lib/onboard/gpu-recovery.ts src/lib/onboard/sandbox-gpu-mode.ts src/lib/onboard/gateway-gpu-passthrough.test.ts src/lib/onboard/gpu-recovery.test.ts src/lib/onboard/sandbox-gpu-mode.test.ts
Worktree CLI repro: NEMOCLAW_NONINTERACTIVE=1 node ./bin/nemoclaw.js onboard --fresh --non-interactive --yes --yes-i-accept-third-party-software --gpu --name codex-jetson-repro exits before gateway/sandbox creation on this non-GPU x86 host.
Jetson seam checks confirm forced GPU returns the unsupported Jetson/Tegra error and default Jetson mode stays CPU.
codex review -c sandbox_mode="danger-full-access" --uncommitted reported no actionable findings after fixes.

Known local failures:

npm test fails in test/fetch-guard-patch-regression.test.ts because the test executes a local removal of /usr/local/lib/node_modules/openclaw and hits Permission denied; this reproduces in isolation and is unrelated to Jetson onboarding.
Full-suite test/cli.test.ts runs intermittently timed out in CLI dispatch tests under load; the affected tests passed when isolated (sandbox inspection help uses native oclif usage, rejects malformed logs flags before calling OpenShell, and channels mutation dry-run paths dispatch through oclif).
npx prek run --all-files fails only at the test-cli hook for the same fetch-guard failure plus one full-suite CLI timeout; non-test hooks and plugin tests passed.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

Summary by CodeRabbit

Bug Fixes
- GPU passthrough on Jetson is now treated as unsupported and surfaces clear, early error messaging; gateway reuse aborts immediately to avoid unsafe operations.
Improvements
- Stronger sandbox-GPU validation during setup and resume with immediate failures for invalid configs.
- Jetson-specific guidance forces CPU fallback by default and provides clearer workaround instructions.
New Features
- Linux-only GPU preflight checks that validate container runtime CDI specs and print actionable remediation steps.
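
A Linux-only CDI preflight of the kind mentioned above might look like the sketch below. It is an illustration under several assumptions: `checkCdiSpecs`, its parameters, and the hint text are hypothetical, and the directories `/etc/cdi` and `/var/run/cdi` are the default spec locations from the Container Device Interface specification rather than paths confirmed by this PR.

```typescript
interface CdiCheck {
  ok: boolean;
  hints: string[];
}

// Default CDI spec directories per the CDI specification (assumed here).
const CDI_DIRS = ["/etc/cdi", "/var/run/cdi"];

// listDir is injected so the check stays testable. It should return the file
// names in a directory, or an empty array when the directory is missing.
function checkCdiSpecs(
  platform: string,
  listDir: (dir: string) => string[],
): CdiCheck {
  // Non-Linux hosts skip the check entirely.
  if (platform !== "linux") return { ok: true, hints: [] };

  // A spec is any YAML or JSON file in one of the spec directories.
  const hasSpec = CDI_DIRS.some((dir) =>
    listDir(dir).some((name) => /\.(ya?ml|json)$/.test(name)),
  );
  if (hasSpec) return { ok: true, hints: [] };

  return {
    ok: false,
    hints: [
      `No CDI spec found in ${CDI_DIRS.join(" or ")}.`,
      "Generate one, for example: sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml",
    ],
  };
}
```

In real use the `listDir` argument would wrap a filesystem call; keeping it injectable means the check can be unit tested on hosts without a GPU.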

coderabbitai · 2026-05-21T03:12:50Z

Walkthrough

GPU sandbox passthrough on Jetson is blocked: Jetson-specific messages added, sandbox GPU mode defaults to disabled on Jetson unless forced, preflight validation is centralized and checks CDI specs, and gateway reconciliation aborts early for Jetson passthrough requests.

Changes

Jetson GPU Passthrough Unsupported Blocking

Layer / File(s)	Summary
Jetson GPU unsupported messaging constants and recovery options `src/lib/onboard/sandbox-gpu-mode.ts`, `src/lib/onboard/gpu-recovery.ts`, `src/lib/onboard/gpu-recovery.test.ts`	Jetson-specific GPU sandbox passthrough error and workaround messages are exported; `GpuPassthroughRecoveryOptions` enables platform-specific recovery messaging used by `gpuPassthroughRecoveryLines` and `reportGpuPassthroughRecovery`. Tests validate Jetson-specific hints.
Sandbox GPU mode resolution with Jetson detection `src/lib/onboard/sandbox-gpu-mode.ts`, `src/lib/onboard/sandbox-gpu-mode.test.ts`	`resolveSandboxGpuMode` forces GPU mode to `"0"` when Jetson is detected and env is unset/`"auto"`, preserves explicit overrides, and `resolveSandboxGpuConfig` appends Jetson-specific errors when GPU mode is `"1"`. Tests updated for these behaviors.
GPU preflight config validation refactoring `src/lib/onboard.ts`, `src/lib/onboard/sandbox-gpu-preflight.ts`	Onboard preflight now resolves `sandboxGpuConfig` early and calls `exitOnSandboxGpuConfigErrors` immediately; CDI-spec existence and preflight validation are delegated to `sandbox-gpu-preflight.ts`. Resume preflight mirrors this pattern.
Gateway GPU reconciliation with Jetson blocking `src/lib/onboard/gateway-gpu-passthrough.ts`, `src/lib/onboard/gateway-gpu-passthrough.test.ts`, `src/lib/onboard.ts`	`reconcileGatewayGpuReuseForGpuIntent` accepts `hostGpuPlatform?: NvidiaPlatform

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

NVIDIA/NemoClaw#3670: Related gateway GPU reuse/reconciliation updates that this PR extends with host platform handling.
NVIDIA/NemoClaw#3618: Related Jetson GPU passthrough behavior and sandbox config adjustments.
NVIDIA/NemoClaw#3181: Related preflight changes around optedOutGpuPassthrough and onboarding messaging.

Suggested labels

NemoClaw CLI, fix, Platform: ARM64, v0.0.47

Suggested reviewers

ericksoa
cv

Poem

🐇 I hopped through the code with a curious zap,
Jetson said "stop" — so I drew up a map.
Mode set to zero, recovery lines neat,
Messages guide users to a safer seat.
Hooray for clarity — I munch on a carrot and nap.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title directly describes the main change: rejecting Jetson sandbox GPU passthrough. This aligns with the primary objective of ensuring Jetson hosts remain on CPU sandbox mode and preventing forced GPU passthrough.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/lib/onboard.ts (1)

1210-1232: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Collapse the duplicated sandbox-GPU preflight path.

config.errors is now validated in three places across the same flow, and the resolve/opt-out logic is duplicated between first-run and --resume. That already pushed src/lib/onboard.ts over the onboard-entrypoint budget, and it makes it easier for future Jetson behavior to drift between the two paths.

♻️ One way to centralize it

 function validateSandboxGpuPreflight(config: SandboxGpuConfig): void {
-  validateSandboxGpuConfigErrors(config);
   if (!config.sandboxGpuEnabled) return;
   if (process.platform !== "linux") return;
   ...
 }
 
+function resolveSandboxGpuPreflightState(
+  gpu: ReturnType<typeof nim.detectGpu>,
+  options: {
+    flag: SandboxGpuFlag;
+    device: string | null;
+    optedOutGpuPassthrough?: boolean;
+    noGpu?: boolean;
+    persistedGpuPassthrough?: boolean;
+    gpuRequested?: boolean;
+  },
+) {
+  const sandboxGpuConfig = resolveSandboxGpuConfig(gpu, {
+    flag: options.flag,
+    device: options.device,
+  });
+  validateSandboxGpuConfigErrors(sandboxGpuConfig);
+  return {
+    sandboxGpuConfig,
+    optedOutGpuPassthrough:
+      options.optedOutGpuPassthrough === true ||
+      options.noGpu === true ||
+      (options.gpuRequested !== true && options.persistedGpuPassthrough === false) ||
+      !sandboxGpuConfig.sandboxGpuEnabled,
+  };
+}

Also applies to: 3311-3321, 9374-9392

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 1210 - 1232, Centralize sandbox GPU
preflight validation by removing duplicated config.errors checks and duplicate
resolve/opt-out logic: make validateSandboxGpuConfigErrors the single validator
invoked once at the onboarding entrypoint (instead of calling it inside
validateSandboxGpuPreflight and at the other duplicate sites), refactor
validateSandboxGpuConfigErrors to return an error list or throw instead of
calling process.exit directly, and update validateSandboxGpuPreflight (and the
other duplicated blocks around the same flow) to only perform platform/CDI
detection and remediation messaging using the validated config; ensure
references to validateSandboxGpuPreflight and validateSandboxGpuConfigErrors are
updated so no duplicate error-handling paths remain.
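
The "throw instead of calling process.exit" part of this suggestion could be sketched as follows. `SandboxGpuConfigError`, `assertNoSandboxGpuConfigErrors`, `runPreflight`, and the reduced `SandboxGpuConfig` shape are hypothetical names for illustration; the PR's actual helpers may differ.

```typescript
// Reduced, assumed shape: only the fields this sketch needs.
interface SandboxGpuConfig {
  sandboxGpuEnabled: boolean;
  errors: string[];
}

class SandboxGpuConfigError extends Error {
  readonly details: string[];
  constructor(details: string[]) {
    super(details.join("\n"));
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "SandboxGpuConfigError";
    this.details = details;
  }
}

// Throws instead of exiting, so only the entrypoint decides how to terminate.
function assertNoSandboxGpuConfigErrors(config: SandboxGpuConfig): void {
  if (config.errors.length > 0) throw new SandboxGpuConfigError(config.errors);
}

// Entrypoint wrapper: validate once, report each error, return an exit code.
function runPreflight(
  config: SandboxGpuConfig,
  emit: (line: string) => void,
): number {
  try {
    assertNoSandboxGpuConfigErrors(config);
    return 0;
  } catch (err) {
    if (err instanceof SandboxGpuConfigError) {
      for (const line of err.details) emit(line);
      return 1;
    }
    throw err; // Unexpected failures still propagate.
  }
}
```

With this shape, the first-run and `--resume` paths can share one validator, and tests can assert on the thrown error without stubbing `process.exit`.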

🧹 Nitpick comments (1)

src/lib/onboard.ts (1)
9511-9516: Run the core onboard E2Es for this Jetson gateway-reuse handoff.

This call now decides gateway reuse with Jetson platform context threaded in, so I'd still want the core onboard flows exercised before merge, especially cloud-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e, and openshell-gateway-upgrade-e2e.

As per coding guidelines, src/lib/onboard.ts: This file contains core onboarding logic. Changes here affect the full sandbox creation and configuration flow. E2E test recommendation includes cloud-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e, and openshell-gateway-upgrade-e2e.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 9511 - 9516, The change to gateway reuse
decision now passes Jetson context into reconcileGatewayGpuReuseForGpuIntent
(call sites referencing gatewayReuseState, gpuPassthrough, GATEWAY_NAME,
recordedSandboxName/requestedSandboxName and hostGpuPlatform), so before merging
run the core onboard end-to-end suites to validate the full sandbox flows:
cloud-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e, and
openshell-gateway-upgrade-e2e; fix any failures exposed by those tests
(investigate reconcileGatewayGpuReuseForGpuIntent behavior with Jetson
gpu?.platform values and adjust logic or tests accordingly) and only merge once
all listed E2Es pass.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 92bed78d-1b8a-4b0c-8208-6fe808258d64

📥 Commits

Reviewing files that changed from the base of the PR and between dc63189 and d0a5a63.

📒 Files selected for processing (7)

src/lib/onboard.ts
src/lib/onboard/gateway-gpu-passthrough.test.ts
src/lib/onboard/gateway-gpu-passthrough.ts
src/lib/onboard/gpu-recovery.test.ts
src/lib/onboard/gpu-recovery.ts
src/lib/onboard/sandbox-gpu-mode.test.ts
src/lib/onboard/sandbox-gpu-mode.ts

coderabbitai

🧹 Nitpick comments (1)

src/lib/onboard/gpu-recovery.ts (1)

110-116: ⚡ Quick win

Skip registry lookup when Jetson is already known unsupported.

reportGpuPassthroughRecovery still evaluates loadNames() even when unsupportedPlatform === "jetson", but that branch never uses names. Short-circuiting avoids unnecessary registry IO and keeps the early-reject path tighter.

Proposed refactor

 export function reportGpuPassthroughRecovery(
   emit: (line: string) => void,
   loadNames: () => string[] = getRegisteredSandboxNamesForGpuRecovery,
   options: GpuPassthroughRecoveryOptions = {},
 ): void {
-  for (const line of gpuPassthroughRecoveryLines(loadNames(), options)) emit(line);
+  const names =
+    options.unsupportedPlatform === "jetson" ? null : loadNames();
+  for (const line of gpuPassthroughRecoveryLines(names, options)) emit(line);
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard/gpu-recovery.ts` around lines 110 - 116, Short-circuit the
registry lookup in reportGpuPassthroughRecovery: if options.unsupportedPlatform
=== "jetson" do not call loadNames() (which triggers registry IO); instead call
gpuPassthroughRecoveryLines with an empty names array (or otherwise iterate its
output) and emit those lines, otherwise call loadNames() as today. Update
reportGpuPassthroughRecovery to conditionally obtain names only when
unsupportedPlatform is not "jetson" (referencing reportGpuPassthroughRecovery,
loadNames, getRegisteredSandboxNamesForGpuRecovery, gpuPassthroughRecoveryLines,
and GpuPassthroughRecoveryOptions).
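
One way to pin this behavior down is a check that fails if the loader runs on the Jetson path. The sketch below uses simplified stand-ins: `recoveryLines`, `reportRecovery`, and the message strings are hypothetical, and only the overall shape follows the proposed refactor of `reportGpuPassthroughRecovery`.

```typescript
interface RecoveryOptions {
  unsupportedPlatform?: "jetson";
}

// Stand-in for gpuPassthroughRecoveryLines; the message text is illustrative.
function recoveryLines(names: string[], options: RecoveryOptions): string[] {
  if (options.unsupportedPlatform === "jetson") {
    return ["Sandbox GPU passthrough is unsupported on Jetson; use CPU mode."];
  }
  return names.map((name) => `Sandbox '${name}' may need GPU recovery.`);
}

function reportRecovery(
  emit: (line: string) => void,
  loadNames: () => string[],
  options: RecoveryOptions = {},
): void {
  // Skip the registry lookup when the Jetson branch will ignore the names.
  const names = options.unsupportedPlatform === "jetson" ? [] : loadNames();
  for (const line of recoveryLines(names, options)) emit(line);
}

// Usage: a counting loader makes the short-circuit observable.
function countLoaderCalls(options: RecoveryOptions): number {
  let calls = 0;
  reportRecovery(
    () => {},
    () => {
      calls += 1;
      return ["demo"];
    },
    options,
  );
  return calls;
}
```

Passing an empty array rather than `null` keeps the `names` parameter non-nullable, which is a design choice of this sketch and not necessarily what the PR does.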

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1034411c-0705-4748-911e-536e862853b2

📥 Commits

Reviewing files that changed from the base of the PR and between d0a5a63 and ace53da.

📒 Files selected for processing (8)

src/lib/onboard.ts
src/lib/onboard/gateway-gpu-passthrough.test.ts
src/lib/onboard/gateway-gpu-passthrough.ts
src/lib/onboard/gpu-recovery.test.ts
src/lib/onboard/gpu-recovery.ts
src/lib/onboard/sandbox-gpu-mode.test.ts
src/lib/onboard/sandbox-gpu-mode.ts
src/lib/onboard/sandbox-gpu-preflight.ts

coderabbitai

🧹 Nitpick comments (3)

src/lib/onboard/sandbox-gpu-mode.ts (1)
48-49: 💤 Low value

Consider clarifying whether Jetson GPU passthrough is permanently unsupported.

The comment states that GPU sandbox passthrough "does not currently work on Jetson," which might suggest this is a temporary limitation. Based on the error messages JETSON_SANDBOX_GPU_UNSUPPORTED_MESSAGE and the PR objectives, this appears to be an architectural constraint rather than a temporary gap. If Jetson GPU passthrough is permanently unsupported, consider removing "currently" to avoid setting incorrect expectations.
📝 Suggested clarification
-  // GPU sandbox passthrough does not currently work on Jetson; keep auto/default
-  // behavior on the CPU sandbox path unless the user explicitly forces GPU.
+  // GPU sandbox passthrough is not supported on Jetson/Tegra; keep auto/default
+  // behavior on the CPU sandbox path unless the user explicitly forces GPU.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard/sandbox-gpu-mode.ts` around lines 48 - 49, Update the comment
in sandbox-gpu-mode.ts to remove the word "currently" and state that Jetson GPU
sandbox passthrough is unsupported (architectural constraint) rather than a
temporary limitation; also ensure any related messaging/constants like
JETSON_SANDBOX_GPU_UNSUPPORTED_MESSAGE reflect this permanent status so the
comment and messages are consistent with the code's behavior.
src/lib/onboard/sandbox-gpu-mode.test.ts (1)
110-114: 💤 Low value

Consider checking both error messages for consistency.

The test for jetsonFlagEnable verifies the unsupported message but not the workaround message, while the jetsonExplicitEnable test above checks both. For consistency and completeness, consider asserting both JETSON_SANDBOX_GPU_UNSUPPORTED_MESSAGE and JETSON_SANDBOX_GPU_WORKAROUND_MESSAGE are present.
✅ Suggested assertion addition
 const jetsonFlagEnable = resolveSandboxGpuConfig(jetson, { flag: "enable", env: {} });
 expect(jetsonFlagEnable.mode).toBe("1");
 expect(jetsonFlagEnable.errors.join("\n")).toContain(
   JETSON_SANDBOX_GPU_UNSUPPORTED_MESSAGE,
 );
+expect(jetsonFlagEnable.errors.join("\n")).toContain(
+  JETSON_SANDBOX_GPU_WORKAROUND_MESSAGE,
+);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard/sandbox-gpu-mode.test.ts` around lines 110 - 114, The test
for jetsonFlagEnable should assert both JETSON_SANDBOX_GPU_UNSUPPORTED_MESSAGE
and JETSON_SANDBOX_GPU_WORKAROUND_MESSAGE are present in the errors like the
jetsonExplicitEnable test does; update the assertion for
resolveSandboxGpuConfig(...) stored in jetsonFlagEnable to check that
jetsonFlagEnable.errors.join("\n") contains both
JETSON_SANDBOX_GPU_UNSUPPORTED_MESSAGE and JETSON_SANDBOX_GPU_WORKAROUND_MESSAGE
so the two tests are consistent.
src/lib/onboard/gateway-gpu-passthrough.ts (1)
144-147: Run targeted onboarding E2E coverage for this early-exit path.

Given this changes control flow before gateway inspection/cleanup, I’d run at least cloud-e2e, sandbox-operations-e2e, and channels-stop-start-e2e from the recommended nightly matrix on this branch.

As per coding guidelines, “E2E test recommendation: cloud-e2e … sandbox-operations-e2e … channels-stop-start-e2e …”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard/gateway-gpu-passthrough.ts` around lines 144 - 147, This
early-exit path (when gpuPassthrough && hostGpuPlatform === "jetson" in
gateway-gpu-passthrough.ts) bypasses gateway inspection/cleanup, so add targeted
E2E coverage: create/enable tests in the cloud-e2e, sandbox-operations-e2e, and
channels-stop-start-e2e suites that trigger gpuPassthrough=true and
hostGpuPlatform="jetson" (via env/fixture) and assert that
reportGpuPassthroughRecovery is invoked and the process exits as expected, and
also verify whether gateway inspection/cleanup behavior is executed or
intentionally skipped; ensure tests fail if the early exit accidentally skips
required cleanup.
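
For reference, the early-exit condition under discussion can be modeled as a guard that runs before any gateway work. This is a sketch under stated assumptions: `reconcileGatewayReuse`, its injected callbacks, and the return values are hypothetical and do not mirror the real `reconcileGatewayGpuReuseForGpuIntent` signature.

```typescript
type HostGpuPlatform = "jetson" | "other" | undefined;

interface ReconcileDeps {
  inspectGateway: () => void; // stand-in for gateway inspection/cleanup
  reportRecovery: () => void; // stand-in for reportGpuPassthroughRecovery
}

type ReconcileOutcome = "rejected" | "reconciled";

function reconcileGatewayReuse(
  gpuPassthrough: boolean,
  hostGpuPlatform: HostGpuPlatform,
  deps: ReconcileDeps,
): ReconcileOutcome {
  // Jetson plus forced passthrough: report guidance and stop before the
  // gateway is inspected or modified.
  if (gpuPassthrough && hostGpuPlatform === "jetson") {
    deps.reportRecovery();
    return "rejected";
  }
  // All other combinations continue to the normal reconciliation path.
  deps.inspectGateway();
  return "reconciled";
}
```

Injecting the two callbacks lets a unit test assert that the Jetson path never reaches gateway inspection, which covers the same control flow the E2E recommendation is concerned with at much lower cost.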

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: acaca63e-a1c5-4033-bd27-c5ee19a4bdd1

📥 Commits

Reviewing files that changed from the base of the PR and between ace53da and 361578c.

📒 Files selected for processing (8)

src/lib/onboard.ts
src/lib/onboard/gateway-gpu-passthrough.test.ts
src/lib/onboard/gateway-gpu-passthrough.ts
src/lib/onboard/gpu-recovery.test.ts
src/lib/onboard/gpu-recovery.ts
src/lib/onboard/sandbox-gpu-mode.test.ts
src/lib/onboard/sandbox-gpu-mode.ts
src/lib/onboard/sandbox-gpu-preflight.ts

Jetson and Tegra hosts cannot use the OpenShell sandbox GPU passthrough path, so keep auto/default mode on CPU and fail forced GPU requests before gateway or sandbox creation. Recovery hints now direct Jetson users to CPU mode instead of destroy/recreate loops.

Fixes NVIDIA#3710
Fixes NVIDIA#3473

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

coderabbitai

🧹 Nitpick comments (1)

src/lib/onboard.ts (1)
3283-3292: Run the recommended onboard E2Es for this path.

This touches core preflight and gateway-reuse behavior in src/lib/onboard.ts, so I'd run the targeted onboard E2Es before merge—at least cloud-e2e, sandbox-operations-e2e, and openshell-gateway-upgrade-e2e.

As per coding guidelines "src/lib/onboard.ts: This file contains core onboarding logic. Changes here affect the full sandbox creation and configuration flow."

Also applies to: 9346-9364, 9483-9497
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard.ts` around lines 3283 - 3292, This change touches core
onboarding/preflight logic (functions nim.detectGpu, resolveSandboxGpuConfig,
exitOnSandboxGpuConfigErrors and the optedOutGpuPassthrough condition in
src/lib/onboard.ts) so before merging run the targeted onboard end-to-end
suites: cloud-e2e, sandbox-operations-e2e, and openshell-gateway-upgrade-e2e
(also verify the other modified regions around the file referenced in the
review). Fix any failures uncovered by these E2Es (adjust config handling,
flags, or gating logic in the referenced functions) and re-run until all three
suites pass.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f50e3734-0711-474c-b14a-f330b0c1f23e

📥 Commits

Reviewing files that changed from the base of the PR and between 361578c and 480d048.

📒 Files selected for processing (8)

src/lib/onboard.ts
src/lib/onboard/gateway-gpu-passthrough.test.ts
src/lib/onboard/gateway-gpu-passthrough.ts
src/lib/onboard/gpu-recovery.test.ts
src/lib/onboard/gpu-recovery.ts
src/lib/onboard/sandbox-gpu-mode.test.ts
src/lib/onboard/sandbox-gpu-mode.ts
src/lib/onboard/sandbox-gpu-preflight.ts

yimoj self-assigned this May 21, 2026

coderabbitai Bot reviewed May 21, 2026

yimoj force-pushed the fix/3710-jetson-gpu-passthrough branch from d0a5a63 to ace53da Compare May 21, 2026 03:18

coderabbitai Bot reviewed May 21, 2026

yimoj force-pushed the fix/3710-jetson-gpu-passthrough branch from ace53da to 361578c Compare May 21, 2026 03:24

coderabbitai Bot reviewed May 21, 2026

yimoj force-pushed the fix/3710-jetson-gpu-passthrough branch from 361578c to 480d048 Compare May 21, 2026 03:32

coderabbitai Bot reviewed May 21, 2026

yimoj added the v0.0.48 Release target label May 21, 2026

This was referenced May 21, 2026

fix(onboard): reject host.docker.internal inference URLs #3804

Open

fix(onboard): detect Windows-host Ollama via process probe #3969

Open

cv added v0.0.49 Release target and removed v0.0.48 Release target labels May 21, 2026

github-actions Bot mentioned this pull request May 21, 2026

fix(onboard): suppress 'No active forward found' from best-effort forward stop #3997

Open

yimoj wants to merge 1 commit into NVIDIA:main from yimoj:fix/3710-jetson-gpu-passthrough
