
feat(onboard): show managed vLLM by default on DGX Spark and Station #3921

Merged
ericksoa merged 5 commits into main from feat/vllm-menu-options-dgx-spark-station
May 20, 2026
Conversation

@zyang-dev
Contributor

@zyang-dev zyang-dev commented May 20, 2026

Summary

Show the managed vLLM install/start option by default on DGX Spark and DGX Station during onboarding. This keeps generic Linux NVIDIA GPU hosts behind the existing experimental gate while making supported DGX local-inference paths easier to discover.

Changes

  • Added platform-aware default visibility for managed vLLM menu entries.
  • Centralized the default managed-vLLM platform list in vllm-menu.ts.
  • Added focused tests for DGX Spark, DGX Station, and generic Linux menu behavior.
  • Kept src/lib/onboard.ts line-budget neutral.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>

Summary by CodeRabbit

  • New Features

    • Auto-detect already-running vLLM servers on localhost:8000
    • Platform-aware vLLM installation menu for DGX Spark and DGX Station
  • Improvements

    • Managed vLLM install/start options now appear by default on DGX platforms
    • Generic Linux NVIDIA GPU hosts require explicit opt-in via environment variables
  • Documentation

    • Updated guides to clarify vLLM availability and platform-specific behavior

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: da227390-44c4-49d5-8135-e1ff6ed057aa

📥 Commits

Reviewing files that changed from the base of the PR and between b1d3676 and 9bca801.

📒 Files selected for processing (10)
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md
  • .agents/skills/nemoclaw-user-configure-inference/references/inference-options.md
  • .agents/skills/nemoclaw-user-configure-security/references/best-practices.md
  • .agents/skills/nemoclaw-user-get-started/SKILL.md
  • .agents/skills/nemoclaw-user-overview/references/release-notes.md
  • .agents/skills/nemoclaw-user-reference/SKILL.md
  • .agents/skills/nemoclaw-user-reference/references/architecture.md
  • docs/about/release-notes.mdx
  • docs/security/best-practices.mdx
  • src/lib/onboard.ts
✅ Files skipped from review due to trivial changes (6)
  • .agents/skills/nemoclaw-user-reference/references/architecture.md
  • .agents/skills/nemoclaw-user-reference/SKILL.md
  • docs/about/release-notes.mdx
  • docs/security/best-practices.mdx
  • .agents/skills/nemoclaw-user-overview/references/release-notes.md
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md

📝 Walkthrough

Walkthrough

Platform-aware vLLM menu construction now gates the install-vllm entry based on detected GPU platform (DGX Spark and DGX Station by default, generic Linux NVIDIA behind experimental or env-var opt-in), in addition to the existing NEMOCLAW_PROVIDER and experimental flag controls. Documentation and examples are updated consistently across skill modules, guides, and release notes.

Changes

vLLM Platform-Aware Menu Gating

| Layer / File(s) | Summary |
| --- | --- |
| vLLM menu contract and platform gating logic (src/lib/onboard/vllm-menu.ts) | Module docs add managed platform set (spark, station) and clarify env-var/platform precedence; BuildVllmMenuOptions type extends to accept optional platform?; env-var opt-in check refactored to multi-line form; install-vllm entry condition expanded to trigger when env opt-in is set, or when vllmProfile exists and either experimental is enabled or platform is in the managed set. |
| Platform-based menu gating test coverage (src/lib/onboard/vllm-menu.test.ts) | Test import switched from compiled dist to local module; three new test cases assert install-vllm appears for DGX Spark and DGX Station when experimental is false, and is hidden for generic Linux NVIDIA when experimental is false. |
| Onboarding caller threads GPU platform (src/lib/onboard.ts) | Onboarding flow passes gpu?.platform into buildVllmMenuEntries call; adjacent inline comment adjusted for clarity. |
| Documentation and config consistency (.agents/skills/*, docs/*, ci/platform-matrix.json) | Docs across skill modules (configure-inference, get-started, configure-security), quickstart, inference guides, security best-practices, and release notes updated to reflect caveated (not experimental) status for managed vLLM; DGX Spark/Station marked as default-visible; generic Linux NVIDIA hosts require NEMOCLAW_EXPERIMENTAL=1 or NEMOCLAW_PROVIDER=install-vllm; env-var usage patterns and non-interactive command examples revised consistently. Reference headings refined ("Architecture Details"). CI provider matrix entry status changed from experimental to caveated. |
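
The gating rule summarized in the table can be sketched as a single predicate. This is a minimal illustration under stated assumptions, not the code in src/lib/onboard/vllm-menu.ts: the option names `userChoseManagedVllm`, `hasVllmProfile`, the `"linux"` platform member, and the function name `shouldOfferManagedVllm` are hypothetical. Only the set name `MANAGED_VLLM_DEFAULT_PLATFORMS`, its two members, and the order of the checks come from the review text.

```typescript
// Hypothetical stand-in for the project's NvidiaPlatform type.
type NvidiaPlatform = "spark" | "station" | "linux";

// Hypothetical option shape; the real BuildVllmMenuOptions may differ.
interface ManagedVllmGateOptions {
  /** True when the user opted in with NEMOCLAW_PROVIDER=install-vllm. */
  userChoseManagedVllm: boolean;
  /** True when a vLLM profile matched the detected hardware. */
  hasVllmProfile: boolean;
  /** True when NEMOCLAW_EXPERIMENTAL=1 is set. */
  experimental: boolean;
  /** Detected GPU platform; undefined when no platform was detected. */
  platform?: NvidiaPlatform;
}

// Platforms that show managed vLLM without any opt-in (named in this PR).
const MANAGED_VLLM_DEFAULT_PLATFORMS: ReadonlySet<NvidiaPlatform> =
  new Set<NvidiaPlatform>(["spark", "station"]);

function shouldOfferManagedVllm(opts: ManagedVllmGateOptions): boolean {
  // 1. An explicit env-var opt-in always surfaces the entry.
  if (opts.userChoseManagedVllm) return true;
  // 2. Otherwise a matching vLLM profile is required.
  if (!opts.hasVllmProfile) return false;
  // 3. The experimental flag keeps working on every platform.
  if (opts.experimental) return true;
  // 4. DGX Spark and DGX Station are visible by default.
  return opts.platform !== undefined && MANAGED_VLLM_DEFAULT_PLATFORMS.has(opts.platform);
}
```

Keeping the allowlist in one exported set means a future platform could be promoted by editing a single line, which appears to be the intent behind centralizing the list.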

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#3790: Main PR extends the buildVllmMenuEntries and vLLM menu logic by adding platform-aware gating and updating entry selection behavior.
  • NVIDIA/NemoClaw#3417: Both PRs update vLLM onboarding/provider menu construction logic, including conditions for showing running vLLM and gating install-vllm entries via env and provider flags.

Suggested labels

Local Models, enhancement: inference, documentation

Suggested reviewers

  • ericksoa
  • jyaunches
  • cv

Poem

🐰 A platform-aware vLLM feast,
DGX Spark and Station blessed,
while generic Linux stands test—
with opt-in flags at the behest.
Documentation flows, clean and bright,
threading platform truth through the night! 🌙

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main feature: showing managed vLLM install/start by default on DGX Spark and Station platforms during onboarding. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/vllm-menu-options-dgx-spark-station

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@zyang-dev zyang-dev self-assigned this May 20, 2026
@github-actions
Contributor

github-actions Bot commented May 20, 2026

PR Review Advisor

Recommendation: blocked
Confidence: medium
Analyzed HEAD: 9bca8016220357763751afe2c8ef537444c27c90
Findings: 1 blocker(s), 2 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations:

  • Review used the provided trusted deterministic PR context and supplied diff; no scripts, tests, package-manager commands, workflow dispatches, or network actions were executed.
  • CI, mergeability, review decision, and E2E status may change after this advisory result; this result reflects the supplied rollup for head SHA 9bca801.
  • No linked issue acceptance clauses were available because github.linkedIssues is empty.
  • CodeRabbit/review thread resolution state was not fully available; the rollup still shows CodeRabbit PENDING.
  • The diff was truncated to the supplied content; files were reviewed based on provided changed-file list, deterministic context, and visible diff excerpts.

Workflow run

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 9bca8016220357763751afe2c8ef537444c27c90
Recommendation: blocked
Confidence: medium

The code change is small and unit-tested, but the PR is blocked by hard gates: mergeStateStatus=BLOCKED, reviewDecision=CHANGES_REQUESTED, many pending status contexts, and missing required onboarding E2E evidence for head SHA 9bca801.

Gate status

  • CI: pending — Status rollup for head SHA 9bca801 shows 14 non-completed/pending contexts, including get-pr-info, commit-lint, cli-parity, E2E recommendation, macos-e2e, PR review advisor, CodeQL jobs, unit-vitest-linux, checks, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, and CodeRabbit.
  • Mergeability: fail — GitHub GraphQL reports mergeStateStatus=BLOCKED and reviewDecision=CHANGES_REQUESTED for PR feat(onboard): show managed vLLM by default on DGX Spark and Station #3921.
  • Review threads: unknown — No review thread state was available; GraphQL reviewThreads.nodes is empty, but CodeRabbit is still PENDING in the status rollup.
  • Risky code tested: warning — Risky areas detected: credentials/inference/network and onboarding/host glue. Unit tests were added for vLLM menu helper behavior, but runtime onboarding and managed-vLLM install/start discoverability require E2E confirmation.

🔴 Blockers

  • Hard gates are not satisfied for this head SHA: The PR is not currently merge-ready because GitHub reports the PR as blocked with changes requested, and required status contexts are still queued, in progress, or pending for the supplied head SHA.
    • Recommendation: Wait for required reviews and all required CI/status contexts to complete successfully for head SHA 9bca801 before considering merge.
    • Evidence: mergeStateStatus=BLOCKED; reviewDecision=CHANGES_REQUESTED; status rollup includes 14 pending/queued/in-progress contexts for 9bca801.

🟡 Warnings

🔵 Suggestions

  • None.

Acceptance coverage

  • unknown — No linked issues were found for this pull request.: github.linkedIssues is empty, so there are no linked issue acceptance clauses or linked issue comments to map literally.
  • met — Show the managed vLLM install/start option by default on DGX Spark and DGX Station during onboarding.: src/lib/onboard/vllm-menu.ts defines MANAGED_VLLM_DEFAULT_PLATFORMS as spark and station and emits install-vllm when opts.platform is in that allowlist. src/lib/onboard.ts passes platform: gpu?.platform into buildVllmMenuEntries. Tests cover DGX Spark install and DGX Station cached-image start behavior.
  • met — This keeps generic Linux NVIDIA GPU hosts behind the existing experimental gate while making supported DGX local-inference paths easier to discover.: src/lib/onboard/vllm-menu.test.ts adds a generic Linux negative test with platform linux, experimental false, and a matching profile, expecting no entries. The implementation only allows default managed vLLM for spark/station unless the user opts in or EXPERIMENTAL is true.
  • met — Added platform-aware default visibility for managed vLLM menu entries.: BuildVllmMenuOptions gains optional platform; buildVllmMenuEntries checks opts.platform against MANAGED_VLLM_DEFAULT_PLATFORMS; src/lib/onboard.ts supplies gpu?.platform.
  • met — Centralized the default managed-vLLM platform list in vllm-menu.ts.: src/lib/onboard/vllm-menu.ts adds MANAGED_VLLM_DEFAULT_PLATFORMS = new Set(["spark", "station"]).
  • met — Added focused tests for DGX Spark, DGX Station, and generic Linux menu behavior.: src/lib/onboard/vllm-menu.test.ts adds tests named "returns the install entry by default for DGX Spark", "returns the start entry by default for DGX Station when the image is already cached", and "keeps generic Linux managed vLLM behind EXPERIMENTAL".
  • met — Kept src/lib/onboard.ts line-budget neutral.: Diff stat shows src/lib/onboard.ts changed 4 lines with 2 insertions and 2 deletions; monolithDeltas reports headLines and baseLines both 10332 with delta 0.
  • unknown — npx prek run --all-files passes: The PR body checkbox is checked, but no completed trusted CI evidence for this exact command was available in the supplied rollup for head SHA 9bca801.
  • unknown — npm test passes: The PR body checkbox is checked, but unit-vitest-linux is QUEUED in the supplied status rollup for head SHA 9bca801.
  • met — Tests added or updated for new or changed behavior: src/lib/onboard/vllm-menu.test.ts was expanded from 119 to 162 lines and adds Spark, Station, and generic Linux gating cases.
  • met — No secrets, API keys, or credentials committed: The diff changes TypeScript menu logic/tests, docs, generated skills, and ci/platform-matrix.json; it does not add credential files, token literals, or API keys.
  • met — Docs updated for user-facing behavior changes: Docs and generated skills were updated across docs/inference/use-local-inference.mdx, docs/inference/inference-options.mdx, docs/get-started/quickstart.mdx, docs/security/best-practices.mdx, release notes, and corresponding .agents skill references to document DGX default managed vLLM and generic Linux opt-in behavior.
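
The helper tests cited in the coverage list can be pictured as a small table of cases. The sketch below is illustrative only: it uses a condensed, hypothetical stand-in (`offersManagedVllm`) rather than the project's `buildVllmMenuEntries`, assumes every case already has a matching vLLM profile and no env-var opt-in, and uses plain TypeScript instead of the Vitest suite the PR actually extends. The fourth case is an extra illustration of the experimental gate, not one of the three tests named above.

```typescript
type Platform = "spark" | "station" | "linux"; // hypothetical platform names

const DEFAULT_PLATFORMS: ReadonlySet<Platform> = new Set<Platform>(["spark", "station"]);

// Condensed stand-in for the gate: assumes a matching profile, no env opt-in.
const offersManagedVllm = (platform: Platform, experimental: boolean): boolean =>
  experimental || DEFAULT_PLATFORMS.has(platform);

interface GateCase {
  name: string;
  platform: Platform;
  experimental: boolean;
  expected: boolean;
}

const gateCases: GateCase[] = [
  { name: "DGX Spark shows managed vLLM by default", platform: "spark", experimental: false, expected: true },
  { name: "DGX Station shows managed vLLM by default", platform: "station", experimental: false, expected: true },
  { name: "generic Linux stays hidden without opt-in", platform: "linux", experimental: false, expected: false },
  { name: "generic Linux appears with the experimental flag", platform: "linux", experimental: true, expected: true },
];

// Returns the names of the cases whose actual result differs from the expectation.
function failingGateCases(cases: GateCase[]): string[] {
  return cases
    .filter((c) => offersManagedVllm(c.platform, c.experimental) !== c.expected)
    .map((c) => c.name);
}

const failedNames = failingGateCases(gateCases);
console.log(failedNames.length === 0 ? "all gate cases pass" : `failing: ${failedNames.join(", ")}`);
```

Including the negative generic Linux case matters as much as the positive ones, since it is the case that would catch an accidental widening of the allowlist.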

Security review

  • pass — Secrets and Credentials: No hardcoded secrets, API keys, passwords, tokens, credential JSON, PEM/key files, or connection strings were added. Existing documentation references HF_TOKEN and provider API-key environment variables as placeholders only.
  • pass — Input Validation and Data Sanitization: The changed helper trims and lowercases NEMOCLAW_PROVIDER and compares it to the exact literal install-vllm. Platform is typed as NvidiaPlatform and checked against a fixed allowlist Set containing spark and station. No command construction, URL parsing, path traversal surface, unsafe deserialization, eval, or HTML rendering is introduced.
  • pass — Authentication and Authorization: Not applicable — the diff does not add or modify HTTP endpoints, auth checks, authorization decisions, token validation, scope handling, or resource ownership logic.
  • pass — Dependencies and Third-Party Libraries: No new dependencies, dependency version changes, registry changes, installer downloads, or package-manager configuration changes are introduced.
  • pass — Error Handling and Logging: The helper may log a static note when NEMOCLAW_PROVIDER=install-vllm is overridden by an already-running vLLM instance. It does not log secret values, token contents, PII, or sensitive paths.
  • pass — Cryptography and Data Protection: Not applicable — no cryptographic operations, key management, hashing, encryption, TLS, or data-protection mechanisms are changed.
  • pass — Configuration and Security Headers: No Dockerfile, workflow trusted-code boundary, CORS/CSP/security header, port exposure, container user, or production config changes are made. The platform matrix/docs reclassify managed vLLM as caveated and the code only changes menu discoverability for allowlisted DGX platforms.
  • warning — Security Testing: Unit tests cover positive DGX Spark/Station default menu behavior and the negative generic Linux gate. However, this is a NemoClaw onboarding/inference/runtime path that can lead to pulling and starting managed vLLM containers, and required E2E evidence for the current head SHA is missing.
  • warning — Holistic Security Posture: The change intentionally broadens default availability of a managed local inference installer/start path on DGX Spark/Station. The implementation is small and allowlisted, but onboarding/install/network/credential paths are high risk in NemoClaw and should be validated with required E2E before merge.
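
The input-validation note above says the helper trims and lowercases NEMOCLAW_PROVIDER and compares it with the exact literal install-vllm. A minimal sketch of that normalization follows; the function name `isManagedVllmOptIn` and the injected `env` parameter are assumptions made for illustration, not the project's actual signature.

```typescript
// The only value that counts as an explicit managed-vLLM opt-in.
const INSTALL_VLLM = "install-vllm";

/**
 * Returns true only when NEMOCLAW_PROVIDER equals `install-vllm`
 * after trimming surrounding whitespace and lowercasing.
 * `env` is injected so the check can be exercised without touching process.env.
 */
function isManagedVllmOptIn(env: Record<string, string | undefined>): boolean {
  const raw = env["NEMOCLAW_PROVIDER"];
  if (raw === undefined) return false;
  // Exact comparison: prefixes, suffixes, and other providers never match.
  return raw.trim().toLowerCase() === INSTALL_VLLM;
}
```

A caller would typically pass `process.env`. Because the value is only ever compared with a fixed literal and never interpolated into a command or path, there is no injection surface in this step.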

Test / E2E status

  • Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: .agents/skills/nemoclaw-user-configure-inference/SKILL.md, .agents/skills/nemoclaw-user-configure-inference/references/inference-options.md, .agents/skills/nemoclaw-user-configure-security/references/best-practices.md, .agents/skills/nemoclaw-user-get-started/SKILL.md, .agents/skills/nemoclaw-user-overview/references/release-notes.md, .agents/skills/nemoclaw-user-reference/SKILL.md, .agents/skills/nemoclaw-user-reference/references/architecture.md, ci/platform-matrix.json. The TypeScript unit tests validate helper logic, but they do not prove full onboarding provider selection, detected DGX platform plumbing, managed-vLLM install/start behavior, sandbox inference routing, or credential leak protections.
  • E2E Advisor: missing
  • Required E2E jobs: cloud-onboard-e2e
  • Missing for analyzed SHA: cloud-onboard-e2e

✅ What looks good

  • The implementation is small and centralized in src/lib/onboard/vllm-menu.ts rather than expanding the large src/lib/onboard.ts monolith.
  • The default platform set is allowlisted to spark and station, preserving the gate for generic Linux unless explicit opt-in or EXPERIMENTAL is used.
  • Tests include positive DGX Spark/DGX Station cases and a negative generic Linux case.
  • The test import now references the source helper directly instead of generated dist declarations, improving mocking/build purity.
  • Docs, generated skills, release notes, and the platform matrix were updated to describe the new DGX-default versus generic-Linux opt-in behavior.
  • No new dependencies, workflow changes, shell-string execution, Dockerfile changes, or credential-handling changes are introduced by this diff.

Review completeness

  • Review used the provided trusted deterministic PR context and supplied diff; no scripts, tests, package-manager commands, workflow dispatches, or network actions were executed.
  • CI, mergeability, review decision, and E2E status may change after this advisory result; this result reflects the supplied rollup for head SHA 9bca801.
  • No linked issue acceptance clauses were available because github.linkedIssues is empty.
  • CodeRabbit/review thread resolution state was not fully available; the rollup still shows CodeRabbit PENDING.
  • The diff was truncated to the supplied content; files were reviewed based on provided changed-file list, deterministic context, and visible diff excerpts.
  • Human maintainer review required: yes

@github-actions
Contributor

github-actions Bot commented May 20, 2026

E2E Advisor Recommendation

Required E2E: cloud-onboard-e2e, inference-routing-e2e, skill-agent-e2e
Optional E2E: docs-validation-e2e, gpu-e2e

Dispatch hint: cloud-onboard-e2e,inference-routing-e2e,skill-agent-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • cloud-onboard-e2e (high; live NVIDIA API and sandbox build, timeout 45 minutes): Required because onboarding source changed. This is the closest existing full onboard/sandbox lifecycle E2E and should catch regressions in non-interactive onboarding, provider registration, inference.local setup, and sandbox creation even though it uses the cloud provider rather than vLLM.
  • inference-routing-e2e (medium; some cases skip without optional provider keys, timeout 30 minutes): Required because this PR changes inference-provider selection behavior. Existing coverage validates inference routing through OpenShell, credential isolation, compatible endpoint handling, and provider error classification across the same onboarding/inference boundary.
  • skill-agent-e2e (medium/high; live agent validation, timeout 30 minutes): Required because .agents skill content changed. This validates that assistant skill assets are injected and usable in a real agent flow, reducing risk that updated inference/security guidance breaks user-facing assistant behavior.

Optional E2E

  • docs-validation-e2e (low/medium; documentation validation, timeout 15 minutes): Useful for the broad docs and skill-reference edits. It checks CLI/docs parity and markdown/link validity, but it is adjacent documentation confidence rather than direct runtime gating for the vLLM onboarding code path.
  • gpu-e2e (high; self-hosted GPU runner, Ollama local inference, timeout 30 minutes): Useful adjacent confidence for local GPU onboarding and local inference behavior. It covers Ollama rather than managed vLLM, so it does not directly validate the DGX Spark/Station vLLM menu change and should not be merge-blocking for this PR.

New E2E recommendations

  • managed-vllm-dgx-onboarding (high): No existing E2E job directly validates that managed vLLM install/start appears by default on DGX Spark or DGX Station when vLLM is not already running and NEMOCLAW_EXPERIMENTAL is unset.
    • Suggested test: Add a DGX Spark/Station onboarding scenario that runs the onboard provider-selection path with no localhost:8000 vLLM, asserts an install-vllm menu entry appears without NEMOCLAW_EXPERIMENTAL, and verifies the non-interactive NEMOCLAW_PROVIDER=install-vllm path reaches the managed-vLLM setup boundary safely.
  • managed-vllm-generic-linux-gating (medium): The intended behavior differs by platform: generic Linux NVIDIA GPU hosts should still hide managed vLLM unless NEMOCLAW_EXPERIMENTAL=1 or NEMOCLAW_PROVIDER=install-vllm is set. Current coverage is unit-level only in vllm-menu.test.ts.
    • Suggested test: Add an E2E or scenario-runner test for generic Linux NVIDIA GPU detection that asserts managed vLLM is hidden without opt-in, appears with NEMOCLAW_EXPERIMENTAL=1, and emits a clear no-profile/error path for explicit NEMOCLAW_PROVIDER=install-vllm when unsupported.
  • managed-vllm-runtime-smoke (high): Existing GPU/local-inference E2E covers Ollama, not vLLM. The changed path can pull/start a vLLM container, choose a model profile, and bake chat-completions routing into the sandbox; those runtime behaviors are not covered by an existing named E2E job.
    • Suggested test: Add an opt-in managed-vLLM smoke E2E on an appropriate NVIDIA GPU runner that starts or reuses the managed vLLM container, completes non-interactive onboarding, verifies the recorded model from /v1/models, and confirms sandbox inference goes through inference.local using chat/completions.
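
Several of the suggested scenarios depend on knowing whether a vLLM server is already answering on localhost:8000 and which model it reports from /v1/models. The sketch below shows one way such a probe could look. It is an assumption-laden illustration, not NemoClaw's detection code: `listServedModels` and `FetchLike` are hypothetical names, the 2-second timeout is arbitrary, and the response shape (`data[].id`) is the OpenAI-compatible model list that vLLM serves.

```typescript
// Minimal shape of the fetch function needed here; lets a test inject a fake.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

/**
 * Returns the model ids reported by `${baseUrl}/v1/models`,
 * or null when the server is unreachable, slow, or returns an error.
 */
async function listServedModels(
  baseUrl: string,
  fetchImpl: FetchLike,
  timeoutMs = 2000,
): Promise<string[] | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${baseUrl}/v1/models`, { signal: controller.signal });
    if (!res.ok) return null;
    const body = (await res.json()) as { data?: Array<{ id?: unknown }> };
    // Keep only string ids; malformed entries are dropped.
    return (body.data ?? [])
      .map((model) => model.id)
      .filter((id): id is string => typeof id === "string");
  } catch {
    // Connection refused, abort on timeout, or invalid JSON: treat as "not running".
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

In a real run the global `fetch` would be passed as `fetchImpl` with `http://localhost:8000` as the base URL. Injecting the fetch function keeps the probe testable on runners that have no GPU or vLLM container.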

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: cloud-onboard-e2e,inference-routing-e2e,skill-agent-e2e

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/lib/onboard/vllm-menu.ts (1)

72-76: ⚡ Quick win

Consider removing the type assertion for better type safety.

The type assertion as NvidiaPlatform on line 75 bypasses the type checker when opts.platform is undefined. While the code works correctly at runtime (because Set.has(undefined) returns false), an explicit check would be more type-safe and clearer.

♻️ Proposed fix for type safety
  if (
    userChoseManagedVllm ||
    (opts.vllmProfile &&
-      (opts.experimental || MANAGED_VLLM_DEFAULT_PLATFORMS.has(opts.platform as NvidiaPlatform)))
+      (opts.experimental || (opts.platform && MANAGED_VLLM_DEFAULT_PLATFORMS.has(opts.platform))))
  ) {
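
The narrowing that this suggestion relies on can be shown in isolation. The following is a small sketch with hypothetical names (`isDefaultManagedPlatform`, and `"linux"` as a third platform member); it only illustrates why checking for `undefined` first removes the need for the `as NvidiaPlatform` assertion.

```typescript
type NvidiaPlatform = "spark" | "station" | "linux"; // hypothetical members

const MANAGED_VLLM_DEFAULT_PLATFORMS: ReadonlySet<NvidiaPlatform> =
  new Set<NvidiaPlatform>(["spark", "station"]);

function isDefaultManagedPlatform(platform?: NvidiaPlatform): boolean {
  // Passing `platform` straight to has() would not compile here, because the
  // parameter type is NvidiaPlatform and `platform` may be undefined.
  // The explicit check narrows the type, so no `as` cast is needed.
  return platform !== undefined && MANAGED_VLLM_DEFAULT_PLATFORMS.has(platform);
}
```

The `opts.platform && ...` form in the proposed fix narrows the same way. Inside an `if` condition the two are equivalent; the `!== undefined` form used here additionally keeps the expression's type a strict `boolean`, which matters only if the result is returned or stored.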
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard/vllm-menu.ts` around lines 72 - 76, The condition uses a type
assertion (opts.platform as NvidiaPlatform) which bypasses type checking when
opts.platform can be undefined; update the condition in the if that checks
userChoseManagedVllm / opts.vllmProfile / opts.experimental so it only calls
MANAGED_VLLM_DEFAULT_PLATFORMS.has when opts.platform is present (e.g., change
the sub-expression to (opts.experimental || (opts.platform &&
MANAGED_VLLM_DEFAULT_PLATFORMS.has(opts.platform)))) so type safety is preserved
and undefined platforms aren’t asserted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/lib/onboard/vllm-menu.ts`:
- Around line 72-76: The condition uses a type assertion (opts.platform as
NvidiaPlatform) which bypasses type checking when opts.platform can be
undefined; update the condition in the if that checks userChoseManagedVllm /
opts.vllmProfile / opts.experimental so it only calls
MANAGED_VLLM_DEFAULT_PLATFORMS.has when opts.platform is present (e.g., change
the sub-expression to (opts.experimental || (opts.platform &&
MANAGED_VLLM_DEFAULT_PLATFORMS.has(opts.platform)))) so type safety is preserved
and undefined platforms aren’t asserted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8006d60e-a33d-4c00-9e12-a9a74d0329b1

📥 Commits

Reviewing files that changed from the base of the PR and between e122450 and 019cf8e.

📒 Files selected for processing (3)
  • src/lib/onboard.ts
  • src/lib/onboard/vllm-menu.test.ts
  • src/lib/onboard/vllm-menu.ts

@zyang-dev zyang-dev added labels on May 20, 2026: v0.0.47 (Release target); Platform: DGX Spark (Support for DGX Spark); Platform: Station (Support for the NVIDIA DGX Station with GB300).
zyang-dev added 2 commits May 20, 2026 13:27
…orm before testing the managed-vLLM default platform set.

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
…nstead of relying on generated dist declarations.

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
Contributor

@ericksoa ericksoa left a comment

Thanks, the implementation path itself looks small and coherent: setupNim() now passes gpu?.platform, the menu helper allowlists only spark/station, generic Linux stays behind the experimental gate, and the focused helper tests cover those cases.

I think this needs one doc follow-up before merge, though. The PR changes user-facing onboarding behavior for DGX Spark/Station, but current docs still say managed vLLM install/start requires NEMOCLAW_EXPERIMENTAL=1 or NEMOCLAW_PROVIDER=install-vllm. In particular, docs/inference/use-local-inference.mdx still says to set NEMOCLAW_EXPERIMENTAL=1 before selecting Install/Start vLLM, and docs/security/best-practices.mdx still says managed vLLM install/start is hidden by default and surfaced only via NEMOCLAW_EXPERIMENTAL=1. After this PR, that is no longer true on DGX Spark and DGX Station, so users following the docs/security guidance would get stale information.

Could you update the affected docs (and any generated skill/reference text if that is the source of truth) to say DGX Spark and DGX Station surface managed vLLM by default while generic Linux remains gated?

Validation I checked locally: git diff --check and vitest run src/lib/onboard/vllm-menu.test.ts both pass. Live CI is also green at head 3d5965fc1ddddd8819ecb3b820debbc75cff8275.

Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ci/platform-matrix.json (1)

1-5: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add SPDX copyright and Apache-2.0 license metadata to this JSON source file.

Line 1 starts a JSON source file without SPDX metadata, which violates the repository-wide source-file requirement.

As per coding guidelines, "**/*.{js,ts,tsx,jsx,sh,yaml,yml,json,md,mdx}: Every source file must include an SPDX license header for copyright and Apache-2.0 license".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ci/platform-matrix.json` around lines 1 - 5, Add an SPDX header object at the
top of this JSON file containing copyright and license metadata: insert keys
like "spdxCopyright" (or "spdx": {"copyright":"...","license":"Apache-2.0"}) or
equivalent JSON fields before the existing "$comment" so the file includes the
required SPDX copyright and Apache-2.0 license information; ensure the new
fields are valid JSON properties (e.g., "spdx": {"copyright":"2026 YourOrg or
Contributors","license":"Apache-2.0"}) and placed above "version" to satisfy the
repository-wide source-file requirement while preserving existing keys such as
"$comment", "version", and "updated".
🧹 Nitpick comments (1)
docs/about/release-notes.md (1)

91-91: ⚡ Quick win

Split Line 91 into one sentence per source line.

Line 91 contains multiple sentences on a single line; please split them for diff readability and docs consistency.

As per coding guidelines, "docs/** formatting rule: One sentence per line in source (makes diffs readable)."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/about/release-notes.md` at line 91, The sentence "The onboarding
provider menu offers an already-running local vLLM server directly when
`localhost:8000` responds. Managed vLLM install and start options now appear by
default on DGX Spark and DGX Station, while generic Linux NVIDIA GPU hosts
remain behind the experimental opt-in." should be split so each sentence is on
its own source line for docs/about/release-notes.md; replace the single-line
paragraph with three separate lines: one for the onboarding provider/local vLLM
server statement, one for the managed vLLM install/start options defaulting on
DGX Spark and DGX Station, and one for the note about generic Linux NVIDIA GPU
hosts being experimental opt-in.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@ci/platform-matrix.json`:
- Around line 1-5: Add an SPDX header object at the top of this JSON file
containing copyright and license metadata: insert keys like "spdxCopyright" (or
"spdx": {"copyright":"...","license":"Apache-2.0"}) or equivalent JSON fields
before the existing "$comment" so the file includes the required SPDX copyright
and Apache-2.0 license information; ensure the new fields are valid JSON
properties (e.g., "spdx": {"copyright":"2026 YourOrg or
Contributors","license":"Apache-2.0"}) and placed above "version" to satisfy the
repository-wide source-file requirement while preserving existing keys such as
"$comment", "version", and "updated".

---

Nitpick comments:
In `@docs/about/release-notes.md`:
- Line 91: The sentence "The onboarding provider menu offers an already-running
local vLLM server directly when `localhost:8000` responds. Managed vLLM install
and start options now appear by default on DGX Spark and DGX Station, while
generic Linux NVIDIA GPU hosts remain behind the experimental opt-in." should be
split so each sentence is on its own source line for
docs/about/release-notes.md; replace the single-line paragraph with three
separate lines: one for the onboarding provider/local vLLM server statement, one
for the managed vLLM install/start options defaulting on DGX Spark and DGX
Station, and one for the note about generic Linux NVIDIA GPU hosts being
experimental opt-in.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2cf3c9ab-e142-49ef-bf75-7ba8f0006e08

📥 Commits

Reviewing files that changed from the base of the PR and between 3d5965f and b1d3676.

📒 Files selected for processing (16)
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md
  • .agents/skills/nemoclaw-user-configure-inference/references/inference-options.md
  • .agents/skills/nemoclaw-user-configure-security/references/best-practices.md
  • .agents/skills/nemoclaw-user-get-started/SKILL.md
  • .agents/skills/nemoclaw-user-overview/references/release-notes.md
  • ci/platform-matrix.json
  • docs/about/release-notes.md
  • docs/about/release-notes.mdx
  • docs/get-started/quickstart.md
  • docs/get-started/quickstart.mdx
  • docs/inference/inference-options.md
  • docs/inference/inference-options.mdx
  • docs/inference/use-local-inference.md
  • docs/inference/use-local-inference.mdx
  • docs/security/best-practices.md
  • docs/security/best-practices.mdx
✅ Files skipped from review due to trivial changes (4)
  • docs/about/release-notes.mdx
  • .agents/skills/nemoclaw-user-overview/references/release-notes.md
  • .agents/skills/nemoclaw-user-configure-inference/references/inference-options.md
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md

Contributor

@ericksoa ericksoa left a comment

Follow-up looks good now. The DGX Spark/Station managed-vLLM default behavior is reflected in the MDX docs, generated skill/reference text, and the platform matrix, while generic Linux remains gated behind NEMOCLAW_EXPERIMENTAL=1 or NEMOCLAW_PROVIDER=install-vllm.

I also merged current main into the branch and resolved the generated-doc conflicts. Local validation on head 9bca801: git diff --cached --check before commit, python3 scripts/generate-platform-docs.py --check, generated skills compared cleanly against scripts/docs-to-skills.py output, vitest run src/lib/onboard/vllm-menu.test.ts, and npm run docs:strict.

@ericksoa ericksoa merged commit 75095f5 into main May 20, 2026
37 of 38 checks passed

Labels

  • Platform: DGX Spark (Support for DGX Spark)
  • Platform: Station (Support for the NVIDIA DGX Station with GB300)
  • v0.0.47 (Release target)

2 participants