test(e2e): migrate inference routing provider coverage by jyaunches · Pull Request #3903 · NVIDIA/NemoClaw

jyaunches · 2026-05-20T12:46:23Z

Summary

Adds inference-routing/provider E2E coverage primitives and domain suites for issue #3812, migrating legacy inference assertions into the scenario framework with parity metadata. The PR also records spec/test/validation artifacts and marks available validation results while leaving live PR/provider evidence blocked until PR context exists.

Related Issue

Fixes #3812

Changes

Added test/e2e/validation_suites/lib/inference_routing.sh helper coverage for chat completions, provider health, and Ollama auth-proxy checks.
Added inference domain suite scripts for routing, provider switching, Kimi compatibility, Ollama auth proxy, and model-router paths.
Updated test/e2e/validation_suites/suites.yaml and test/e2e/docs/parity-map.yaml so that the legacy assertions for issue #3812 are explicitly classified.
Added/updated scenario-framework tests and coverage reporting for inference-routing/provider parity.
Added VD workflow artifacts under specs/2026-05-20_inference-routing-provider-coverage/ with completion and validation markers.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Verification run:

npm test -- test/e2e/scenario-framework-tests/e2e-legacy-assertion-inventory.test.ts test/e2e/scenario-framework-tests/e2e-parity-map.test.ts test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts test/e2e/scenario-framework-tests/e2e-convention-lint.test.ts test/e2e/scenario-framework-tests/e2e-scenario-resolver.test.ts test/e2e/scenario-framework-tests/e2e-suite-runner.test.ts test/e2e/scenario-framework-tests/e2e-coverage-report.test.ts — passed, 71 tests.
Plan-only scenarios passed for ubuntu-repo-cloud-openclaw, gpu-repo-local-ollama-openclaw, ubuntu-repo-cloud-hermes, and brev-launchable-cloud-openclaw.
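
A minimal sketch of how a plan-only run can work, assuming that "plan-only" means printing the steps a scenario would execute instead of executing them. The `E2E_PLAN_ONLY` variable and the `run_or_plan` function are hypothetical names for illustration, not identifiers taken from this PR.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical flag: "1" means plan only, anything else means execute.
E2E_PLAN_ONLY="${E2E_PLAN_ONLY:-1}"

# Run a command, or print it as a planned step when plan-only mode is on.
run_or_plan() {
  if [ "$E2E_PLAN_ONLY" = "1" ]; then
    printf 'PLAN: %s\n' "$*"
    return 0
  fi
  "$@"
}

# With the default above this prints: PLAN: echo probing provider health
run_or_plan echo "probing provider health"
```

Routing every side-effecting command through one wrapper keeps the planned output and the executed behavior from drifting apart.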

Known verification gaps:

npx prek run --all-files failed in this worktree due to broad pre-existing/environment failures, including missing nemoclaw/dist/blueprint/private-networks.js, missing nemoclaw/node_modules/json5, and unrelated CLI test timeouts/assertions.
Full npm test was not rerun after the broad hook failure; targeted scenario-framework tests passed.
Live E2E/provider validation was not run because it requires suitable runner/provider secrets.

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

Summary by CodeRabbit

Release Notes

Tests
- Expanded end-to-end testing infrastructure with new validation suites for inference services
- Added comprehensive test coverage for inference routing, model routing, authentication enforcement, and compatibility verification
- Enhanced test reporting and suite organization for improved coverage visibility

coderabbitai · 2026-05-20T12:46:36Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3b111490-238f-4bdf-ab8d-79f0c4e7f5b5

📥 Commits

Reviewing files that changed from the base of the PR and between 1e6c959 and c35842d.

📒 Files selected for processing (2)

test/e2e/validation_suites/lib/inference_routing.sh
test/e2e/validation_suites/suites.yaml

📝 Walkthrough

Walkthrough

This PR completes the scenario framework migration of inference routing and provider E2E coverage. It introduces a Bash primitive library with reusable assertion helpers, nine new domain-specific validation scripts, updates suite configurations and parity mappings, and adds framework tests to validate coverage completeness.

Changes

Inference Routing E2E Coverage Migration

| Layer / File(s) | Summary |
| --- | --- |
| Inference Routing Primitive Library `test/e2e/validation_suites/lib/inference_routing.sh` | New `inference_routing.sh` module exports three assertion functions for chat completion, health checks, and auth proxy validation. Includes internal helpers for sandbox context management, dry-run planning, sandboxed JSON curl requests, and HTTP status checking. |
| Domain Suite Validation Scripts `test/e2e/validation_suites/inference/{routing,switch,kimi-compatibility,model-router,ollama-auth-proxy}/*.sh` | Nine new Bash scripts across five inference domains invoke library assertions with stable `post-onboard.<domain>.<behavior>` IDs. Each script sources the inference_routing helper, sets strict error handling, and calls domain-specific assertion functions. |
| Suite Configuration & Parity Wiring `test/e2e/validation_suites/suites.yaml`, `test/e2e/docs/parity-map.yaml`, `test/e2e/runtime/resolver/coverage.ts`, `test/e2e/runtime/run-scenario.sh` | `suites.yaml` defines explicit step lists for five suite families replacing shared references. Five legacy test scripts in `parity-map.yaml` are reassigned from `providers-messaging` to `inference-routing-provider` bucket. Coverage report now displays sorted script lists per bucket. `model-router` suite added to optional-Docker-unavailable skip list. |
| Scenario Framework Validation Tests `test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts`, `test/e2e/scenario-framework-tests/e2e-parity-map.test.ts`, `test/e2e/scenario-framework-tests/e2e-scenario-additional-families.test.ts`, `test/e2e/scenario-framework-tests/e2e-coverage-report.test.ts` | New Vitest cases validate helper library behavior (strict sourcing, context failure modes, dry-run compatibility, secret redaction), real parity-map structure and issue `#3812` target script inclusion, inference suite family routing patterns, and coverage report domain visibility. |
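
The shape of a domain suite script described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the real scripts source `test/e2e/validation_suites/lib/inference_routing.sh`, whereas this sketch defines a hypothetical `assert_step` stand-in inline so that it runs on its own, and the step ID shown is only an example of the `post-onboard.<domain>.<behavior>` pattern.

```shell
#!/usr/bin/env bash
set -eu   # strict error handling: stop on errors and on unset variables

# Hypothetical stand-in for a library assertion: runs a check command and
# reports the outcome under a stable step ID.
assert_step() {
  step_id="$1"
  shift
  if "$@"; then
    printf 'PASS %s\n' "$step_id"
  else
    printf 'FAIL %s\n' "$step_id"
    return 1
  fi
}

# Illustrative ID following the post-onboard.<domain>.<behavior> pattern.
# "true" stands in for a real probe such as a health check.
assert_step "post-onboard.inference-routing.provider-route-health" true
```

Keeping the ID stable lets a parity map refer to each assertion by name even if the script file is renamed or reordered.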

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

NVIDIA/NemoClaw#3601: Re-buckets the same test-model-router-provider-routed-inference.sh entry introduced in #3601 from providers-messaging to inference-routing-provider and extends related coverage assertions.
NVIDIA/NemoClaw#3731: Both PRs modify test/e2e/runtime/run-scenario.sh to adjust suite-skip logic when Docker is optional/unavailable, with this PR adding explicit model-router suite handling at the same control-flow point.

Suggested labels

E2E, enhancement: testing, CI/CD, VRDC, v0.0.46

🐰 With whiskers twitching bright and code so tight,
Nine scripts hop in, their routing set right,
The lib guides assertions through sandbox and test,
Inference coverage migrates—no requests denied, no rest!
Five domains unified in scenario's quest. 🚀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'test(e2e): migrate inference routing provider coverage' clearly and concisely describes the main change: migrating inference routing E2E coverage to the scenario framework. |
| Linked Issues check | ✅ Passed | All linked issue objectives are met: domain primitive library added [`#3812`], suite entries extended [`#3812`], stable assertion IDs implemented [`#3812`], parity-map updated [`#3812`], plan-only compatibility preserved [`#3812`], and framework tests passing [`#3812`]. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the linked issue objectives: adding inference routing domain primitives, scenario suites, validation scripts, and framework tests. No unrelated changes detected. |

github-actions · 2026-05-20T12:46:57Z

PR Review Advisor

Recommendation: info only
Confidence: low
Analyzed HEAD: c35842daeddc782beb8c8e37e60ea94f3b33dffc
Findings: 0 blocker(s), 1 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Advisor execution failed: Could not configure advisor model openai/openai/gpt-5.5

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: c35842daeddc782beb8c8e37e60ea94f3b33dffc
Recommendation: info only
Confidence: low

PR review advisor failed: Could not configure advisor model openai/openai/gpt-5.5

Gate status

CI: pending — 12 status context(s) appear pending.
Mergeability: fail — mergeStateStatus=BLOCKED
Review threads: fail — 3 unresolved review thread(s).
Risky code tested: warning — Risky areas detected (credentials/inference/network); test files changed, but coverage still needs semantic review.

🔴 Blockers

None.

🟡 Warnings

PR review advisor unavailable: The automated advisor could not complete: Could not configure advisor model openai/openai/gpt-5.5
- Recommendation: Re-run the PR Review Advisor or perform a manual review.
- Evidence: Could not configure advisor model openai/openai/gpt-5.5

🔵 Suggestions

None.

Acceptance coverage

No linked acceptance clauses were analyzed.

Security review

warning — Secrets and Credentials: Advisor unavailable; human review required.
warning — Input Validation and Data Sanitization: Advisor unavailable; human review required.
warning — Authentication and Authorization: Advisor unavailable; human review required.
warning — Dependencies and Third-Party Libraries: Advisor unavailable; human review required.
warning — Error Handling and Logging: Advisor unavailable; human review required.
warning — Cryptography and Data Protection: Advisor unavailable; human review required.
warning — Configuration and Security Headers: Advisor unavailable; human review required.
warning — Security Testing: Advisor unavailable; human review required.
warning — Holistic Security Posture: Advisor unavailable; human review required.

Test / E2E status

Test depth: unit_sufficient — Changes are limited to tests, documentation, or metadata that cannot affect runtime behavior directly.
E2E Advisor: not_found (not found)

✅ What looks good

No positives were identified by the advisor.

Review completeness

Advisor execution failed: Could not configure advisor model openai/openai/gpt-5.5
Human maintainer review required: yes

github-actions · 2026-05-20T12:49:27Z

E2E Advisor Recommendation

Required E2E: scenario-runner-ubuntu-inference-provider-suites, scenario-runner-gpu-ollama-auth-proxy-suite, model-router-provider-routed-inference-e2e
Optional E2E: inference-routing-e2e, openclaw-inference-switch-e2e, kimi-inference-compat-e2e, ollama-proxy-e2e

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

scenario-runner-ubuntu-inference-provider-suites (high): Runs the scenario-based validation-suite path that this PR modifies for inference-routing, inference-switch, Kimi compatibility, and Model Router suite registration/helper execution.
scenario-runner-gpu-ollama-auth-proxy-suite (high): Exercises the new ollama-auth-proxy auth-enforcement step against a local Ollama/OpenClaw scenario, covering the auth security boundary and token handling helper paths.
model-router-provider-routed-inference-e2e (high): The PR adds Model Router validation-suite steps and optional-Docker skip classification; the existing regression job is the live coverage for provider-routed Model Router inference with NVIDIA_API_KEY.

Optional E2E

inference-routing-e2e (medium): Legacy/nightly inference routing job provides deeper parity coverage for real provider route health, credential isolation, and failure classification adjacent to the new migrated suite steps.
openclaw-inference-switch-e2e (high): Validates the full live nemoclaw inference set flow, openclaw.json patching, route persistence, and post-switch inference beyond the lightweight migrated suite assertions.
kimi-inference-compat-e2e (high): Provides hermetic Kimi-compatible endpoint coverage with OpenClaw agent/tool-call behavior, complementing the new Kimi suite health/model-route checks.
ollama-proxy-e2e (medium): Full standalone Ollama auth proxy E2E is useful as an additional guard for token auth, persistence, recovery, and real inference after adding auth-enforcement suite coverage.

New E2E recommendations

scenario-based inference/provider coverage (high): The repository has migrated suite ids for inference-routing, inference-switch, Kimi compatibility, Model Router, and Ollama auth proxy, but no dedicated setup_scenarios/test_plans include these domain-specific suites by default; they currently rely on suite_filter dispatch.
- Suggested test: Add explicit scenario-based plans for inference-routing-provider and ollama-auth-proxy suites so CI can dispatch stable named scenarios without ad hoc E2E_SUITE_FILTER values.
Model Router scenario metadata (medium): The new model-router suite can be selected by filter, but there is no dedicated onboarding profile/scenario that declares Model Router provider semantics in scenarios.yaml.
- Suggested test: Add a Model Router onboarding profile and scenario that runs the model-router suite against the provider-routed configuration with NVIDIA_API_KEY.
optional Docker skip behavior (low): run-scenario.sh now skips model-router when Docker is optional/unavailable, but there is no live macOS/optional-Docker scenario coverage specifically asserting that model-router is reported as skipped rather than executed.
- Suggested test: Add a lightweight scenario-runner dry-run or macOS optional-Docker check for model-router suite filtering and skip reporting.
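
The optional-Docker skip behavior mentioned in the last recommendation could be checked with logic along these lines. This is a sketch under assumptions: the `DOCKER_REQUIRED_SUITES` variable and both function names are hypothetical, and the actual control flow in `run-scenario.sh` is not shown in this conversation. Only the fact that `model-router` is skipped when Docker is unavailable comes from the PR discussion.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical skip list of suites that need Docker.
DOCKER_REQUIRED_SUITES="model-router"

# Succeeds when the suite should be skipped because Docker is unavailable.
# $1 = suite name, $2 = "yes" or "no" (whether Docker is available).
should_skip_suite() {
  suite="$1"
  docker_available="$2"
  if [ "$docker_available" != "no" ]; then
    return 1
  fi
  for required in $DOCKER_REQUIRED_SUITES; do
    if [ "$suite" = "$required" ]; then
      return 0
    fi
  done
  return 1
}

# Report a suite as skipped rather than silently dropping it.
report_suite() {
  if should_skip_suite "$1" "$2"; then
    printf 'SKIPPED %s\n' "$1"
  else
    printf 'RUN %s\n' "$1"
  fi
}

report_suite model-router no
report_suite inference-routing no
```

Printing an explicit `SKIPPED` line is what a dry-run check would assert on, which is the distinction the advisor asks for between "skipped" and "executed".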

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@specs/2026-05-20_inference-routing-provider-coverage/spec.md`:
- Line 1: Add a top-of-file SPDX license header to this new Markdown spec:
insert an HTML comment at the very beginning containing the copyright holder and
the SPDX identifier for Apache-2.0 (e.g., include a copyright line and
"SPDX-License-Identifier: Apache-2.0") so the file complies with the required
header rule.

In `@specs/2026-05-20_inference-routing-provider-coverage/tests.md`:
- Line 1: This Markdown spec is missing the required SPDX header; add an HTML
comment at the top of the file containing both the SPDX copyright and license
tags, e.g. include a first-line block like <!-- SPDX-FileCopyrightText: 2026
YourOrganizationName --> and <!-- SPDX-License-Identifier: Apache-2.0 -->
(replace YourOrganizationName with the appropriate copyright holder) so the file
(specs/2026-05-20_inference-routing-provider-coverage/tests.md) has the required
SPDX copyright and Apache-2.0 license header.

In `@specs/2026-05-20_inference-routing-provider-coverage/validation.md`:
- Line 1: The file validation.md is missing the required SPDX license header;
add the standard two-line SPDX header at the very top of the file for copyright
and Apache-2.0 license (include SPDX-FileCopyrightText with the project or
copyright owner and SPDX-License-Identifier: Apache-2.0) so the top of
specs/2026-05-20_inference-routing-provider-coverage/validation.md contains the
required SPDX metadata before any content.

In `@test/e2e/validation_suites/lib/inference_routing.sh`:
- Around line 100-120: The function e2e_inference_routing_assert_auth_proxy
doesn't vary the request for different modes and never marks success; update
e2e_inference_routing_assert_auth_proxy to (1) choose and inject the appropriate
authentication for the request based on the mode variable (e.g., add a valid
Authorization header when mode == "valid", omit or send an invalid token when
mode == "invalid" or "unauthenticated") by passing the header/context into the
call to _e2e_inference_status (or by calling a variant that accepts headers),
and (2) call e2e_pass after the case check succeeds; keep the existing status
checks for valid vs invalid modes and ensure the function still returns non-zero
on unknown mode.

In `@test/e2e/validation_suites/suites.yaml`:
- Around line 91-95: Prepend the required SPDX license header to the top of this
YAML file (above the existing "steps" block) so every source includes copyright
and license info; add a copyright line (e.g., "Copyright (c) <YEAR> <Owner>")
followed by "SPDX-License-Identifier: Apache-2.0" as the file header, ensuring
it appears before entries like "steps" and the ids "proxy-reachable" and
"auth-enforcement".
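
The auth-proxy finding above asks for two things: vary the request by mode, and mark success explicitly. A minimal offline sketch of that shape follows. The stubbed `_e2e_inference_status` signature (one optional Authorization header value), the `e2e_pass` argument, the token strings, and the expected 200/401 status codes are all assumptions made for illustration. The real helpers live in `test/e2e/validation_suites/lib/inference_routing.sh` and issue actual requests.

```shell
#!/usr/bin/env bash
set -eu

# Stand-in for the framework's success marker (assumed to take a label).
e2e_pass() { printf 'PASS %s\n' "$1"; }

# Stub with a hypothetical signature: prints the HTTP status a proxy might
# return for the given Authorization header value. No network is used here.
_e2e_inference_status() {
  if [ "${1:-}" = "Bearer good-token" ]; then
    echo 200
  else
    echo 401
  fi
}

e2e_inference_routing_assert_auth_proxy() {
  mode="$1"
  # Choose the credentials and the expected status per mode.
  case "$mode" in
    valid)           auth="Bearer good-token";  expected=200 ;;
    invalid)         auth="Bearer wrong-token"; expected=401 ;;
    unauthenticated) auth="";                   expected=401 ;;
    *) printf 'unknown mode: %s\n' "$mode" >&2; return 2 ;;
  esac
  status="$(_e2e_inference_status "$auth")"
  if [ "$status" != "$expected" ]; then
    printf 'FAIL auth-proxy %s: got %s, expected %s\n' \
      "$mode" "$status" "$expected" >&2
    return 1
  fi
  # Mark success only after the status check has passed.
  e2e_pass "auth-proxy $mode"
}

for m in valid invalid unauthenticated; do
  e2e_inference_routing_assert_auth_proxy "$m"
done
```

An unknown mode still returns non-zero, which matches the last requirement in the finding.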

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 60cb7ccd-5c05-4844-9bfc-8cb084c912e0

📥 Commits

Reviewing files that changed from the base of the PR and between ca045a9 and 1e6c959.

📒 Files selected for processing (21)

specs/2026-05-20_inference-routing-provider-coverage/spec.md
specs/2026-05-20_inference-routing-provider-coverage/tests.md
specs/2026-05-20_inference-routing-provider-coverage/validation.md
test/e2e/docs/parity-map.yaml
test/e2e/runtime/resolver/coverage.ts
test/e2e/runtime/run-scenario.sh
test/e2e/scenario-framework-tests/e2e-coverage-report.test.ts
test/e2e/scenario-framework-tests/e2e-lib-helpers.test.ts
test/e2e/scenario-framework-tests/e2e-parity-map.test.ts
test/e2e/scenario-framework-tests/e2e-scenario-additional-families.test.ts
test/e2e/validation_suites/inference/kimi-compatibility/00-plugin-wiring.sh
test/e2e/validation_suites/inference/kimi-compatibility/01-kimi-compatible-models-route.sh
test/e2e/validation_suites/inference/model-router/00-healthy-endpoint.sh
test/e2e/validation_suites/inference/model-router/01-provider-routed-completion.sh
test/e2e/validation_suites/inference/ollama-auth-proxy/01-auth-enforcement.sh
test/e2e/validation_suites/inference/routing/00-inference-local-chat-completion.sh
test/e2e/validation_suites/inference/routing/01-provider-route-health.sh
test/e2e/validation_suites/inference/switch/00-route-state-updated.sh
test/e2e/validation_suites/inference/switch/01-switched-inference-local-chat.sh
test/e2e/validation_suites/lib/inference_routing.sh
test/e2e/validation_suites/suites.yaml

wscurran · 2026-05-20T16:01:59Z

✨Related open issues:

#3812 test(e2e): migrate inference routing and provider coverage

jyaunches added 21 commits May 20, 2026 07:32

Simplify inference routing coverage spec

38da01a

Add test specification for inference routing coverage

99d7fbe

Add validation plan for inference routing coverage

4c3b8a6

Approve validation plan for 2026-05-20_inference-routing-provider-coverage

383a732

Apply design review for inference routing coverage spec

120671f

Apply implementation review for inference routing coverage spec

6d66658

test: Add failing tests for Phase 1

e3002b4

Mark Phase 1 as completed [e3002b4]

f011049

test: Add failing tests for Phase 2

d8b955a

feat: Implement Phase 2 - inference routing primitives

6c96e7d

Mark Phase 2 as completed [6c96e7d]

9663659

test: Add failing tests for Phase 3

9cb5678

feat: Implement Phase 3 - domain inference suites

4b03eef

Mark Phase 3 as completed [4b03eef]

a9c17cc

test: Add failing tests for Phase 4

b252a09

feat: Implement Phase 4 - parity coverage summary

fc960f1

Mark Phase 4 as completed [fc960f1]

8b0b763

Mark Phase 5 as completed [fc960f1]

a6d4d39

test(e2e): mark inference routing validation results

5f08f8f

chore(spec): normalize validation markdown

1581b64

chore: apply hook formatting

1e6c959

jyaunches self-assigned this May 20, 2026

coderabbitai Bot reviewed May 20, 2026

jyaunches added 4 commits May 20, 2026 09:54

Merge remote-tracking branch 'origin/main' into issue-3812-migrate-inference-routing-provider-coverage

78d3a50

Merge remote-tracking branch 'origin/main' into issue-3812-migrate-inference-routing-provider-coverage

356bca5

fix(ci): remove ignored specs from PR

59366c8

fix(e2e): address inference review feedback

c35842d

wscurran added the E2E, enhancement: testing, and fix labels May 20, 2026

wscurran added the enhancement: inference label May 20, 2026

jyaunches added the v0.0.47 Release target label May 20, 2026

This was referenced May 20, 2026

test(e2e): fix current nightly failures #3926

Merged

fix(e2e): use sandbox subcommands in scenario suites #3927

Open

cv approved these changes May 20, 2026

github-actions Bot mentioned this pull request May 21, 2026

test(e2e): migrate security policy credential suites #3905

Merged

12 tasks

cv added v0.0.49 Release target and removed v0.0.47 Release target labels May 21, 2026

github-actions Bot mentioned this pull request May 21, 2026

fix(e2e): run full repo scenario cli build #4002

Open
