fix: report Stopped phase when InferenceService.spec.replicas=0 on Metal path #498

Merged
Defilan merged 1 commit into defilantech:main from Defilan:fix/issue-489-metal-inferenceservice-scaled-to-replica
May 19, 2026

Defilan commented May 19, 2026

What

Report the Stopped phase on an InferenceService scaled to spec.replicas=0, instead of Creating.

Why

Fixes #489

When a Metal InferenceService is scaled to spec.replicas=0, the metal-agent correctly tears down the runtime process, but the operator kept reporting Creating. determinePhase had no replicas==0 branch, so a deliberately-stopped service was indistinguishable from one still coming up.

How

  • determinePhase (scheduling.go): add an early return that yields Stopped when desiredReplicas == 0 && readyReplicas == 0. It sits after the readyReplicas > 0 -> Progressing check and before the Metal/generic branches, so it covers both paths.
  • model_controller.go: add the PhaseStopped constant alongside the existing phase constants.
  • Stopped is already in the phase field's kubebuilder enum (inferenceservice_types.go), so no CRD regeneration is needed.
  • Two regression tests in inferenceservice_reconcile_test.go cover the generic and Metal code paths.
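
The ordering described above can be sketched as follows. This is a minimal illustration, not the operator's actual code: the function name determinePhaseSketch, its signature, the isMetal flag, the Ready check, and the phase string values are all assumptions made for the example. Only the Stopped early return and its position relative to the readyReplicas > 0 check come from this PR.

```go
package main

import "fmt"

// Phase names follow the ones mentioned in this PR; the exact string
// values and the Ready phase are assumptions for illustration.
const (
	PhaseCreating    = "Creating"
	PhaseProgressing = "Progressing"
	PhaseReady       = "Ready"
	PhaseStopped     = "Stopped"
)

// determinePhaseSketch is a hypothetical, simplified stand-in for
// determinePhase. The real function's inputs and branches may differ.
func determinePhaseSketch(desiredReplicas, readyReplicas int32, isMetal bool) string {
	// Assumed: all desired replicas are ready.
	if desiredReplicas > 0 && readyReplicas >= desiredReplicas {
		return PhaseReady
	}
	// Some replicas are ready (including a scale-down still in flight).
	if readyReplicas > 0 {
		return PhaseProgressing
	}
	// The early return from this PR: nothing desired and nothing running
	// means the service was deliberately stopped.
	if desiredReplicas == 0 && readyReplicas == 0 {
		return PhaseStopped
	}
	// Metal and generic branches come after the early return, so both
	// benefit from it. Both return Creating here for simplicity.
	if isMetal {
		return PhaseCreating
	}
	return PhaseCreating
}

func main() {
	fmt.Println(determinePhaseSketch(0, 0, true))  // Metal, scaled to zero
	fmt.Println(determinePhaseSketch(0, 0, false)) // generic, scaled to zero
	fmt.Println(determinePhaseSketch(1, 0, true))  // Metal, still coming up
}
```

Because the zero-replica check runs before any Metal-specific logic, a single branch handles both code paths, which is why the two regression tests only need to differ in the path they exercise.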

Checklist

  • Tests added/updated
  • make test passes locally
  • make lint passes locally
  • Commit messages follow conventional commits
  • All commits are signed off (git commit -s) per DCO
  • Documentation updated — n/a, no user-facing doc change

…tal path

With spec.replicas=0 the metal-agent correctly tears down the runtime process,
but the operator's determinePhase function had no replicas==0 branch on the
Metal path, so it always returned Creating/WaitingForMetalAgent regardless of
whether the user had intentionally stopped the service.

The fix adds an early return in determinePhase: when desiredReplicas==0 and
readyReplicas==0 it now returns PhaseStopped instead of falling through to
the Metal Creating path. PhaseStopped is defined alongside the other phase
constants in model_controller.go. Two regression tests cover both the generic
and Metal code paths.

Fixes defilantech#489

Signed-off-by: Christopher Maher <chris@mahercode.io>

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Defilan merged commit 7787239 into defilantech:main on May 19, 2026
21 checks passed

Development

Successfully merging this pull request may close these issues.

[BUG] Metal InferenceService scaled to replicas=0 reports Creating, not Stopped