Enhancement: Add a ticker to check job completion as a fallback #347
feichashao wants to merge 2 commits into
Walkthrough
WaitForJobCompletion now polls job status every 10s via a time.Ticker as a fallback alongside the existing Kubernetes watch. Both watch events and ticker-driven fetches use a new checkJobCompletion helper that returns nil on success, an error on failure, or a package sentinel error (errJobIncomplete) to indicate "keep waiting."
Changes
Job Completion Detection with Polling Fallback
Sequence Diagram
sequenceDiagram
participant Caller as Client
participant Watch as Kube Watch
participant API as Kubernetes API
participant Ticker as Poll Ticker
Caller->>Watch: Start watch for Job events
Caller->>Ticker: Start 10s ticker
alt Watch event arrives
Watch->>Caller: Job event
Caller->>Caller: checkJobCompletion(event.Job)
alt complete
Caller-->>Caller: return nil
else failed
Caller-->>Caller: return error
else incomplete
Caller-->>Caller: continue waiting
end
else Ticker tick
Ticker->>API: Get Job
API->>Caller: Job object
Caller->>Caller: checkJobCompletion(job)
alt complete
Caller-->>Caller: return nil
else failed
Caller-->>Caller: return error
else incomplete
Caller-->>Caller: continue waiting
end
end
Caller->>Caller: ctx.Done() -> return ctx error
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks | ✅ 11 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/clients/kube/kube.go`:
- Around line 151-158: The watch receive from watcher.ResultChan() can return a
closed channel causing a nil event and panic when accessing event.Object; change
the single-value receive to a two-value receive (e.g., event, ok :=
<-watcher.ResultChan()) and if !ok treat the watch as closed (stop using the
watcher, break/continue to fall back to the existing ticker-based polling) so
checkJobCompletion(job) is only called when event != nil and the type assertion
to *batchv1.Job succeeds; ensure you still compare the returned error to
errJobIncomplete and return other errors as before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 3938f4d0-679d-4c55-8350-2f959af8840f
📒 Files selected for processing (2)
pkg/clients/kube/kube.go
pkg/clients/kube/kube_test.go
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #347 +/- ##
==========================================
+ Coverage 33.51% 34.40% +0.89%
==========================================
Files 30 30
Lines 2256 2270 +14
==========================================
+ Hits 756 781 +25
+ Misses 1460 1447 -13
- Partials 40 42 +2
@feichashao: all tests passed! Full PR test history. Your PR dashboard.
Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Enhancement
What does this PR do? / Related Issues / Jira
Assisted by Claude Code.
This PR adds a ticker to check job completion.
When using `osdctl network verify-egress` with pod mode, it hangs while waiting for job completion:
From the cluster, I can see the job has already completed.
Some issue may have broken the `watch`, leaving `WaitForJobCompletion` stuck in its loop. This PR adds a ticker that fetches the job status every 10s as a fallback.
After this change, verify-egress succeeds:
Checklist
Reviewer's Checklist
How to test this PR locally / Special Instructions
Logs
Summary by CodeRabbit