Change exception treatment on incremental snapshot wait #12665
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## 4.22 #12665 +/- ##
=========================================
Coverage 17.61% 17.61%
- Complexity 15664 15665 +1
=========================================
Files 5917 5917
Lines 531402 531402
Branches 64971 64971
=========================================
Hits 93596 93596
Misses 427252 427252
Partials 10554 10554
|
Pull request overview
This PR addresses a bug in the KVM incremental volume snapshot creation process where thread interruptions during the wait for Libvirt's backup-begin command would cause snapshot failures. The change modifies the exception handling to log interruptions and continue waiting instead of throwing an exception.
Changes:
- Modified exception handling in waitForBackup() to log InterruptedException at trace level and continue waiting instead of throwing CloudRuntimeException
|
@blueorangutan package |
|
@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16916 |
|
@JoaoJandre Can you rebase with 4.22 if this fix can go in 4.22.1? Any advice on testing this? |
Yes, I can do it soon. There is not much to test here. I was having a problem with this situation in my environment and decided to patch it. I don't know how to reproduce it, as I am not sure why my threads were sometimes getting interrupted 🤷‍♂️. |
|
@blueorangutan test |
|
@sureshanaparti a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-15575)
|
…/storage/KVMStorageProcessor.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Force-pushed from aef2026 to 99d3ce4
|
@blueorangutan package |
|
@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17052 |
|
@sureshanaparti could we run the CI again? |
note that there are existing errors, but will do |
|
@blueorangutan test |
|
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-15594)
|
* 4.22:
VM Deployment using snapshot in new zone (#13178)
Change exception treatment on incremental snapshot wait (#12665)
Move checkRoleEscalation outside DB transaction in createAccount (#13044)
Fix/flasharray delete rename destroy patch conflict (#13049)
Fix VPC network offerings listing in isolated network creation form (#12645)
systemvm: accept ipv6 established/related return traffic (#13173)
update debian change log
Updating pom.xml version numbers for release 4.22.2.0-SNAPSHOT
Updating pom.xml version numbers for release 4.22.1.0
Update suse15 packaging spec, use qemu-ovmf-x86_64 package instead of edk2-ovmf for agent (#13133)
Change disk-only VM snapshot removal message (#11182)
Update mysql java connector version to 8.4.0 (matching version for MySQL 8.4) (#12640)
adaptive: honor user-provided capacityBytes when provider stats are unavailable (#13059)
Flexibilize public IP selection (#11076)
Description
During the KVM incremental volume snapshot creation process, ACS waits for Libvirt's backup-begin command to finish. During this period, the waiting thread may be interrupted; if that happens, an exception is thrown and the snapshot fails.
This PR changes the exception handling so that the thread only logs that it was interrupted and goes back to waiting for the Libvirt process to finish.
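The difference between the two behaviours can be sketched roughly as follows. This is a minimal illustration, not the actual code in KVMStorageProcessor: the class name BackupWaitSketch, both method names, and the polling BooleanSupplier are hypothetical, and a plain RuntimeException and System.out stand in for CloudRuntimeException and the trace-level logger.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the wait loop; NOT the real KVMStorageProcessor code.
class BackupWaitSketch {

    // Previous behaviour as described: an interruption aborts the wait,
    // which in turn fails the snapshot.
    static void waitFailingOnInterrupt(BooleanSupplier backupFinished, long pollMillis) {
        while (!backupFinished.getAsBoolean()) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // RuntimeException is an assumption standing in for CloudRuntimeException.
                throw new RuntimeException("Interrupted while waiting for backup", e);
            }
        }
    }

    // Behaviour after the change: record the interruption and keep waiting.
    // Returns how many interruptions were seen, purely so the sketch is testable.
    static int waitIgnoringInterrupt(BooleanSupplier backupFinished, long pollMillis) {
        int interruptions = 0;
        while (!backupFinished.getAsBoolean()) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Thread.sleep clears the interrupt flag when it throws,
                // so the next iteration sleeps normally instead of spinning.
                interruptions++;
                System.out.println("Interrupted while waiting for backup; continuing to wait");
            }
        }
        return interruptions;
    }
}
```

With a completion check that becomes true after a few polls, the first method fails as soon as the thread is interrupted, while the second logs the interruption and returns once the check passes.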
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?