
USHIFT-6716: Add el10 bootc release scenarios#6401

Open
vanhalenar wants to merge 5 commits into openshift:main from vanhalenar:el10-release-scenarios

Conversation

@vanhalenar
Contributor

No description provided.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 25, 2026
@openshift-ci-robot

openshift-ci-robot commented Mar 25, 2026

@vanhalenar: This pull request references USHIFT-6716 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai bot commented Mar 25, 2026

Walkthrough

Adds three containerfile templates and ~30 new bootc RHEL 10.2 scenario scripts (standard, specialized, multi-config, published-images, upgrade flows), and removes two older standard scenario scripts. New files implement image build steps, service/tuned/FIPS configuration, VM lifecycle hooks, and test execution orchestration.

Changes

Cohort / File(s) Summary
Container Image Blueprints
test/image-blueprints-bootc/layer4-release/group1/rhel102-bootc-brew-nightly-with-optional.containerfile, test/image-blueprints-bootc/layer4-release/group2/rhel102-bootc-brew-lrel-fips.containerfile, test/image-blueprints-bootc/layer4-release/group2/rhel102-bootc-brew-lrel-tuned.containerfile
New containerfiles: nightly-with-optional sets up RPM repos, installs packages (including optional lists), configures firewall and services, and runs a prep script; FIPS variant enables FIPS kernel args and crypto policy; tuned variant configures microshift/tuned settings and tuned variables.
Scenario Scripts — Standard & Nightly
test/scenarios-bootc/el10/releases/el102-lrel@configuration.sh, ...@iso-standard1.sh, ...@iso-standard2.sh, ...@nightly-brew-standard1.sh, ...@nightly-brew-standard2.sh
New standard lifecycle scripts: define start_image, implement scenario_create_vms/scenario_remove_vms/scenario_run_tests, validate image availability, prepare kickstart, launch/remove VMs, and invoke test suites (standard1/standard2).
Scenario Scripts — Specialized (FIPS, IPv6, Dual-stack, TLS, Low-latency, AI, Multi-NIC, OSConfig, Storage/Telemetry)
test/scenarios-bootc/el10/releases/el102-lrel@ai-model-serving-online.sh, ...@dual-stack.sh, ...@fips.sh, ...@ipv6.sh, ...@low-latency.sh, ...@multi-nic.sh, ...@osconfig-router.sh, ...@storage-telemetry.sh, ...@tlsv13-standard1.sh
New specialized scenarios with platform gating (aarch64 skips), IPv6 handling (bridge IP, bracketed URLs), FIPS/crypto policy steps, tuned wait/boot-cycle checks, TLS v1.3 drop-ins, multi-NIC IP extraction, and custom test variable injection.
Scenario Scripts — Multi-config & Published-images
...@multi-config-standard1.sh, ...@multi-config-standard2.sh, ...@published-images-standard1.sh, ...@published-images-standard2.sh
Multi-config scripts include IPv6 kickstart preparation, TLS drop-in application, LVMS workload create/validate/cleanup, boot-cycle waits, and oc/kubeconfig setup. Published-images scripts compute release image URLs from BREW_LREL_RELEASE_VERSION, enable sigstore verification, and add firewall configuration steps.
Scenario Scripts — Upgrade / LVMS Flows
test/scenarios-bootc/el10/releases/el96-y2@el102-lrel@lvms.sh, ...@el102-lrel@standard1.sh, ...@standard2.sh, test/scenarios-bootc/el10/releases/el98-lrel@el102-lrel@lvms.sh, ...@standard1.sh, ...@standard2.sh, test/scenarios-bootc/el10/releases/el96-y1@el102-lrel@lvms.sh, ...@standard1.sh, ...@standard2.sh
New upgrade and LVMS-focused scenarios: set deterministic ordering, define start_image/dest_image, gate on both images, perform pre/post-upgrade LVMS workload operations, run upgrade suites and follow-up standard suites, and run Ginkgo tests in some flows.
Removed Scenarios
test/scenarios-bootc/el10/releases/el102-lrel@standard1.sh, test/scenarios-bootc/el10/releases/el102-lrel@standard2.sh
Deleted older standard1/standard2 scenario scripts (replaced/superseded by new variants with stricter image validation and additional setup).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


@openshift-ci openshift-ci bot requested review from copejon and jogeo March 25, 2026 14:53
@openshift-ci
Contributor

openshift-ci bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vanhalenar
Once this PR has been reviewed and has the lgtm label, please assign copejon for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@vanhalenar
Contributor Author

/test ?

@vanhalenar
Contributor Author

/test e2e-aws-tests-bootc-release-el10
/test e2e-aws-tests-bootc-release-el9


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🧹 Nitpick comments (1)
test/scenarios-bootc/el10/releases/el102-lrel@ipv6.sh (1)

21-28: Harden restoration of MIRROR_REGISTRY_URL on early exit.

If the function exits between Line 22 and Line 28, the overridden global can leak to later steps. Use a trap for guaranteed restore.

Proposed hardening
-    local original_mirror_registry_url="${MIRROR_REGISTRY_URL}"
-    MIRROR_REGISTRY_URL="$(hostname):${MIRROR_REGISTRY_PORT}/microshift"
+    local original_mirror_registry_url="${MIRROR_REGISTRY_URL}"
+    trap 'MIRROR_REGISTRY_URL="${original_mirror_registry_url}"' RETURN
+    MIRROR_REGISTRY_URL="$(hostname):${MIRROR_REGISTRY_PORT}/microshift"
@@
-    # Restore original MIRROR_REGISTRY_URL for runtime use
-    MIRROR_REGISTRY_URL="${original_mirror_registry_url}"
+    # restored by trap on function return
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `test/scenarios-bootc/el10/releases/el102-lrel@ipv6.sh` around lines 21-28,
The MIRROR_REGISTRY_URL override can leak if the script exits early; before
setting MIRROR_REGISTRY_URL="$(hostname):${MIRROR_REGISTRY_PORT}/microshift"
save the original in original_mirror_registry_url and install a trap on EXIT
that restores MIRROR_REGISTRY_URL="${original_mirror_registry_url}" and clears
the trap, then call prepare_kickstart (function name) and finally explicitly
restore and remove the trap after the call to avoid double-restores—this
guarantees the global is restored on any early exit.
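The trap-based restore above can be sketched as a self-contained example; the registry value, port, and function body are illustrative stand-ins, not the real scenario code:

```shell
#!/usr/bin/env bash
# Sketch of the RETURN-trap restore pattern (hypothetical values).
MIRROR_REGISTRY_URL="registry.example.com:5000/microshift"
MIRROR_REGISTRY_PORT=8443

prepare_kickstart_ipv6() {
    local original_mirror_registry_url="${MIRROR_REGISTRY_URL}"
    # The RETURN trap fires on every exit path of this function, so the
    # override cannot leak into later steps even on an early return.
    trap 'MIRROR_REGISTRY_URL="${original_mirror_registry_url}"' RETURN
    MIRROR_REGISTRY_URL="$(hostname):${MIRROR_REGISTRY_PORT}/microshift"
    # ... kickstart preparation would run here with the override in effect ...
}

prepare_kickstart_ipv6
echo "${MIRROR_REGISTRY_URL}"
```

Because the trap is installed before the override, any return path restores the global without needing an explicit restore line at the end of the function.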
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `test/image-blueprints-bootc/layer4-release/group1/rhel102-bootc-brew-nightly-with-optional.containerfile`:
- Around line 13-14: The containerfile COPY references
microshift-fast-datapath-rhel9.repo while building an EL10 image; create a
microshift-fast-datapath-rhel10.repo with the EL10 repository definitions and
replace references to microshift-fast-datapath-rhel9.repo in all EL10
containerfiles (or, if reuse is intentional, add documentation/comments
explaining why the EL9 repo is compatible) so that the COPY line and package
streams use an EL10-specific fast-datapath repo.

In `test/scenarios-bootc/el10/releases/el102-lrel@fips.sh`:
- Around line 7-12: The check_platform function currently hardcodes record_junit
"setup" which misreports skips when invoked from other hooks; modify
check_platform to accept a hook identifier parameter (e.g., check_platform
"$HOOK_NAME") and use that parameter in the record_junit call instead of the
hardcoded "setup", and update all call sites (the places invoking check_platform
from teardown/test hooks) to pass the correct hook string (for example
"teardown" or "test" combined with "scenario_create_vms") so the recorded JUnit
entry matches the actual hook context.

In `test/scenarios-bootc/el10/releases/el102-lrel@iso-standard1.sh`:
- Line 11: Replace the incorrect use of the source image variable start_image
when calling launch_vm with the intended VM profile name: change the call
launch_vm "${start_image}" --vm_vcpus 4 to launch_vm rhel102-bootc --vm_vcpus 4
so the VM uses the rhel102-bootc profile; apply the same fix in the sibling
script el102-lrel@iso-standard2.sh where launch_vm is passed start_image.

In `test/scenarios-bootc/el10/releases/el102-lrel@iso-standard2.sh`:
- Line 11: The call to launch_vm is using the source image variable start_image
but must pass the boot blueprint name; change the first argument of launch_vm
from ${start_image} to the boot profile name rhel102-bootc (keeping the existing
--vm_vcpus 4 flag) so launch_vm("rhel102-bootc", ...) is invoked and the correct
target is provisioned.

In `test/scenarios-bootc/el10/releases/el102-lrel@multi-config-standard1.sh`:
- Around line 21-22: The image-check logic in scenario_create_vms is
inconsistent with the standard2 variant: it calls exit_if_image_not_found (hard
fail) while the other script uses does_image_exist (soft skip); pick the
intended behavior and make both scripts match—either replace
exit_if_image_not_found with does_image_exist in scenario_create_vms to perform
a soft skip, or change the other script to use exit_if_image_not_found for a
hard failure; update the call in scenario_create_vms and any related
documentation/comments so both variants use the same image-checking function
(exit_if_image_not_found or does_image_exist).

In `test/scenarios-bootc/el10/releases/el102-lrel@multi-config-standard2.sh`:
- Around line 96-102: The comment "Validate LVMS still works after all tests" is
misleading because the run_command_on_vm call to execute
scripts/lvms-helpers/checkWorkloadExists.sh runs before run_tests; either move
the two lines that invoke checkWorkloadExists.sh and cleanupWorkload.sh (the
run_command_on_vm host1 'bash -s' invocations) to after the run_tests invocation
so validation truly occurs post-tests, or update the echo/comment text to
accurately state this is a pre-test validation; locate these calls by the
run_command_on_vm usage and the referenced scripts checkWorkloadExists.sh and
cleanupWorkload.sh and adjust accordingly.

In `test/scenarios-bootc/el10/releases/el102-lrel@multi-nic.sh`:
- Around line 25-26: The script extracts NIC IPs using head/tail which can pick
the wrong addresses when manage-vm.sh ip returns more or fewer entries; change
the logic that sets vm_ip1 and vm_ip2 to select by interface/network name (e.g.,
call "${ROOTDIR}/scripts/devenv-builder/manage-vm.sh" ip -n "${vmname}" <ifname>
or parse the output to find entries matching the desired interface labels) and
validate the returned list has exactly the expected IPs before calling
run_tests; update the assignments for vm_ip1 and vm_ip2 and add a guard that
errors out with a clear message if the expected addresses are not found for
vmname.

In `test/scenarios-bootc/el10/releases/el98-lrel@el102-lrel@standard1.sh`:
- Line 17: The VM launch target in the launch_vm invocation is wrong: replace
the image identifier used in the launch_vm command (currently "rhel96-bootc")
with the EL9.8 start image identifier (e.g., "rhel98-bootc") so the VM profile
matches the declared EL9.8 start image; update the single call to launch_vm
rhel96-bootc --vm_vcpus 4 to use the EL9.8 image name instead.

In `test/scenarios-bootc/el10/releases/el98-lrel@el102-lrel@standard2.sh`:
- Line 17: The launch call uses the wrong base image: replace the launch_vm
invocation that currently specifies "rhel96-bootc" with the EL9.8 base image
"rhel98-bootc" so the scenario tests the correct upgrade path; update the single
invocation of launch_vm (the line containing launch_vm rhel96-bootc) to use
rhel98-bootc.

---

Nitpick comments:
In `test/scenarios-bootc/el10/releases/el102-lrel@ipv6.sh`:
- Around line 21-28: The MIRROR_REGISTRY_URL override can leak if the script
exits early; before setting
MIRROR_REGISTRY_URL="$(hostname):${MIRROR_REGISTRY_PORT}/microshift" save the
original in original_mirror_registry_url and install a trap on EXIT that
restores MIRROR_REGISTRY_URL="${original_mirror_registry_url}" and clears the
trap, then call prepare_kickstart (function name) and finally explicitly restore
and remove the trap after the call to avoid double-restores—this guarantees the
global is restored on any early exit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 234be6cf-4d6a-4d72-8261-a0a977b95f69

📥 Commits

Reviewing files that changed from the base of the PR and between 8882ae3 and 1858af2.

📒 Files selected for processing (29)
  • test/image-blueprints-bootc/layer4-release/group1/rhel102-bootc-brew-nightly-with-optional.containerfile
  • test/image-blueprints-bootc/layer4-release/group2/rhel102-bootc-brew-lrel-fips.containerfile
  • test/image-blueprints-bootc/layer4-release/group2/rhel102-bootc-brew-lrel-tuned.containerfile
  • test/scenarios-bootc/el10/releases/el102-lrel@ai-model-serving-online.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@configuration.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@dual-stack.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@fips.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@ipv6.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@iso-standard1.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@iso-standard2.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@low-latency.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@multi-config-standard1.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@multi-config-standard2.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@multi-nic.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@nightly-brew-standard1.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@nightly-brew-standard2.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@osconfig-router.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@published-images-standard1.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@published-images-standard2.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@standard1.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@standard2.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@storage-telemetry.sh
  • test/scenarios-bootc/el10/releases/el102-lrel@tlsv13-standard1.sh
  • test/scenarios-bootc/el10/releases/el96-y2@el102-lrel@lvms.sh
  • test/scenarios-bootc/el10/releases/el96-y2@el102-lrel@standard1.sh
  • test/scenarios-bootc/el10/releases/el96-y2@el102-lrel@standard2.sh
  • test/scenarios-bootc/el10/releases/el98-lrel@el102-lrel@lvms.sh
  • test/scenarios-bootc/el10/releases/el98-lrel@el102-lrel@standard1.sh
  • test/scenarios-bootc/el10/releases/el98-lrel@el102-lrel@standard2.sh
💤 Files with no reviewable changes (2)

Comment on lines +13 to +14
COPY ./bootc-images/$USHIFT_RPM_REPO_NAME.repo ./bootc-images/microshift-fast-datapath-rhel9.repo ./bootc-images/microshift-rhocp-y.repo \
/etc/yum.repos.d/

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Available fast-datapath repo files:"
fd 'microshift-fast-datapath-.*\.repo$' . || true

echo
echo "Containerfiles referencing fast-datapath repo files:"
rg -n -C2 'microshift-fast-datapath-rhel(9|10)\.repo' .

Repository: openshift/microshift

Length of output: 17556


Create or use an EL10-specific fast-datapath repo.

Line 13 uses microshift-fast-datapath-rhel9.repo in an EL10.2 image. No microshift-fast-datapath-rhel10.repo exists in the codebase, and all EL10 containerfiles share this pattern. This major-version mismatch can cause wrong package stream selection and break the build.

Either create microshift-fast-datapath-rhel10.repo and update all EL10 containerfiles to reference it, or document why the EL9 repo is intentionally reused for EL10.
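If an EL10-specific repo file is created, it might look like the following sketch; the section name and baseurl are placeholders, not real repository definitions:

```ini
# bootc-images/microshift-fast-datapath-rhel10.repo (hypothetical)
[fast-datapath-for-rhel-10-rpms]
name=Fast Datapath for RHEL 10
baseurl=https://example.com/fast-datapath/rhel10/$basearch/os
enabled=1
gpgcheck=1
```

The COPY line in the EL10 containerfiles would then reference `microshift-fast-datapath-rhel10.repo` instead of the RHEL 9 file.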

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `test/image-blueprints-bootc/layer4-release/group1/rhel102-bootc-brew-nightly-with-optional.containerfile` around lines 13-14, The containerfile COPY references
microshift-fast-datapath-rhel9.repo while building an EL10 image; create a
microshift-fast-datapath-rhel10.repo with the EL10 repository definitions and
replace references to microshift-fast-datapath-rhel9.repo in all EL10
containerfiles (or, if reuse is intentional, add documentation/comments
explaining why the EL9 repo is compatible) so that the COPY line and package
streams use an EL10-specific fast-datapath repo.

Comment on lines +7 to +12
check_platform() {
if [[ "${UNAME_M}" =~ aarch64 ]] ; then
record_junit "setup" "scenario_create_vms" "SKIPPED"
exit 0
fi
}

⚠️ Potential issue | 🟠 Major

check_platform records the wrong JUnit hook when called outside setup.

Line 9 always records setup/scenario_create_vms, but this function is also called from teardown/test hooks (Line 23, Line 30), which can produce misleading skip reporting on aarch64.

Suggested fix pattern
-check_platform() {
-    if [[ "${UNAME_M}" =~ aarch64 ]] ; then
-        record_junit "setup" "scenario_create_vms" "SKIPPED"
+check_platform() {
+    local -r record_setup_skip=${1:-false}
+    if [[ "${UNAME_M}" =~ aarch64 ]] ; then
+        if "${record_setup_skip}"; then
+            record_junit "setup" "scenario_create_vms" "SKIPPED"
+        fi
         exit 0
     fi
 }
@@
-scenario_create_vms() {
-    check_platform
+scenario_create_vms() {
+    check_platform true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `test/scenarios-bootc/el10/releases/el102-lrel@fips.sh` around lines 7-12,
The check_platform function currently hardcodes record_junit "setup" which
misreports skips when invoked from other hooks; modify check_platform to accept
a hook identifier parameter (e.g., check_platform "$HOOK_NAME") and use that
parameter in the record_junit call instead of the hardcoded "setup", and update
all call sites (the places invoking check_platform from teardown/test hooks) to
pass the correct hook string (for example "teardown" or "test" combined with
"scenario_create_vms") so the recorded JUnit entry matches the actual hook
context.
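The parameterized check can be sketched as follows; `record_junit` is stubbed and the subshells stand in for the real hooks, so everything except the pattern itself is illustrative:

```shell
#!/usr/bin/env bash
# Sketch: check_platform takes the hook name so the JUnit record
# matches the calling context. record_junit is a stub for this demo.
record_junit() { echo "junit: ${1}/${2} -> ${3}"; }

check_platform() {
    local -r hook="${1:-setup}"
    if [[ "${UNAME_M}" =~ aarch64 ]]; then
        record_junit "${hook}" "scenario_create_vms" "SKIPPED"
        exit 0
    fi
}

UNAME_M=aarch64
# Run in subshells so 'exit 0' does not end this demo script.
(check_platform "setup")
(check_platform "teardown")
```

Each hook then reports a skip under its own name instead of always under `setup`.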

exit_if_image_not_found "${start_image}"

prepare_kickstart host1 kickstart-bootc.ks.template "${start_image}"
launch_vm "${start_image}" --vm_vcpus 4

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "launch_vm helper definition(s):"
rg -n -C4 --type=sh '^\s*(function\s+)?launch_vm\s*\(\)' . || true

echo
echo "EL10 bootc launch_vm call sites:"
rg -n -C1 --type=sh '^\s*launch_vm\s+' test/scenarios-bootc/el10/releases

Repository: openshift/microshift

Length of output: 9344


🏁 Script executed:

head -20 test/scenarios-bootc/el10/releases/el102-lrel@iso-standard1.sh

Repository: openshift/microshift

Length of output: 492


🏁 Script executed:

head -20 test/scenarios-bootc/el10/releases/el102-lrel@iso-standard2.sh

Repository: openshift/microshift

Length of output: 492


Pass the correct VM profile to launch_vm on line 11.

The variable ${start_image} is rhel102-bootc-brew-lrel-optional, which is the source image for provisioning. However, all peer EL10 bootc scenarios pass the VM profile name rhel102-bootc to launch_vm. Passing the source image name instead will cause the VM launch to fail or target the wrong profile.

Change launch_vm "${start_image}" --vm_vcpus 4 to launch_vm rhel102-bootc --vm_vcpus 4. (Same issue in el102-lrel@iso-standard2.sh.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `test/scenarios-bootc/el10/releases/el102-lrel@iso-standard1.sh` at line 11,
Replace the incorrect use of the source image variable start_image when calling
launch_vm with the intended VM profile name: change the call launch_vm
"${start_image}" --vm_vcpus 4 to launch_vm rhel102-bootc --vm_vcpus 4 so the VM
uses the rhel102-bootc profile; apply the same fix in the sibling script
el102-lrel@iso-standard2.sh where launch_vm is passed start_image.

exit_if_image_not_found "${start_image}"

prepare_kickstart host1 kickstart-bootc.ks.template "${start_image}"
launch_vm "${start_image}" --vm_vcpus 4

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "launch_vm helper definition(s):"
rg -n -C4 --type=sh '^\s*(function\s+)?launch_vm\s*\(\)' . || true

echo
echo "EL10 bootc launch_vm call sites:"
rg -n -C1 --type=sh '^\s*launch_vm\s+' test/scenarios-bootc/el10/releases

Repository: openshift/microshift

Length of output: 9344


🏁 Script executed:

sed -n '725,750p' test/bin/scenario.sh

Repository: openshift/microshift

Length of output: 841


Pass rhel102-bootc as the first argument to launch_vm, not ${start_image}.

Line 11 passes ${start_image} (the source image name), but launch_vm expects the bootc profile name (rhel102-bootc) as its first boot_blueprint argument, consistent with all other EL10 bootc scenarios. Using the wrong argument will provision the wrong target and break the test flow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `test/scenarios-bootc/el10/releases/el102-lrel@iso-standard2.sh` at line 11,
The call to launch_vm is using the source image variable start_image but must
pass the boot blueprint name; change the first argument of launch_vm from
${start_image} to the boot profile name rhel102-bootc (keeping the existing
--vm_vcpus 4 flag) so launch_vm("rhel102-bootc", ...) is invoked and the correct
target is provisioned.

Comment on lines +21 to +22
scenario_create_vms() {
exit_if_image_not_found "${start_image}"

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if other multi-config scenarios use consistent image checks
rg -n 'exit_if_image_not_found|does_image_exist' test/scenarios-bootc/el10/releases/

Repository: openshift/microshift

Length of output: 10234


Inconsistent image-checking between standard1 and standard2 variants.

el102-lrel@multi-config-standard1.sh uses exit_if_image_not_found (hard failure) while el102-lrel@multi-config-standard2.sh uses does_image_exist with soft skip. Align both if they should behave identically.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `test/scenarios-bootc/el10/releases/el102-lrel@multi-config-standard1.sh`
around lines 21 - 22, The image-check logic in scenario_create_vms is
inconsistent with the standard2 variant: it calls exit_if_image_not_found (hard
fail) while the other script uses does_image_exist (soft skip); pick the
intended behavior and make both scripts match—either replace
exit_if_image_not_found with does_image_exist in scenario_create_vms to perform
a soft skip, or change the other script to use exit_if_image_not_found for a
hard failure; update the call in scenario_create_vms and any related
documentation/comments so both variants use the same image-checking function
(exit_if_image_not_found or does_image_exist).
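If the soft-skip behavior is the intended one, the shared pattern might look like this sketch; `does_image_exist`, `record_junit`, and the image names are stubs for illustration:

```shell
#!/usr/bin/env bash
# Sketch of the soft-skip image check (all helpers stubbed).
does_image_exist() { [ "${1}" = "rhel102-bootc-brew-lrel" ]; }
record_junit() { echo "junit: ${1}/${2} -> ${3}"; }

scenario_create_vms() {
    local -r start_image="${1}"
    if ! does_image_exist "${start_image}"; then
        # Skip the scenario instead of hard-failing the run.
        record_junit "setup" "image_check" "SKIPPED"
        return 0
    fi
    echo "launching VM from ${start_image}"
}

scenario_create_vms "rhel102-bootc-brew-lrel"
scenario_create_vms "missing-image"
```

Whichever check wins, using the same helper in both standard1 and standard2 keeps the variants' failure semantics aligned.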

Comment on lines +96 to +102
# Validate LVMS still works after all tests
echo "INFO: Validating LVMS workloads after tests..."
run_command_on_vm host1 'bash -s' < "${TESTDIR}/../scripts/lvms-helpers/checkWorkloadExists.sh"

# Cleanup LVMS workloads
echo "INFO: Cleaning up LVMS workloads..."
run_command_on_vm host1 'bash -s' < "${TESTDIR}/../scripts/lvms-helpers/cleanupWorkload.sh"

⚠️ Potential issue | 🟡 Minor

Misleading comment: LVMS validation runs before tests, not after.

The comment "Validate LVMS still works after all tests" is inaccurate—this check runs before run_tests. If post-test validation is intended, move lines 97-98 after line 108. Otherwise, update the comment to reflect current behavior.

Suggested comment fix if current order is intentional
-    # Validate LVMS still works after all tests
-    echo "INFO: Validating LVMS workloads after tests..."
+    # Validate LVMS workloads exist before cleanup
+    echo "INFO: Validating LVMS workloads..."
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Validate LVMS still works after all tests
-echo "INFO: Validating LVMS workloads after tests..."
+# Validate LVMS workloads exist before cleanup
+echo "INFO: Validating LVMS workloads..."
 run_command_on_vm host1 'bash -s' < "${TESTDIR}/../scripts/lvms-helpers/checkWorkloadExists.sh"
 # Cleanup LVMS workloads
 echo "INFO: Cleaning up LVMS workloads..."
 run_command_on_vm host1 'bash -s' < "${TESTDIR}/../scripts/lvms-helpers/cleanupWorkload.sh"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `test/scenarios-bootc/el10/releases/el102-lrel@multi-config-standard2.sh`
around lines 96 - 102, The comment "Validate LVMS still works after all tests"
is misleading because the run_command_on_vm call to execute
scripts/lvms-helpers/checkWorkloadExists.sh runs before run_tests; either move
the two lines that invoke checkWorkloadExists.sh and cleanupWorkload.sh (the
run_command_on_vm host1 'bash -s' invocations) to after the run_tests invocation
so validation truly occurs post-tests, or update the echo/comment text to
accurately state this is a pre-test validation; locate these calls by the
run_command_on_vm usage and the referenced scripts checkWorkloadExists.sh and
cleanupWorkload.sh and adjust accordingly.

Comment on lines +25 to +26
local -r vm_ip1=$("${ROOTDIR}/scripts/devenv-builder/manage-vm.sh" ip -n "${vmname}" | head -1)
local -r vm_ip2=$("${ROOTDIR}/scripts/devenv-builder/manage-vm.sh" ip -n "${vmname}" | tail -1)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "manage-vm.sh ip implementation:"
rg -n -C6 '^\s*ip\)|domifaddr' scripts/devenv-builder/manage-vm.sh || true

echo
echo "Other call sites using manage-vm.sh ip:"
rg -n -C2 --type=sh 'manage-vm\.sh"\s+ip\s+-n|USHIFT_HOST_IP[12]' .

Repository: openshift/microshift

Length of output: 9789


Use interface/network names instead of head/tail for NIC selection.

The helper manage-vm.sh ip iterates IPv4 and IPv6 families, returning only addresses that respond to ping. It does not guarantee exactly 2 IPs—output depends on VM interfaces and ping success. Using head -1 and tail -1 risks selecting wrong NICs if the VM has extra addresses or if timeout skips an expected address. This pattern appears across all multi-nic scenarios identically.

Select addresses by interface name or validate the exact IP count before passing to run_tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `test/scenarios-bootc/el10/releases/el102-lrel@multi-nic.sh` around lines 25-26, The script extracts NIC IPs using head/tail which can pick the wrong
addresses when manage-vm.sh ip returns more or fewer entries; change the logic
that sets vm_ip1 and vm_ip2 to select by interface/network name (e.g., call
"${ROOTDIR}/scripts/devenv-builder/manage-vm.sh" ip -n "${vmname}" <ifname> or
parse the output to find entries matching the desired interface labels) and
validate the returned list has exactly the expected IPs before calling
run_tests; update the assignments for vm_ip1 and vm_ip2 and add a guard that
errors out with a clear message if the expected addresses are not found for
vmname.
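A count-validation guard can be sketched as follows; `get_vm_ips` stands in for the `manage-vm.sh ip -n <vmname>` call and the addresses are purely illustrative:

```shell
#!/usr/bin/env bash
# Sketch: fail loudly unless exactly two addresses come back, instead of
# silently taking head/tail of an arbitrary-length list.
get_vm_ips() { printf '%s\n' 192.168.111.10 192.168.122.10; }

mapfile -t ips < <(get_vm_ips)
if [ "${#ips[@]}" -ne 2 ]; then
    echo "ERROR: expected 2 IPs for VM, got ${#ips[@]}: ${ips[*]}" >&2
    exit 1
fi
vm_ip1="${ips[0]}"
vm_ip2="${ips[1]}"
echo "vm_ip1=${vm_ip1} vm_ip2=${vm_ip2}"
```

Selecting by interface name would be stricter still, but even this guard turns a wrong-NIC test run into an immediate, diagnosable error.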

@vanhalenar
Contributor Author

/test e2e-aws-tests-bootc-release-el10
/test e2e-aws-tests-bootc-release-el9


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `test/scenarios-bootc/el10/releases/el102-lrel@tlsv13-standard2.sh`:
- Around line 14-18: The teardown function scenario_remove_vms should not be
gated by registry/image checks: remove or relocate the call to
exit_if_image_not_found so it cannot prevent remove_vm host1 from running;
specifically, either delete the exit_if_image_not_found "${start_image}" line
from scenario_remove_vms or change it to a non-fatal check/log (e.g., warn) so
that remove_vm host1 always executes to avoid leaked VMs.

In `test/scenarios-bootc/el10/releases/el96-y1@el102-lrel@lvms.sh`:
- Around line 20-23: The teardown function scenario_remove_vms currently calls
exit_if_image_not_found("${start_image}") and
exit_if_image_not_found("${dest_image}") which can abort the teardown and leave
host1 running; change the logic so image-not-found checks do not call exit
during teardown (either remove these exit_* calls from scenario_remove_vms or
replace them with non-fatal checks that log a warning), and ensure remove_vm
host1 is always executed unconditionally at the end of scenario_remove_vms so
the VM is always cleaned up even if image lookups fail.
- Around line 42-55: The retry counter `iter` in checkLvmsResources.sh is
initialized once before the "for pod in" loop so all pods share retries; move
the initialization (iter=24) inside the loop so each pod starts with its own
retry budget, i.e., locate the "for pod in" loop in checkLvmsResources.sh and
add/reset iter=24 at the top of that loop body (before the per-pod retry/wait
logic) to ensure each pod gets independent retries.

In `test/scenarios-bootc/el10/releases/el96-y1@el102-lrel@standard1.sh`:
- Around line 20-25: The teardown function scenario_remove_vms must not abort
based on image lookup; remove or relocate the exit_if_image_not_found calls so
VM cleanup always runs — ensure scenario_remove_vms calls remove_vm host1
unconditionally (or only guards that don't call exit), and if image validation
is still needed keep it in setup code instead of inside scenario_remove_vms.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: d1b8e79b-c036-4cd6-bd7a-d86c3d378d48

📥 Commits

Reviewing files that changed from the base of the PR and between 1858af2 and 91a513d.

📒 Files selected for processing (6)
  • test/scenarios-bootc/el10/releases/el102-lrel@tlsv13-standard2.sh
  • test/scenarios-bootc/el10/releases/el96-y1@el102-lrel@lvms.sh
  • test/scenarios-bootc/el10/releases/el96-y1@el102-lrel@standard1.sh
  • test/scenarios-bootc/el10/releases/el96-y1@el102-lrel@standard2.sh
  • test/scenarios-bootc/el10/releases/el98-lrel@el102-lrel@standard1.sh
  • test/scenarios-bootc/el10/releases/el98-lrel@el102-lrel@standard2.sh
🚧 Files skipped from review as they are similar to previous changes (2)

Comment on lines +14 to +18
scenario_remove_vms() {
    exit_if_image_not_found "${start_image}"

    remove_vm host1
}


⚠️ Potential issue | 🟠 Major

Do not gate teardown on image availability.

On Line 15, exit_if_image_not_found can skip remove_vm host1 during registry/image lookup failures, which risks leaked VMs between retries.

Suggested fix
 scenario_remove_vms() {
-    exit_if_image_not_found "${start_image}"
-
     remove_vm host1
 }
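The non-fatal variant suggested above can be sketched as follows; `image_exists` and `remove_vm` here are illustrative stubs standing in for the real harness helpers, not the actual test framework API:

```shell
#!/usr/bin/env bash
# Sketch: teardown that warns on a missing image but never skips cleanup.
image_exists() { false; }                 # stub: simulate a failed image lookup
remove_vm()    { echo "removed $1"; }     # stub: stands in for the real helper

scenario_remove_vms() {
    if ! image_exists "start-image"; then
        echo "WARN: image not found; removing VM anyway" >&2
    fi
    remove_vm host1   # always runs, so the VM cannot leak between retries
}

scenario_remove_vms
```

Keeping the check as a warning preserves the diagnostic signal while guaranteeing the cleanup path executes.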

Comment on lines +20 to +23
scenario_remove_vms() {
    exit_if_image_not_found "${start_image}"
    exit_if_image_not_found "${dest_image}"



⚠️ Potential issue | 🟠 Major

Make teardown unconditional.

Line 21 and Line 22 can abort cleanup before remove_vm host1. If image lookup fails during teardown, the VM is left behind and CI resources leak.

Suggested fix
 scenario_remove_vms() {
-    exit_if_image_not_found "${start_image}"
-    exit_if_image_not_found "${dest_image}"
-
     remove_vm host1
 }

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."


Comment on lines +42 to +55
run_command_on_vm host1 'bash -s' < "${TESTDIR}/../scripts/lvms-helpers/checkLvmsResources.sh"

# Perform upgrade and validate
run_tests host1 \
    --variable "TARGET_REF:${dest_image}" \
    --variable "BOOTC_REGISTRY:${MIRROR_REGISTRY_URL}" \
    suites/upgrade/upgrade-successful.robot

# Post-upgrade: Validate LVMS workloads survived the upgrade
echo "INFO: Checking LVMS workloads survived upgrade..."
run_command_on_vm host1 'bash -s' < "${TESTDIR}/../scripts/lvms-helpers/checkWorkloadExists.sh"

echo "INFO: Checking LVMS resources after upgrade..."
run_command_on_vm host1 'bash -s' < "${TESTDIR}/../scripts/lvms-helpers/checkLvmsResources.sh"


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify retry counter scope in the helper used at Line 42 and Line 55.
# Expected: if `iter=` is outside `for pod ...`, retries are shared across pods (risk).

fd -i 'checkLvmsResources.sh' --exec sed -n '1,140p' {}
rg -n -C3 'iter=|for pod in|while \[\[ "\$\{result\}" != "Running"' scripts/lvms-helpers/checkLvmsResources.sh

Repository: openshift/microshift

Length of output: 2519


Fix shared retry counter in checkLvmsResources.sh pod check loop.

The iter variable at line 11 is initialized once before the for pod in loop, causing all pods to share the same retry counter. Each pod that fails consumes retries, leaving fewer attempts for subsequent pods. This causes the scenario to be flaky—later pods may not be waited on correctly if earlier pods exhaust retries. Move iter=24 inside the loop to reset retries for each pod.

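The per-pod reset described above can be sketched as follows; `check_pod` and `wait_for_pods` are illustrative stand-ins for the helper script's logic, with a small retry budget and no sleep for brevity (the real loop would sleep between attempts):

```shell
#!/usr/bin/env bash
# Sketch: reset the retry counter inside the loop so each pod
# gets its own budget instead of sharing one across all pods.
check_pod() { [ "$1" = "pod-b" ]; }   # stub: pod-a never ready, pod-b ready at once

wait_for_pods() {
    for pod in pod-a pod-b; do
        iter=3   # reset per pod: earlier failures must not starve later pods
        until check_pod "${pod}" || [ "${iter}" -eq 0 ]; do
            iter=$((iter - 1))
        done
        echo "${pod}: ${iter} retries left"
    done
}

wait_for_pods
```

With the counter initialized once outside the loop, pod-a's failures would have left pod-b with zero attempts; resetting inside the loop gives pod-b its full budget.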

Comment on lines +20 to +25
scenario_remove_vms() {
    exit_if_image_not_found "${start_image}"
    exit_if_image_not_found "${dest_image}"

    remove_vm host1
}


⚠️ Potential issue | 🟠 Major

Teardown should not depend on image availability

scenario_remove_vms() currently exits early if image lookup fails (Line 21-Line 22), which can skip VM cleanup and leak CI resources. Cleanup should run unconditionally.

Suggested fix
 scenario_remove_vms() {
-    exit_if_image_not_found "${start_image}"
-    exit_if_image_not_found "${dest_image}"
-
     remove_vm host1
 }

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."


@openshift-ci
Contributor

openshift-ci bot commented Mar 30, 2026

@vanhalenar: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                | Commit  | Details | Required | Rerun command
ci/prow/e2e-aws-tests-bootc-release-el10 | 91a513d | link    | true     | /test e2e-aws-tests-bootc-release-el10
ci/prow/e2e-aws-tests-bootc-release-el9  | 91a513d | link    | true     | /test e2e-aws-tests-bootc-release-el9

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
