Skip to content

Latest commit

 

History

History
239 lines (189 loc) · 9.44 KB

File metadata and controls

239 lines (189 loc) · 9.44 KB

Plan: Test Configuration Bundle — Operand Installation for Certsuite

Context

Issue #1 defines a standardized "Test Configuration Bundle" that operator owners provide so certification testers can deploy operands and validate them at scale (50+ operators) without domain expertise.

Problem: The certsuite script (run-basic-batch-operators-test.sh) installs operators but has no mechanism to install operands — except one hardcoded special case for multicluster-engine (lines 776-789). Without operands deployed, many certsuite tests can't exercise real workloads.

Solution: Two pieces of work:

  1. This repo (TestConfigurationBundle) — stores test bundles for operators, with a mapping file to look them up by name
  2. Certsuite script modification — after installing an operator and its CSV succeeds, look up its test bundle, apply CRs to deploy operands, and validate readiness before running tests

Decisions

  • Repo layout: Flat mapping file (operator-map.yaml) maps operator names to bundle directories
  • Certsuite integration: Modify run-basic-batch-operators-test.sh directly — clone this repo at script start, look up bundles per operator
  • Missing bundles: Optional/skip gracefully — if no bundle exists for an operator, log a warning and continue as-is
  • First operator: ocs-operator as the proof-of-concept bundle

Step 1: Set up TestConfigurationBundle repo structure

Create the following layout in this repo:

TestConfigurationBundle/
├── operator-map.yaml                    # Maps operator names to bundle dirs
├── bundles/
│   └── ocs-operator/
│       ├── 00-prerequisites.yaml        # Namespaces, RBAC, ConfigMaps (if needed)
│       ├── 01-operand-crs.yaml          # Operand CRs (default uncommented, others commented out)
│       ├── 02-validate.sh               # Script that exits 0 when operands are healthy
│       ├── metadata.yaml                # Operator info, requirements, health checks
│       └── teardown.yaml                # Cleanup resources (optional)
├── plan.md
└── CLAUDE.md

operator-map.yaml format:

operators:
  ocs-operator:
    bundle_path: bundles/ocs-operator
    description: "OpenShift Container Storage operator"

Sub-tasks:

  1. Create operator-map.yaml with ocs-operator entry
  2. Research ocs-operator CRD schema and create the bundle files:
    • metadata.yaml — operator name, channel, namespace, catalogSource, hardware requirements, health checks
    • 01-operand-crs.yaml — the CRs needed to deploy OCS operands (StorageCluster, etc.)
    • 02-validate.sh — check that OCS pods/operands reach Ready state
    • 00-prerequisites.yaml — any prerequisite resources (if needed)
    • teardown.yaml — cleanup
  3. Create CLAUDE.md for the repo

Step 2: Fork and modify certsuite script

Fork redhat-best-practices-for-k8s/certsuite and modify script/run-basic-batch-operators-test.sh.

2a. Add new input variable and clone logic (top of script)

Add near the existing INPUTS section:

# Test Configuration Bundle repo
TEST_BUNDLE_REPO="${TEST_BUNDLE_REPO:-https://github.com/opdev/TestConfigurationBundle.git}"
TEST_BUNDLE_BRANCH="${TEST_BUNDLE_BRANCH:-main}"
TEST_BUNDLE_DIR=""

Add a function to clone the repo at script start:

clone_test_bundles() {
    TEST_BUNDLE_DIR=$(mktemp -d)
    echo_color "$BLUE" "Cloning test bundle repo: $TEST_BUNDLE_REPO (branch: $TEST_BUNDLE_BRANCH)"
    if git clone --depth 1 --branch "$TEST_BUNDLE_BRANCH" "$TEST_BUNDLE_REPO" "$TEST_BUNDLE_DIR" 2>>"$LOG_FILE_PATH"; then
        echo_color "$BLUE" "Test bundle repo cloned to $TEST_BUNDLE_DIR"
    else
        echo_color "$RED" "Warning: Failed to clone test bundle repo. Operand installation will be skipped."
        TEST_BUNDLE_DIR=""
    fi
}

2b. Add operand installation function

install_operands() {
    local package_name=$1
    local namespace=$2
    local bundle_dir=""

    # Skip if test bundle repo not available
    if [ -z "$TEST_BUNDLE_DIR" ]; then
        echo_color "$GREY" "No test bundle repo available, skipping operand install for $package_name"
        return 0
    fi

    # Look up bundle path from operator-map.yaml
    bundle_dir=$(yq -r ".operators.\"$package_name\".bundle_path // empty" "$TEST_BUNDLE_DIR/operator-map.yaml" 2>/dev/null)
    if [ -z "$bundle_dir" ]; then
        echo_color "$GREY" "No test bundle found for $package_name, skipping operand install"
        return 0
    fi

    bundle_dir="$TEST_BUNDLE_DIR/$bundle_dir"

    echo_color "$BLUE" "Installing operands for $package_name from test bundle"

    # Apply prerequisites if present
    if [ -f "$bundle_dir/00-prerequisites.yaml" ]; then
        echo_color "$BLUE" "Applying prerequisites for $package_name"
        oc apply -f "$bundle_dir/00-prerequisites.yaml" 2>>"$LOG_FILE_PATH" || {
            echo_color "$RED" "Failed to apply prerequisites for $package_name"
            return 1
        }
    fi

    # Apply operand CRs
    if [ -f "$bundle_dir/01-operand-crs.yaml" ]; then
        echo_color "$BLUE" "Applying operand CRs for $package_name"
        oc apply -f "$bundle_dir/01-operand-crs.yaml" 2>>"$LOG_FILE_PATH" || {
            echo_color "$RED" "Failed to apply operand CRs for $package_name"
            return 1
        }
    fi

    # Run validation script if present
    if [ -x "$bundle_dir/02-validate.sh" ]; then
        local timeout
        timeout=$(yq -r '.health_check.timeout_seconds // 300' "$bundle_dir/metadata.yaml" 2>/dev/null)
        echo_color "$BLUE" "Running validation for $package_name (timeout: ${timeout}s)"
        if timeout "$timeout" "$bundle_dir/02-validate.sh" "$namespace" 2>>"$LOG_FILE_PATH"; then
            echo_color "$GREEN" "Operand validation passed for $package_name"
        else
            echo_color "$RED" "Operand validation failed for $package_name"
            return 1
        fi
    fi

    return 0
}

2c. Add teardown function

teardown_operands() {
    local package_name=$1
    local bundle_dir=""

    if [ -z "$TEST_BUNDLE_DIR" ]; then
        return 0
    fi

    bundle_dir=$(yq -r ".operators.\"$package_name\".bundle_path // empty" "$TEST_BUNDLE_DIR/operator-map.yaml" 2>/dev/null)
    if [ -z "$bundle_dir" ] || [ ! -f "$TEST_BUNDLE_DIR/$bundle_dir/teardown.yaml" ]; then
        return 0
    fi

    echo_color "$BLUE" "Tearing down operands for $package_name"
    oc delete -f "$TEST_BUNDLE_DIR/$bundle_dir/teardown.yaml" --ignore-not-found 2>>"$LOG_FILE_PATH" || true
}

2d. Injection points in the main loop

  1. Clone at script start — call clone_test_bundles after catalog creation and before the operator loop

  2. Install operands — inject after the existing hardcoded multicluster-engine block (which stays intact):

    # Generic operand installation from test bundle
    install_operands "$actual_package_name" "$ns" || {
        report_failure "$status" "$ns" "$actual_package_name" "$skip_cleanup" "Operand installation failed"
        continue
    }
  3. Teardown operands — inject before operator uninstall (before line 880):

    teardown_operands "$actual_package_name"
  4. Cleanup temp dir — at end of script, after catalog deletion:

    if [ -n "$TEST_BUNDLE_DIR" ]; then
        rm -rf "$TEST_BUNDLE_DIR"
    fi

2e. Dependencies

The script will need yq available to parse operator-map.yaml. Check if yq is already present in the CI environment; if not, add an install step or use a jq-based alternative with a converted JSON mapping file.


Step 3: Create the ocs-operator test bundle

Research the ocs-operator to create proper bundle files:

  1. Find the CRD schema: oc get crd storageclusters.ocs.openshift.io -o yaml (or from the operator repo)
  2. Create metadata.yaml: operator name, channel (stable), namespace (openshift-storage), catalogSource, health check conditions for StorageCluster
  3. Create 01-operand-crs.yaml: A minimal StorageCluster CR that deploys OCS operands
  4. Create 02-validate.sh: Wait for StorageCluster to reach Ready phase, check that OCS pods are running
  5. Create 00-prerequisites.yaml: Any required ConfigMaps or Secrets (if any)
  6. Create teardown.yaml: Delete the StorageCluster CR

Step 4: Test end-to-end

  1. Push the TestConfigurationBundle repo with the ocs-operator bundle
  2. Run the modified certsuite script with ocs-operator as the target
  3. Verify:
    • Script clones TestConfigurationBundle repo
    • Looks up ocs-operator in operator-map.yaml
    • Applies 01-operand-crs.yaml after operator CSV succeeds
    • Runs 02-validate.sh and it passes
    • Certsuite tests run against the live operands
    • Teardown cleans up the operand CRs
    • Operator is uninstalled normally

Files to create/modify

File Repo Action
operator-map.yaml TestConfigurationBundle Create
bundles/ocs-operator/metadata.yaml TestConfigurationBundle Create
bundles/ocs-operator/01-operand-crs.yaml TestConfigurationBundle Create
bundles/ocs-operator/02-validate.sh TestConfigurationBundle Create
bundles/ocs-operator/00-prerequisites.yaml TestConfigurationBundle Create (if needed)
bundles/ocs-operator/teardown.yaml TestConfigurationBundle Create
CLAUDE.md TestConfigurationBundle Create
script/run-basic-batch-operators-test.sh certsuite (fork) Modify