Upgrade Toolkit is the primary component of Harvester Upgrade V2. It includes the Upgrade Manager and other auxiliary components that work together to facilitate the Upgrade V2 mechanics.
Upgrade Toolkit is packaged as a Helm chart, Harvester Upgrade Manager. You can install it via Helm:
helm upgrade --install harvester-upgrade-manager harvester-upgrade-manager \
--repo=https://charts.harvesterhci.io \
--namespace=harvester-system

Create a Version CR in the `harvester-system` namespace. This is almost the same as before.
cat <<EOF | kubectl apply -f -
apiVersion: harvesterhci.io/v1beta1
kind: Version
metadata:
  name: master-head
  namespace: harvester-system
spec:
  isoURL: https://releases.rancher.com/harvester/master/harvester-master-amd64.iso
EOF

Create an UpgradePlan CR with the desired version.
cat <<EOF | kubectl create -f -
apiVersion: management.harvesterhci.io/v1beta1
kind: UpgradePlan
metadata:
  generateName: hvst-upgrade-
spec:
  version: master-head
EOF

Additionally, an upgrade can be triggered by creating an UpgradePlan CR that uses an existing ISO image on the cluster. The ISO image can be downloaded from a URL or uploaded to the cluster using the Harvester UI or CLI, and then referenced in the UpgradePlan CR.
For instance, to download the latest Harvester ISO from the releases page and use it for an upgrade, you can create a VirtualMachineImage CR as shown below:
cat <<EOF | kubectl create -f -
apiVersion: harvesterhci.io/v1beta1
kind: VirtualMachineImage
metadata:
  annotations:
    harvesterhci.io/os-upgrade-image: "True"
  name: harvester-master-amd64
  namespace: harvester-system
spec:
  backend: cdi
  displayName: harvester-master-amd64.iso
  sourceType: download
  url: https://releases.rancher.com/harvester/master/harvester-master-amd64.iso
  checksum: ""
  retry: 3
  targetStorageClassName: longhorn-static
EOF

Later, when the image is ready (you don't actually need to wait; the controller automatically picks it up as soon as it becomes ready), you can create an UpgradePlan CR that references it. No Version CR is needed:
cat <<EOF | kubectl create -f -
apiVersion: management.harvesterhci.io/v1beta1
kind: UpgradePlan
metadata:
  generateName: hvst-upgrade-
spec:
  image: harvester-master-amd64
EOF

Upgrade Toolkit also supports upgrading a Harvester cluster using container images that are not packaged in the ISO image, for both the Upgrade Repo and the node-specific upgrade jobs. To do so, see below.
When creating the UpgradePlan CR, specify a different container image tag:
cat <<EOF | kubectl create -f -
apiVersion: management.harvesterhci.io/v1beta1
kind: UpgradePlan
metadata:
  generateName: hvst-upgrade-
spec:
  version: master-head
  upgrade: main-head
EOF

Or, optionally, specify a few options to customize the upgrade process:
cat <<EOF | kubectl create -f -
apiVersion: management.harvesterhci.io/v1beta1
kind: UpgradePlan
metadata:
  generateName: hvst-upgrade-
spec:
  version: master-head
  upgrade: main-head
  imagePreloadOption:
    concurrency: -1
  nodeUpgradeOption:
    pauseNodes:
    - charlie-1-tink-system
    - charlie-3-tink-system
  restoreVM: true
EOF

For all the available options, see the output of `kubectl explain upgradeplans.spec`.
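For example, a plan that skips the eligibility check and image preloading could combine `force` with a negative preload concurrency (a sketch; `spec.force` and the negative-concurrency skip are described under the ImagePreload phase, and the field placement is assumed from the examples above):

```yaml
apiVersion: management.harvesterhci.io/v1beta1
kind: UpgradePlan
metadata:
  generateName: hvst-upgrade-
spec:
  version: master-head
  force: true              # skips the upgrade eligibility check
  imagePreloadOption:
    concurrency: -1        # negative value skips image preloading entirely
```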
A successfully executed UpgradePlan looks like the following:
apiVersion: management.harvesterhci.io/v1beta1
kind: UpgradePlan
metadata:
  creationTimestamp: "2026-03-13T03:57:09Z"
  generateName: hvst-upgrade-
  generation: 3
  name: hvst-upgrade-864fh
  resourceVersion: "219468"
  uid: 24fde87d-491b-4af8-bafd-001fceb20a62
spec:
  imagePreloadOption:
    concurrency: 100
  nodeUpgradeOption: {}
  restoreVM: true
  upgrade: main-head
  version: v1.8.0-rc1
status:
  conditions:
  - lastTransitionTime: "2026-03-13T05:32:57Z"
    message: UpgradePlan has completed
    observedGeneration: 3
    reason: Succeeded
    status: "False"
    type: Progressing
  - lastTransitionTime: "2026-03-13T05:32:57Z"
    message: ""
    observedGeneration: 3
    reason: ReconcileSuccess
    status: "False"
    type: Degraded
  - lastTransitionTime: "2026-03-13T05:32:57Z"
    message: Entered one of the terminal phases
    observedGeneration: 3
    reason: Executed
    status: "False"
    type: Available
  currentPhase: Succeeded
  isoImageID: hvst-upgrade-864fh-iso
  nodeUpgradeStatuses:
    charlie-1-tink-system:
      state: ImageCleaned
    charlie-2-tink-system:
      state: ImageCleaned
    charlie-3-tink-system:
      state: ImageCleaned
  phaseTransitionTimestamps:
  - phase: Initializing
    phaseTransitionTimestamp: "2026-03-13T03:57:09Z"
  - phase: Initialized
    phaseTransitionTimestamp: "2026-03-13T03:57:09Z"
  - phase: ISODownloading
    phaseTransitionTimestamp: "2026-03-13T03:57:09Z"
  - phase: ISODownloaded
    phaseTransitionTimestamp: "2026-03-13T04:01:53Z"
  - phase: RepoCreating
    phaseTransitionTimestamp: "2026-03-13T04:01:53Z"
  - phase: RepoCreated
    phaseTransitionTimestamp: "2026-03-13T04:01:58Z"
  - phase: MetadataPopulating
    phaseTransitionTimestamp: "2026-03-13T04:01:58Z"
  - phase: MetadataPopulated
    phaseTransitionTimestamp: "2026-03-13T04:01:59Z"
  - phase: ImagePreloading
    phaseTransitionTimestamp: "2026-03-13T04:01:59Z"
  - phase: ImagePreloaded
    phaseTransitionTimestamp: "2026-03-13T04:12:00Z"
  - phase: ClusterUpgrading
    phaseTransitionTimestamp: "2026-03-13T04:12:00Z"
  - phase: ClusterUpgraded
    phaseTransitionTimestamp: "2026-03-13T04:27:15Z"
  - phase: NodeUpgrading
    phaseTransitionTimestamp: "2026-03-13T04:27:15Z"
  - phase: NodeUpgraded
    phaseTransitionTimestamp: "2026-03-13T05:30:42Z"
  - phase: CleaningUp
    phaseTransitionTimestamp: "2026-03-13T05:30:43Z"
  - phase: CleanedUp
    phaseTransitionTimestamp: "2026-03-13T05:32:56Z"
  - phase: Succeeded
    phaseTransitionTimestamp: "2026-03-13T05:32:57Z"
  previousVersion: v1.7.1
  provisionGeneration: 1
  releaseMetadata:
    harvester: v1.8.0-rc1
    harvesterChart: 1.8.0-rc1
    kubernetes: v1.35.2+rke2r1
    minUpgradableVersion: v1.7.0
    monitoringChart: 108.0.2+up77.9.1-rancher.11
    os: Harvester v1.8.0-rc1
    rancher: v2.14.0-alpha9
  version:
    isoChecksum: 3c6f98efc02959da524828b0c44273c8375b7815ce5fcf1c11581479d979daa6e184f3d20535041137282c3d2b4a12d9ba3ce847b051e9aecb5310b73f19523c
    isoURL: https://releases.rancher.com/harvester/v1.8.0-rc1/harvester-v1.8.0-rc1-amd64.iso

During the upgrade, events are emitted at phase transitions and key points. View the events for an UpgradePlan with `kubectl describe upgradeplans <upgradeplan-name>`:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal PhaseTransition 105m upgradeplan-controller Entering phase ISODownload
Warning ReconcileError 105m upgradeplan-controller Pipeline error: VirtualMachineImage.harvesterhci.io "hvst-upgrade-864fh-iso" not found
Normal PhaseCompleted 100m upgradeplan-controller Completed phase ISODownload
Normal PhaseTransition 100m upgradeplan-controller Entering phase RepoCreate
Warning ReconcileError 100m upgradeplan-controller Pipeline error: Deployment.apps "hvst-upgrade-864fh-repo" not found
Warning ReconcileError 100m upgradeplan-controller Pipeline error: Service "hvst-upgrade-864fh-repo" not found
Normal PhaseCompleted 100m upgradeplan-controller Completed phase RepoCreate
Normal PhaseTransition 100m upgradeplan-controller Entering phase MetadataPopulate
Normal PhaseCompleted 100m upgradeplan-controller Completed phase MetadataPopulate
Normal PhaseTransition 100m upgradeplan-controller Entering phase ImagePreload
Warning ReconcileError 100m upgradeplan-controller Pipeline error: Plan.upgrade.cattle.io "hvst-upgrade-864fh-image-preload" not found
Normal PhaseCompleted 90m upgradeplan-controller Completed phase ImagePreload
Normal PhaseTransition 90m upgradeplan-controller Entering phase ClusterUpgrade
Normal PhaseCompleted 75m upgradeplan-controller Completed phase ClusterUpgrade
Normal PhaseTransition 75m upgradeplan-controller Entering phase NodeUpgrade
Normal RestoreVMConfigMapCreated 63m vm-live-migrate-detector ConfigMap harvester-system/hvst-upgrade-864fh-restore-vm created
Normal VMShutdownCompleted 63m vm-live-migrate-detector Shutdown completed for 0 VM(s) on node charlie-1-tink-system, success: 0, failed: 0
Normal VMShutdownCompleted 39m vm-live-migrate-detector Shutdown completed for 1 VM(s) on node charlie-2-tink-system, success: 1, failed: 0
Normal RestoreVMCompleted 26m restore-vm Restored 1 VMs for node charlie-2-tink-system during upgrade hvst-upgrade-864fh, success: 1, failed: 0
Normal VMShutdownCompleted 25m vm-live-migrate-detector Shutdown completed for 1 VM(s) on node charlie-3-tink-system, success: 1, failed: 0
Warning ReconcileError 12m upgradeplan-controller Pipeline error: waiting for Rancher to complete node upgrades: secret custom-58afebea1719-machine-plan still has rke.cattle.io/post-drain annotation
Normal PhaseCompleted 12m upgradeplan-controller Completed phase NodeUpgrade
Normal PhaseTransition 12m upgradeplan-controller Entering phase ImageCleanup
Warning ReconcileError 12m upgradeplan-controller Pipeline error: Plan.upgrade.cattle.io "hvst-upgrade-864fh-image-cleanup" not found
Normal RestoreVMCompleted 11m restore-vm Restored 1 VMs for node charlie-3-tink-system during upgrade hvst-upgrade-864fh, success: 1, failed: 0
Normal PhaseCompleted 9m50s (x2 over 9m51s) upgradeplan-controller Completed phase ImageCleanup
Normal UpgradeSucceeded 9m50s (x2 over 9m50s) upgradeplan-controller Upgrade completed successfully
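Besides `kubectl describe`, you can also list the emitted events with a field selector (a sketch; the command is guarded so it can be run even without a reachable cluster):

```shell
# List events emitted for UpgradePlan objects across all namespaces,
# sorted by recency; falls back to a message when no cluster is reachable.
kubectl get events -A \
  --field-selector involvedObject.kind=UpgradePlan \
  --sort-by=.lastTimestamp 2>/dev/null || echo "cluster not reachable"
```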
The following annotations can be set on an UpgradePlan CR to skip or override specific checks and behaviors.
| Annotation | Value | Scope | Description |
|---|---|---|---|
| `management.harvesterhci.io/skip-webhook` | `"true"` | Webhook (create) | Bypasses all create-time validation checks |
| `management.harvesterhci.io/skip-single-replica-detached-vol` | `"true"` | Webhook (create) | Skips the detached single-replica Longhorn volume check (active single-replica volumes are still blocked) |
| `management.harvesterhci.io/allow-deletion` | `"true"` | Webhook (delete) | Allows deletion of a progressing UpgradePlan (hard-blocked during ClusterUpgrading and NodeUpgrading phases regardless) |
| `management.harvesterhci.io/skip-garbage-collection-threshold-check` | `"true"` | Controller (init phase) | Skips the kubelet disk-space / image GC threshold pre-flight check |
| `management.harvesterhci.io/min-certs-expiration-in-day` | Integer > 0 | Controller (init phase) | Overrides the minimum certificate expiration window in days (default: 7) |
| `management.harvesterhci.io/upgrade-toolkit-image` | Image repo+name | Controller (all phases) | Overrides the default upgrade-toolkit container image (`rancher/harvester-upgrade-toolkit`); tag is still controlled by `spec.upgrade` |
Example usage:
cat <<EOF | kubectl create -f -
apiVersion: management.harvesterhci.io/v1beta1
kind: UpgradePlan
metadata:
  generateName: hvst-upgrade-
  annotations:
    management.harvesterhci.io/skip-single-replica-detached-vol: "true"
    management.harvesterhci.io/min-certs-expiration-in-day: "3"
spec:
  version: master-head
EOF

The upgrade lifecycle is driven by a phase-based state machine. An UpgradePlan CR progresses through a strict sequence of phases (tracked in `status.currentPhase`). Each phase has an active (`...ing`) and a completed (`...ed`) value. Certain phases can transition directly to `Failed` on unrecoverable errors (see the diagram below).
stateDiagram-v2
[*] --> Initialize
Initialize --> ISODownload
ISODownload --> RepoCreate
RepoCreate --> MetadataPopulate
MetadataPopulate --> ImagePreload
ImagePreload --> ClusterUpgrade
ClusterUpgrade --> NodeUpgrade
NodeUpgrade --> ImageCleanup
ImageCleanup --> Succeeded
Initialize --> Failed
ISODownload --> Failed
ImagePreload --> Failed
ClusterUpgrade --> Failed
NodeUpgrade --> Failed
ImageCleanup --> Failed
Succeeded --> [*]
Failed --> [*]
The 8 phases are:

- Initialize (`Initializing`/`Initialized`): Loads the Version snapshot (when `spec.image` is not set), records the previous Harvester version, detects single-node clusters, and then runs pre-flight checks (disk-space projection against the kubelet image GC threshold and API server certificate expiration). Pre-flight failures are terminal.
- ISODownload (`ISODownloading`/`ISODownloaded`): Downloads the upgrade ISO via a VirtualMachineImage, or adopts a pre-uploaded image specified in `spec.image`.
- RepoCreate (`RepoCreating`/`RepoCreated`): Deploys an Nginx Deployment and Service to serve the ISO contents as an upgrade repository.
- MetadataPopulate (`MetadataPopulating`/`MetadataPopulated`): Fetches release metadata from the upgrade repository and populates `status.releaseMetadata` (Harvester, HarvesterChart, OS, Kubernetes, Rancher, MonitoringChart, MinUpgradableVersion).
- ImagePreload (`ImagePreloading`/`ImagePreloaded`): Before entering `ImagePreloading`, runs an upgrade eligibility check (skipped when `spec.force` is true); failure is terminal. Then preloads container images onto all nodes via a system-upgrade-controller Plan. Concurrency is configurable; set it to a negative value to skip preloading entirely.
- ClusterUpgrade (`ClusterUpgrading`/`ClusterUpgraded`): Applies cluster-level upgrade manifests via a Kubernetes Job.
- NodeUpgrade (`NodeUpgrading`/`NodeUpgraded`): Upgrades individual nodes. Multi-node clusters use Rancher V2 Provisioning drain hooks; single-node clusters use a direct Job-based upgrade.
- ImageCleanup (`CleaningUp`/`CleanedUp`): Removes stale container images from all nodes via a system-upgrade-controller Plan.
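To observe the phase progression from the CLI, a small sketch (the guard makes it safe to run without a reachable cluster or an existing plan):

```shell
# Print the current phase of the most recently created UpgradePlan.
plan=$(kubectl get upgradeplans -o name --sort-by=.metadata.creationTimestamp 2>/dev/null | tail -n 1)
if [ -n "$plan" ]; then
  kubectl get "$plan" -o jsonpath='{.status.currentPhase}{"\n"}'
else
  echo "no UpgradePlan found"
fi
```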
During the ImagePreload, NodeUpgrade, and ImageCleanup phases, each node tracks its own state. The node states are grouped into ordinal tiers; a node can only move forward, never backward.
Multi-node cluster:
stateDiagram-v2
[*] --> ImagePreloading
state "ImagePreload" as ip {
ImagePreloading --> ImagePreloaded
ImagePreloading --> ImagePreloadFailed
}
ImagePreloaded --> UpgradePaused : if in pauseNodes
ImagePreloaded --> PreDraining : otherwise
state "NodeUpgrade (multi-node)" as nu {
UpgradePaused --> PreDraining : removed from pauseNodes
PreDraining --> PreDrained
PreDraining --> PreDrainFailed
PreDrained --> PostDraining
PostDraining --> WaitingReboot
PostDraining --> PostDrainFailed
WaitingReboot --> PostDrained
}
PostDrained --> ImageCleaning
state "ImageCleanup" as ic {
ImageCleaning --> ImageCleaned
ImageCleaning --> ImageCleanFailed
}
ImageCleaned --> [*]
Single-node cluster:
stateDiagram-v2
[*] --> ImagePreloading
state "ImagePreload" as ip {
ImagePreloading --> ImagePreloaded
ImagePreloading --> ImagePreloadFailed
}
ImagePreloaded --> UpgradePaused : if in pauseNodes
ImagePreloaded --> SingleNodeUpgrading : otherwise
state "NodeUpgrade (single-node)" as nu {
UpgradePaused --> SingleNodeUpgrading : removed from pauseNodes
SingleNodeUpgrading --> SingleNodeUpgraded
SingleNodeUpgrading --> SingleNodeUpgradeFailed
}
SingleNodeUpgraded --> ImageCleaning
state "ImageCleanup" as ic {
ImageCleaning --> ImageCleaned
ImageCleaning --> ImageCleanFailed
}
ImageCleaned --> [*]
Key points about node state transitions:
- Forward-only: Node states are organized into ordinal groups (0-9). A node's state can only advance to a higher group, never regress.
- Pause control: Nodes listed in `spec.nodeUpgradeOption.pauseNodes` enter `UpgradePaused` after image preload completes, in both multi-node and single-node clusters. They resume when removed from the list.
- Failure states: During NodeUpgrade, the states `PreDrainFailed`, `PostDrainFailed`, and `SingleNodeUpgradeFailed` cause the overall UpgradePlan to transition to `Failed`. During ImagePreload and ImageCleanup, the overall phase fails when the system-upgrade-controller Plan's job fails, which is detected independently of per-node states.
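For example, resuming paused nodes amounts to removing them from `spec.nodeUpgradeOption.pauseNodes`; a hedged sketch that empties the list with a merge patch (the plan name is a placeholder):

```shell
# Resume all paused nodes on an UpgradePlan by emptying pauseNodes.
# PLAN_NAME is a placeholder; substitute the generated name of your plan.
PLAN_NAME=${PLAN_NAME:-hvst-upgrade-864fh}
kubectl patch upgradeplan "$PLAN_NAME" --type=merge \
  -p '{"spec":{"nodeUpgradeOption":{"pauseNodes":[]}}}' 2>/dev/null \
  || echo "patch not applied (no cluster reachable or plan not found)"
```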
After making changes, build and test the upgrade-toolkit binary and container image.
# Lint the code
make lint
# Run unit tests and integration tests
make test
# Build the upgrade-toolkit binary (under `bin/`)
make build
# Build the container image
# The built image will be tagged with `rancher/harvester-upgrade-toolkit:<branch>-head`
make docker-build

To build and push the container image, run:
# Adapt the `REPO` value below to your own Docker Hub repository
REPO=starbops make docker-buildx

Upgrade Toolkit comes with a set of Kustomize manifests that enable easy installation.
To build or update the Kustomize manifests, run:
make manifests

The generated output is located in config/, and can be deployed with the following command:
# Specify the image name and tag in `IMG`
make deploy IMG=starbops/harvester-upgrade-toolkit:dev

Upgrade Toolkit comes with a single file of installer manifests that enables easy installation.
To build or update the installer manifests, run:
# Specify the image name and tag in `IMG`
make build-installer IMG=starbops/harvester-upgrade-toolkit:dev

The built installer manifests are located in dist/installer.yaml, and can be installed via kubectl apply:
kubectl apply -f dist/installer.yaml

Upgrade Toolkit leverages Kubebuilder's Helm plugin to manage the local Helm chart.
Note
Kubebuilder's Helm plugin generates Helm charts from the installer manifests. Furthermore, make build-installer depends on the Kustomize manifests generated by make manifests, so it is recommended to run make manifests first, update the Kustomize manifests under config/, and then generate the Helm chart to ensure everything is in sync.
# Update the local Helm chart
kubebuilder edit --plugins=helm/v2-alpha

Note
The kubebuilder edit --plugins=helm/v2-alpha command regenerates all template files under dist/chart/templates/. It does not preserve manual edits to templates. After running the plugin, the following manual fixups are required:
- Delete `dist/chart/templates/cert-manager/` (the project does not use cert-manager)
- Delete `dist/chart/templates/webhook/mutating-webhook-configuration.yaml` and `validating-webhook-configuration.yaml` (replaced by the consolidated `webhook.yaml`)
- In `dist/chart/templates/manager/manager.yaml`, replace all occurrences of `.Values.certManager.enable` with `.Values.webhook.enable`
- In `dist/chart/templates/monitoring/servicemonitor.yaml`, remove the cert-manager TLS configuration block and use `insecureSkipVerify: true` only
The dist/chart/templates/webhook/webhook.yaml (which uses genCA()/genSignedCert() for self-signed cert generation) is not affected because the plugin does not delete unrecognized files.
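The fixups above can be scripted; a sketch using the exact paths listed (the `sed` expression is illustrative, each step is guarded so the script can be re-run safely, and the `servicemonitor.yaml` TLS block still needs a manual edit):

```shell
# Run after `kubebuilder edit --plugins=helm/v2-alpha` to re-apply the manual fixups.
set -u
rm -rf dist/chart/templates/cert-manager/
rm -f dist/chart/templates/webhook/mutating-webhook-configuration.yaml \
      dist/chart/templates/webhook/validating-webhook-configuration.yaml
# Swap the cert-manager gate for the webhook gate in the manager manifest
manager=dist/chart/templates/manager/manager.yaml
[ -f "$manager" ] && sed -i 's/\.Values\.certManager\.enable/.Values.webhook.enable/g' "$manager" || true
```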
Every time you make changes to the code, especially in the control loop, you may want to see the changes in action locally from your IDE or terminal.
To do so, make sure you have a Harvester cluster running that can be accessed via kubectl.
Install the UpgradePlan CRD:
# Make sure you have a valid KUBECONFIG env var, pointing to your cluster
make install

Run the controller manager locally (without starting the webhook server):
ENABLE_WEBHOOKS=false make run

Create the Version and UpgradePlan CRs to kickstart the upgrade process.
After the UpgradePlan CR passes the RepoCreated phase, set up a port-forward to allow the local controller manager to access the remote Upgrade Repo.
UP_NAME=$(kubectl get upgradeplans -o json | \
jq -r '.items[]
| select(any(.status.conditions[]; .type=="Progressing" and .status=="True"))
| .metadata.name')
# If privileges are not sufficient, run the following command as root with `sudo -E` prepended:
kubectl -n harvester-system port-forward svc/$UP_NAME-repo 80:80

The local controller manager should be able to access the remote Upgrade Repo, advance to the MetadataPopulated phase, and proceed further.
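To sanity-check that the Upgrade Repo is reachable through the port-forward (a simple probe, assuming the forward listens on localhost port 80):

```shell
# Probe the forwarded Upgrade Repo endpoint; prints a status either way.
if curl -s -o /dev/null --max-time 5 http://localhost:80/; then
  echo "upgrade repo reachable"
else
  echo "upgrade repo not reachable (is the port-forward running?)"
fi
```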
Make sure you have the container image built and pushed to a registry.
# Specify the image name and tag in `IMG`
make helm-deploy IMG=starbops/harvester-upgrade-toolkit:dev

Create the Version and UpgradePlan CRs to kickstart the upgrade process.
The phase-based runner design facilitates well-organized phase ordering and allows for the easy integration of new phases.
Let's say we want to introduce a new phase called PreCheck. There are a few places in the codebase that we need to modify:

- Update the `pkg/upgradeplan/pipeline.go` file
- Create the new `pkg/upgradeplan/phase_precheck.go` file
Copyright 2025-2026 SUSE, LLC.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.