Skip to content

Primitives for qemu data disk reuse#28

Merged
solsson merged 3 commits into
mainfrom
qemu-data-disk-reuse
May 13, 2026
Merged

Primitives for qemu data disk reuse#28
solsson merged 3 commits into
mainfrom
qemu-data-disk-reuse

Conversation

@solsson
Copy link
Copy Markdown
Contributor

@solsson solsson commented May 13, 2026

No description provided.

Yolean k8s-qa added 3 commits May 13, 2026 08:05
Adds an operator-facing primitive that exposes the labeled
data-volume mechanic the appliance flow uses in cloud, but
through the local qemu provisioner. The maintainer no longer
has to round-trip through prepare-export + GCP / Hetzner
import to verify a disk-reuse scenario.

Surface:

  - QEMUConfig.DataDisk: path to a qcow2 attached as a
    LABEL=y-cluster-data ext4 volume. Empty (default) keeps
    today's behaviour. Relative paths resolve against the
    config-file directory (configfile DirAware convention).
  - QEMUConfig.DataDiskSize: size for a freshly-created
    DataDisk (default 10G; ignored when the file exists or
    when DataDisk is empty).

Behaviour:

  - Provision creates the file at the configured path with
    qemu-img create + virt-format if it doesn't exist;
    reuses it as-is when it does (the whole point of the
    primitive).
  - The cloud-init seed gains a `mounts:` entry pointing
    LABEL=y-cluster-data at /data/yolean with `nofail`, so
    the kernel mounts the labeled volume on every boot
    without needing prepare-export to have pre-baked the
    fstab line.
  - Teardown leaves the DataDisk file alone (it lives
    outside CacheDir, isn't in perVMArtefacts, and we log a
    "data disk preserved" line so the operator sees the
    path on the next teardown / re-provision pair).
  - virt-format is a conditional prereq: only required when
    the DataDisk file doesn't exist yet (existing-disk
    attachment uses qemu's built-in qcow2 support).

Coverage:

  - Unit: TestFromConfig_DataDiskDefaults pins
    absolute / relative / empty path semantics and the
    "no surprise default" rule for DataDiskSize when
    DataDisk is empty. TestRenderCloudInitUserData_*Mount*
    pins the cloud-init mount block. TestEnsureDataDisk_*
    pins the preserve-existing-bytes contract and the
    empty-path guard. Tag-driven schema regen included.

  - e2e (e2e && kvm): TestQemu_DataDisk_ReuseAcrossProvisions
    exercises the full round-trip: provision -> write
    sentinel under /data/yolean -> teardown (NO keep-disk)
    -> assert DataDisk survives + boot disk gone ->
    re-provision same DataDisk -> read sentinel back.
    Closes the coverage gap where the only existing
    appliance-volume e2e went through prepare-export +
    seed-tarball (a different code path).
Two of the three FRs from
specs/y-cluster/FEATURE_REQUEST_DISK_REUSE_E2E_PRIMITIVES.md
the checkit maintainer filed against the local-qemu disk-reuse
flow. FR 2 (`--first-boot-only`) is deferred -- the spec marks
it as needing a design pass on marker storage / mount semantics /
verb shape, and that conversation hasn't happened yet.

FR 1 -- `manifests replace` + `manifests rm`
============================================

Strict-in-both-directions verb set. No --force.

  - add     : name must NOT be staged. As a re-run convenience,
              an `add` with byte-identical content to an
              already-staged file succeeds silently ("my script
              ran twice, file unchanged" case) -- covers the
              checkit appliance-init.sh re-run pattern without
              weakening the strict-new contract.
  - replace : name MUST be staged, overwrites. Bails loud when
              the name isn't staged; the verb at the call site
              documents the operator's intent ("I know this is
              in use and I'm overwriting"). The same byte-equal
              short-circuit applies here for "no change".
  - rm      : name MUST be staged, removes. Bails loud when the
              name isn't staged.

The new prelude (readManifestInput) consolidates name-regex +
input-read + cluster-lookup + target-path so add/replace can't
drift on the validation steps. readStagedManifest /
writeStagedManifest / removeStagedManifest hold the cluster-
side shell commands in one place per verb.

Tests (cmd/y-cluster/manifests_test.go):

  - TestManifestsAddCmd_RejectsInvalidName  (existing, kept)
  - TestManifestsReplaceCmd_RejectsInvalidName
  - TestManifestsRmCmd_RejectsInvalidName
  - TestStagedManifestPath (pins the single touchpoint between
    the build-side verbs and prepare-export)
  - TestManifestsCmd_Subcommands (pins the three-verb surface so
    a future refactor can't accidentally drop one and silently
    turn a verb into a --force equivalent)

FR 3 -- `import` accepts qcow2
==============================

`y-cluster import` and the underlying `qemu.Import` now sniff
the input extension and dispatch:

  .vmdk  -> qemu-img convert -f vmdk  -O qcow2  (existing path)
  .qcow2 -> qemu-img convert -f qcow2 -O qcow2  (new; usually a
            copy + compaction, no foreign format conversion)

Lets a local-qemu-only e2e loop chain `y-cluster export
--format=qcow2` straight into `y-cluster import` without a
manual `qemu-img convert` step or `--format=vmdk` workaround.
The format-table lives in one switch (importFormatFromExt) so
adding raw / vdi / etc. later is a one-line change.

Other formats (raw, vdi, OVA, gcp-tar) intentionally not added:
they aren't on the import path today and the PoC doesn't need
them. The proposal sticks to formats `y-cluster export`
natively produces.

ImportVMDK kept as a deprecated thin wrapper around Import for
any out-of-tree caller pinned to the old name.

Tests (pkg/provision/qemu/qemu_test.go):

  - TestImportVMDK_MissingVMDK            (existing, kept)
  - TestImport_MissingInput               (both extensions)
  - TestImportFormatFromExt               (case-insensitive
    happy paths; explicit reject of unsupported / no-ext)

e2e (e2e/qemu_test.go::TestQemu_ExportImport_Qcow2):

  Full provision -> teardown(keep) -> export --format=qcow2 ->
  delete original boot disk -> Import qcow2 -> re-provision and
  SSH. Mirrors the existing VMDK-flavoured TestQemu_ExportImport,
  but exercises the no-foreign-format path.

FR 2 -- `--first-boot-only`
===========================

NOT included. The spec calls out three open design questions
the checkit maintainer wants to settle with the y-cluster
maintainer before any implementation lands: where the marker
file lives (data disk vs y-cluster-managed boot dir), whether
the wrapper Job auto-supplies a ServiceAccount + hostPath for
/etc/machine-id, and whether the flag lives on `add` (with a
flag-mismatch error from `replace`) or a separate verb
`add-first-boot`. Worth a design pass before code.

Refs:
- specs/y-cluster/FEATURE_REQUEST_DISK_REUSE_E2E_PRIMITIVES.md
Pre-FR 4 the import -> boot path was a deadlock:

  - `y-cluster import <disk>` writes <CacheDir>/<Name>.qcow2
  - `y-cluster provision` errors: "disk already exists; run
    teardown or start"
  - `y-cluster start` errors: "kubeconfig context X has no
    associated cluster; nothing to start" -- the disk has
    never been booted so no context was ever merged in

The customer-side appliance-upgrade case ("new image, keep data
disk") cannot complete with this deadlock; checkit's
appliance-disk-reuse-test.sh Phase 3 bails here.

Fix: `qemu.Provision` now detects an already-staged
<CacheDir>/<Name>.qcow2 (the only state `import` leaves) and
branches into a STAGED-DISK path that:

  - Skips `ensureCloudImage` + `ensureDisk` (the qcow2 IS the
    boot disk; nothing to fetch or create).
  - Still emits a fresh SSH keypair + cloud-init seed (so the
    customer's pubkey lands in the imported image's
    authorized_keys via cloud-init at first boot).
  - Boots the VM and waits for SSH.
  - Calls `waitForK3sReady` + `extractKubeconfig` + merges into
    the host kubeconfig. The pre-baked k3s on the appliance
    image is what we're attaching to.
  - SKIPS `writeRegistries`, `installK3s`, `localstorage.Install`,
    `envoygateway.Install`. Those were all baked in by the
    SOURCE appliance build. Re-running them would clobber the
    appliance's pre-baked state (overwrite GatewayClass /
    parametersRef, re-apply install.yaml at a different version
    than the appliance shipped, etc.).

The fresh-provision branch (no staged disk) is unchanged.

Coverage:

  TestQemu_ExportImport_Qcow2 (already e2e && kvm-gated from
  the FR 3 commit) exercises the full round-trip:
  Provision -> Teardown(keep) -> Export qcow2 -> rm disk ->
  Import qcow2 -> PROVISION (this is the FR 4 case) -> SSH +
  kubectl get nodes. Pre-FR 4 the second Provision hit the
  "disk already exists" error; with the staged-disk branch it
  reaches the in-cluster surface and returns a kubeconfig
  context that `kubectl get nodes` can talk to.

  TestQemu_ExportImport (the VMDK-flavoured counterpart) takes
  the same code path; same FR 4 fix unblocks it.

Refs:
  - specs/y-cluster/FEATURE_REQUEST_DISK_REUSE_E2E_PRIMITIVES.md
    (FR 4)
@solsson solsson merged commit c370b31 into main May 13, 2026
11 checks passed
solsson pushed a commit to Yolean/ystack that referenced this pull request May 13, 2026
Bumps host bin (bin/y-bin.runner.yaml) and the in-cluster
y-kustomize Deployment image. v0.4.6 adds primitives for qemu
data-disk reuse (Yolean/y-cluster#28).

Image digest verified via `crane digest ghcr.io/yolean/y-cluster:v0.4.6`.
sha256 sums copied from the release's checksums.txt.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant