chore(storage): optimize zonal system tests CloudBuild and make concurrency-safe #17171

Merged
chandra-siri merged 2 commits into googleapis:main from chandra-siri:fix-zonal-system-tests-cloudbuild
May 19, 2026

@chandra-siri chandra-siri commented May 19, 2026

This PR optimizes the zonal system tests CloudBuild configuration (zb-system-tests-cloudbuild.yaml) and its execution script (run_zonal_tests.sh) to make the test execution robust, fast, and concurrency-safe.

Proposed Changes

  • Direct Workspace Packaging & SCP Transfer: Replaces git-cloning and fetching inside the GCE VM with packaging the local /workspace/packages/google-cloud-storage directory on the Cloud Build runner and copying the tarball via scp to the VM. This eliminates Git and repo-cloning requirements on the GCE VM, improving speed and reliability.
  • OS Login TTL Keys for Concurrency Safety: Replaces the cleanup-old-keys step, which deleted all OS Login SSH keys for the project and so interfered with concurrent builds. The generated SSH key is now registered with OS Login using a 1-hour Time-To-Live (TTL), so GCP expires old keys automatically. This prevents key accumulation without affecting other concurrent builds.
  • Robust Variable Handling & Safety:
    • Safely quotes all instances of "${_VM_NAME}" in gcloud compute ssh, scp, and delete commands.
    • Adds default empty substitutions (_PR_NUMBER, _CROSS_REGION_BUCKET, _ZONAL_BUCKET, _ZONAL_VM_SERVICE_ACCOUNT) to support manual builds.
    • Exports CROSS_REGION_BUCKET with a fallback default in the test runner script.
    • Enables dynamicSubstitutions: true under CloudBuild options so that substitution values are evaluated with bash-style parameter expansion.
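
The packaging-and-copy approach in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the script from this PR: the helper names `package_dir` and `print_scp_command`, the tarball name, the VM name, and the zone are all assumptions made for the example. The copy command is only printed, so the sketch can be tried without a GCP project.

```shell
# Hypothetical helper: archive a package directory so it can be copied to a VM.
# $1 = directory to package, $2 = output tarball path.
package_dir() {
  # -C switches to the parent directory so the archive contains a single
  # top-level folder named after the package.
  tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Hypothetical helper: print (rather than run) the copy command for review.
# $1 = tarball, $2 = VM name, $3 = zone; all three are assumed values.
print_scp_command() {
  printf 'gcloud compute scp "%s" "%s:~/" --zone="%s"\n' "$1" "$2" "$3"
}
```

On a Cloud Build runner this might be called as `package_dir /workspace/packages/google-cloud-storage /workspace/source.tar.gz`, where `source.tar.gz` is an assumed file name. Because the archive is built from the runner's workspace, the VM needs neither Git nor access to the repository.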

Verification Results

Both verification scenarios were executed and confirmed:

  1. Full Zonal Tests Run: CloudBuild successfully created the GCE VM, SCP'd the packaged code, executed the zonal system tests inside the VM, and cleanly deleted the VM upon completion.
  2. SSH Key Lifecycle: Verified that the generated SSH key is registered with a 1-hour TTL and is cleaned up afterward, so keys do not accumulate in the OS Login profile.
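
For readers who want to reproduce the SSH key lifecycle check, the registration step might look something like the sketch below. It assumes the `--ttl` flag of `gcloud compute os-login ssh-keys add`. The `run` wrapper, the `register_key` helper, and the key path are hypothetical names introduced here, not taken from the PR. By default the wrapper only prints the command (dry run).

```shell
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: register a public key with OS Login.
# $1 = path to the public key, $2 = TTL (defaults to 1h, as in this PR).
register_key() {
  run gcloud compute os-login ssh-keys add --key-file="$1" --ttl="${2:-1h}"
}

# Example with an assumed key path; prints the command in dry-run mode.
register_key /tmp/example_key.pub
```

Since each build registers only its own key and lets it expire, no build needs to delete keys that another concurrent build may still be using.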

fixes - b/514186407

TAG=agy
CONV=16d817fd-2422-432e-816b-bf159b381df2

@chandra-siri chandra-siri requested a review from a team as a code owner May 19, 2026 08:23

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the zonal system tests to use a pre-packaged source code tarball instead of cloning the repository on the VM. Key changes include adding a packaging step in the Cloud Build configuration, registering SSH keys with a TTL via OS Login, and simplifying VM naming. Review feedback recommends removing obsolete environment variables from the remote execution command and ensuring that the VM deletion step does not interfere with returning the correct test exit status.
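
The exit-status concern raised in the review can be illustrated with a small sketch. It is not the code from this PR: `run_tests` and `delete_vm` are hypothetical stand-ins for the remote test command and the VM deletion step, and `TEST_EXIT_CODE` exists only to simulate a failing test run.

```shell
# Hypothetical stand-ins: replace with the real remote test command
# and the real VM deletion command.
run_tests() { return "${TEST_EXIT_CODE:-0}"; }
delete_vm() { echo "deleting VM"; }

run_and_cleanup() {
  test_status=0
  # Record the test result without aborting, so cleanup always runs.
  run_tests || test_status=$?
  # A failed deletion is reported but must not replace the test result.
  delete_vm || echo "warning: VM deletion failed" >&2
  return "$test_status"
}
```

The point of the pattern is that the build step reports the status of the tests, not the status of whichever cleanup command happened to run last.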

Comment thread packages/google-cloud-storage/cloudbuild/zb-system-tests-cloudbuild.yaml Outdated
Comment thread packages/google-cloud-storage/cloudbuild/zb-system-tests-cloudbuild.yaml Outdated
…rrency-safe

- Replace Git cloning/fetching commands inside the GCE VM with local workspace archiving. We package the packages/google-cloud-storage directory and scp it to the VM directly. This removes the VM's dependency on Git and repository access, natively supports fork PRs, and enables manual pre-push testing of local uncommitted changes.
- Replace the concurrency-breaking cleanup-old-keys step with OS Login key registration with a 1-hour Time-To-Live (TTL). This allows GCP to automatically expire and delete old keys safely without interfering with other active concurrent builds.
- Clean up substitutions by removing nested substitution variables (_SHORT_BUILD_ID and _VM_NAME) and instead using direct gcb-${BUILD_ID} naming in all step definitions, matching standard CloudBuild compliance rules.
- Add safe default substitutions to support running builds manually from the local workspace without throwing validation errors.
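
Empty default substitutions pair naturally with a fallback in the runner script. The sketch below shows one way to do this in shell. The bucket name is a made-up placeholder and the exact line in run_zonal_tests.sh may differ.

```shell
# Hypothetical fallback bucket name, used only for illustration.
DEFAULT_CROSS_REGION_BUCKET="example-cross-region-bucket"

# ":-" falls back when the variable is unset OR empty, which matters because
# an empty default substitution reaches the script as an empty string.
export CROSS_REGION_BUCKET="${CROSS_REGION_BUCKET:-$DEFAULT_CROSS_REGION_BUCKET}"
echo "CROSS_REGION_BUCKET=${CROSS_REGION_BUCKET}"
```

Using `${VAR:-default}` rather than `${VAR-default}` is the deliberate choice here: the latter would keep an empty value and leave the tests without a bucket.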
@chandra-siri chandra-siri merged commit 819ce1b into googleapis:main May 19, 2026
31 checks passed

3 participants