
ci: use preinstalled JDK for SonarCloud scanner, cache scanner engine jar #21632

Merged
taratorio merged 3 commits into main from taratorio/sonar-preinstalled-jdk
Jun 5, 2026

Conversation

@taratorio
Member

@taratorio taratorio commented Jun 5, 2026

Problem

The sonar job pulls two artifacts from scanner.sonarcloud.io on every scan: a JRE ("JRE provisioning") and the scanner-engine jar. That CDN intermittently 403s GitHub-runner IPs, and the blocks outlive the spaced retry added in #21604.

The 403s are IP-scoped blocking, not artifact availability: the exact jar URL that failed serves 200 from outside the runners (published Jun 1, still on the CDN), and api.sonarcloud.io answered fine in the same failing run — only the scanner.sonarcloud.io host blocks, and for longer than the 90s retry spacing, so a same-runner retry cannot ride it out.

Fix

Remove both per-scan dependencies on that host.

JRE: skip provisioning and point the scanner at the JDK already baked into the runner image, via the scanner's documented switches:

  • SONAR_SCANNER_SKIP_JRE_PROVISIONING=true
  • SONAR_SCANNER_JAVA_EXE_PATH=$JAVA_HOME_21_X64/bin/java

The ubuntu-24.04 image ships Temurin 21 — the same major version Sonar provisions (the failing artifact was OpenJDK21U-jre_...21.0.9). The env vars go through $GITHUB_ENV, so both the scan and the retry step inherit them. cleanup-space in setup-erigon does not remove the preinstalled JDKs.
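A minimal sketch of what such a step could look like, assuming the runner exports JAVA_HOME_21_X64 and GITHUB_ENV as described. The helper name configure_sonar_java is hypothetical; the two SONAR_SCANNER_* variables are the ones named above. The sketch checks that the variable is non-empty before testing the binary, so an unset variable cannot resolve to /bin/java.

```shell
#!/bin/sh
# Sketch: point the Sonar scanner at a preinstalled JDK, if one is present.
# configure_sonar_java is a hypothetical helper name; the variable names
# are the ones given in the PR description.
configure_sonar_java() {
  # Require a non-empty JAVA_HOME_21_X64 and an executable java binary;
  # otherwise write nothing, so the scanner provisions its own JRE.
  if [ -n "${JAVA_HOME_21_X64:-}" ] && [ -x "${JAVA_HOME_21_X64}/bin/java" ]; then
    {
      echo "SONAR_SCANNER_SKIP_JRE_PROVISIONING=true"
      echo "SONAR_SCANNER_JAVA_EXE_PATH=${JAVA_HOME_21_X64}/bin/java"
    } >> "$GITHUB_ENV"
  fi
}
```

Calling configure_sonar_java in a step before the scan appends both lines to the file named by GITHUB_ENV, so later steps in the same job see them as environment variables.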

Engine jar: seed it into the actions cache from cache-warming push runs; PR and merge-queue scans restore it. The seed step queries api.sonarcloud.io/analysis/engine (the host that stays reachable; returns {filename, sha256, downloadUrl}), downloads the jar with retries, sha256-verifies it, and saves it in the scanner's content-addressed download-cache layout — ~/.sonar/cache/<sha256>/<filename>, cache key sonar-scanner-engine|<sha256>. At scan time the bootstrapper asks the API for the prescribed sha and, finding it in the local cache, never contacts the CDN.
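The verify-and-store half of the seed step can be sketched roughly as follows. The helper name store_engine_jar and its argument order are hypothetical; the cache layout <cache root>/<sha256>/<filename> is the one described above. The sketch assumes sha256sum is available on the runner and leaves the metadata lookup and the download to the caller.

```shell
#!/bin/sh
# Sketch: verify a downloaded engine jar and place it in a content-addressed
# cache directory (<cache_root>/<sha256>/<filename>).
# store_engine_jar is a hypothetical helper, not the workflow's actual step.
# Usage: store_engine_jar <downloaded_jar> <expected_sha256> <filename> <cache_root>
store_engine_jar() {
  jar=$1
  expected=$2
  filename=$3
  cache_root=$4

  # Compute the digest of what was actually downloaded.
  actual=$(sha256sum "$jar" | cut -d ' ' -f 1)

  # Refuse to seed the cache with a jar that does not match the prescribed hash.
  if [ "$actual" != "$expected" ]; then
    echo "sha256 mismatch: expected $expected, got $actual" >&2
    return 1
  fi

  mkdir -p "$cache_root/$expected"
  cp "$jar" "$cache_root/$expected/$filename"
}
```

In a workflow the arguments would come from the filename, sha256, and downloadUrl fields of the api.sonarcloud.io/analysis/engine response, with ~/.sonar/cache as the cache root and sonar-scanner-engine|<sha256> as the actions cache key.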

Verified end-to-end with scanner CLI 8.1.0.6389 against a cache seeded exactly as the workflow does it: the debug log shows the metadata call, zero requests to scanner.sonarcloud.io, and the engine launched straight from the cached jar.

Cache-warming runs on every push to main/release, and SonarCloud rotates engines every few days (12.37 published Jun 1, 12.38 on Jun 5), so seeds refresh within hours of a rotation.

Failure modes considered

  • Runner image drops Temurin 21: the [ -x ... ] guard leaves the env vars unset and the scanner falls back to downloading, i.e. current behavior.
  • SonarCloud raises its minimum JRE above 21: the scan fails deterministically with a version error (historically preceded by months of deprecation warnings in the scan log); fix is bumping the env var to JAVA_HOME_25_X64, which the image already ships.
  • Engine cache miss (version rotated since the last base-branch push, or cache evicted): the scanner falls back to the direct download plus the existing retry — today's behavior, never worse.
  • Seeding fails (the CDN 403s the cache-warming runner too, or the bootstrap API contract changes): the lookup and download steps are continue-on-error, no cache is saved, and the next push to the branch retries; scans fall back as above.
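The cache-hit versus fallback split in the last two cases can be illustrated with a small check. This is only a sketch of the behaviour described above, not the scanner's own logic; engine_source is a hypothetical helper, and sha256sum is assumed to be available.

```shell
#!/bin/sh
# Sketch: report whether an engine jar would be served from the seeded cache
# or would need the fallback download.
# Usage: engine_source <prescribed_sha256> <filename> <cache_root>
engine_source() {
  candidate="$3/$1/$2"
  if [ -f "$candidate" ] &&
     [ "$(sha256sum "$candidate" | cut -d ' ' -f 1)" = "$1" ]; then
    echo "cache"      # seeded jar present and intact
  else
    echo "download"   # rotated engine, evicted cache, or failed seed
  fi
}
```

Because the directory name is the prescribed hash, a rotated engine simply misses the cache instead of picking up a stale jar.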

The scanner CLI zip and GPG key still come per run from binaries.sonarsource.com and the keyserver; those hosts have not been the ones failing, and the existing retry covers them.

Note: the first engine seed only materializes once this merges (cache-warming triggers on push to main), so this PR's own sonar runs still use the fallback download path.

Contributor

Copilot AI left a comment


Pull request overview

This PR updates the SonarCloud workflow to avoid SonarScanner’s per-run JRE download (which has been failing intermittently with 403s) by configuring the scanner to use the GitHub runner’s preinstalled Java instead.

Changes:

  • Add a workflow step that sets SONAR_SCANNER_SKIP_JRE_PROVISIONING=true and points SONAR_SCANNER_JAVA_EXE_PATH at the runner’s Temurin 21 Java executable via $GITHUB_ENV.


Comment on lines +52 to +56
run: |
  if [ -x "${JAVA_HOME_21_X64}/bin/java" ]; then
    echo "SONAR_SCANNER_SKIP_JRE_PROVISIONING=true" >> "$GITHUB_ENV"
    echo "SONAR_SCANNER_JAVA_EXE_PATH=${JAVA_HOME_21_X64}/bin/java" >> "$GITHUB_ENV"
  fi
Member Author

@taratorio taratorio Jun 5, 2026


Fixed in 4fd9271 — added a -n presence check so an unset JAVA_HOME_21_X64 falls back to JRE provisioning instead of resolving to /bin/java.

Member

@yperbasis yperbasis left a comment


Approving — two non-blocking nits.

1. Guard an unset JAVA_HOME_21_X64 (as Copilot noted). If the var is ever unset, ${JAVA_HOME_21_X64}/bin/java expands to a bare /bin/java (usrmerge symlinks /bin to /usr/bin), which can pass the -x test and silently point the scanner at the system-default JDK instead of falling back to downloading as the description states. A presence check makes the documented fallback actually hold:

if [ -n "${JAVA_HOME_21_X64}" ] && [ -x "${JAVA_HOME_21_X64}/bin/java" ]; then

2. x64-only var. JAVA_HOME_21_X64 is arch-specific — on an ubuntu-24.04-arm runner it'd be JAVA_HOME_21_ARM64 and this would silently fall back to downloading. Fine while the job is pinned to ubuntu-24.04; just flagging for any future runner change.
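For point 1, a short sketch shows the expansion in question. It only demonstrates the shell behaviour; the result strings are illustrative, and the :- form is used so the snippet also runs under set -u.

```shell
#!/bin/sh
# Sketch: show why the -x test alone is not enough when the variable is unset.
unset JAVA_HOME_21_X64

# With the variable unset, the path collapses to /bin/java.
unguarded_path="${JAVA_HOME_21_X64:-}/bin/java"

# The presence check short-circuits before the -x test is evaluated.
if [ -n "${JAVA_HOME_21_X64:-}" ] && [ -x "${JAVA_HOME_21_X64}/bin/java" ]; then
  guarded_result="use preinstalled JDK"
else
  guarded_result="fall back to JRE provisioning"
fi

echo "unguarded path: $unguarded_path"
echo "guarded result: $guarded_result"
```

Whether /bin/java actually exists depends on the image, which is why the unguarded test can pass on some runners and not on others.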

Member

@lystopad lystopad left a comment


LGTM. Thanks.

@taratorio
Member Author

Addressed point 1 in 4fd9271: [ -n "${JAVA_HOME_21_X64}" ] now guards the -x check, so an unset var falls back to JRE provisioning as documented instead of silently passing via /bin/java.

On point 2 (arch-specific var): agreed, leaving as is while the job is pinned to ubuntu-24.04. With the new guard, an arm runner would have JAVA_HOME_21_X64 unset and fall back to downloading — degraded but correct.
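If the job ever moved to an arm runner, the variable could be chosen by architecture instead. The following is only a sketch of that idea and is not part of this PR; java_home_21_for is a hypothetical helper, and JAVA_HOME_21_ARM64 is the variable name mentioned in the review.

```shell
#!/bin/sh
# Sketch: pick the arch-specific JAVA_HOME variable for Java 21.
# java_home_21_for is a hypothetical helper; it is not part of this PR.
# Usage: java_home_21_for <machine>, where <machine> is `uname -m` output.
java_home_21_for() {
  case "$1" in
    x86_64)        echo "${JAVA_HOME_21_X64:-}" ;;
    aarch64|arm64) echo "${JAVA_HOME_21_ARM64:-}" ;;
    *)             echo "" ;;   # unknown arch: empty, so a -n guard falls back
  esac
}
```

A step could call java_home_21_for "$(uname -m)" and feed the result into the same non-empty and executable checks, keeping the fallback to JRE provisioning intact.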

@taratorio taratorio enabled auto-merge June 5, 2026 08:04
The engine jar is the remaining per-scan artifact fetched from
scanner.sonarcloud.io, which intermittently 403s GitHub-runner IPs for
longer than the spaced retry covers. Cache-warming push runs now look up
the current engine via api.sonarcloud.io (which stays reachable during
those blocks), download and verify it, and save it under the scanner's
content-addressed download-cache layout (~/.sonar/cache/<sha256>/<file>);
scan runs restore it so the scanner skips the CDN fetch entirely. A
stale or missing seed falls back to the existing download-plus-retry
path.
@taratorio taratorio disabled auto-merge June 5, 2026 10:21
@taratorio
Member Author

Update — the sonar run on this PR failed with the same class of 403, but on the second artifact the scanner pulls from scanner.sonarcloud.io: the engine jar. The JRE half of the fix did work — the log shows Using the configured java executable '/usr/lib/jvm/temurin-21-jdk-amd64/bin/java' and no JRE download — but bootstrap then hit GET https://scanner.sonarcloud.io/engines/sonarcloud-scanner-engine-12.37.0.3460.jar → HTTP 403 Forbidden on both the scan and the 90s-spaced retry.

New finding from investigating that run: the 403s are IP-scoped blocking of GitHub-runner IPs, not artifact availability and not short blips —

  • the exact jar URL that 403'd serves 200 from outside the runners (published Jun 1, still on the CDN);
  • api.sonarcloud.io, which prescribes the engine via GET /analysis/engine → {filename, sha256, downloadUrl}, answered fine in the same failing run; only the scanner.sonarcloud.io CDN host blocks;
  • the block outlived the 92s between the two attempts, so same-runner retries can't ride it out.

That invalidates the description's original bet that the engine jar "remain[s] covered by the existing retry". Pushed 997d57c extending the eliminate-the-host approach to the engine jar: cache-warming push runs seed it into the actions cache under the scanner's content-addressed download-cache layout (~/.sonar/cache/<sha256>/<filename>), scans restore it, and the bootstrapper — after checking the prescribed sha against the API — uses the cached jar without ever contacting the CDN. Verified end-to-end with scanner CLI 8.1.0.6389 against a seeded cache: zero requests to scanner.sonarcloud.io, engine launched straight from the cache. A stale or missing seed falls back to the existing download-plus-retry path.

PR description updated to match.

@taratorio taratorio enabled auto-merge June 5, 2026 10:32
@taratorio taratorio changed the title from "ci: use preinstalled JDK for SonarCloud scanner, skip JRE download" to "ci: use preinstalled JDK for SonarCloud scanner, cache scanner engine jar" Jun 5, 2026
@taratorio taratorio added this pull request to the merge queue Jun 5, 2026
Merged via the queue into main with commit 6898be8 Jun 5, 2026
88 checks passed
@taratorio taratorio deleted the taratorio/sonar-preinstalled-jdk branch June 5, 2026 11:37
