[SYNPY-1764] Add Trivy container vulnerability scanning#1346

Merged
BryanFauble merged 12 commits into develop from synpy-1764-trivy-scanning on Apr 9, 2026

Conversation

@BryanFauble
Member

Summary

  • Add Trivy vulnerability scanning to gate Docker image publication on GHCR, following Sage's Container Vulnerability Scanning guidelines
  • Restructure both release and develop Docker jobs from single build+push steps into a build → scan → push pattern — images are only pushed if no Critical/High unfixed vulnerabilities are found
  • Add daily periodic scan of the latest published image with auto-remediation (patch version bump + rebuild) when new vulnerabilities are detected

New workflow files

| File | Purpose |
| --- | --- |
| trivy.yml | Reusable Trivy scanning workflow; scans a tar or remote image and uploads SARIF to the GitHub Security tab |
| docker_build.yml | Reusable build/scan/push workflow for periodic rebuilds |
| trivy_periodic_scan.yml | Daily rescan of the latest published image with auto-remediation |

Key Trivy settings

  • ignore-unfixed: true — only actionable vulnerabilities
  • severity: CRITICAL,HIGH — skip Medium/Low
  • exit-code: 1 — fail builds on findings
  • SARIF upload to GitHub Security tab for triage
  • Alternate Trivy DB repos (public.ecr.aws) to avoid rate limits
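
As a rough illustration of how the settings above might map onto a scan step, here is a minimal sketch using `aquasecurity/trivy-action`. The step name, the image reference, and the output filename are placeholders, and `<pinned-commit-sha>` stands in for whichever commit the workflow pins. The actual `trivy.yml` in this PR may be structured differently.

```yaml
# Hypothetical step; not copied from the PR's trivy.yml.
- name: Run Trivy vulnerability scanner
  id: trivy
  uses: aquasecurity/trivy-action@<pinned-commit-sha>  # placeholder, pin to a reviewed commit
  with:
    image-ref: ghcr.io/example-org/example-image:latest  # placeholder image
    format: sarif
    output: trivy-results.sarif   # assumed filename
    ignore-unfixed: true          # report only vulnerabilities that have a fix
    severity: CRITICAL,HIGH       # skip Medium/Low
    exit-code: "1"                # non-zero exit fails the job on findings
  env:
    # Pull the vulnerability databases from the public ECR mirrors to avoid rate limits.
    TRIVY_DB_REPOSITORY: public.ecr.aws/aquasecurity/trivy-db:2
    TRIVY_JAVA_DB_REPOSITORY: public.ecr.aws/aquasecurity/trivy-java-db:1
```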

Test plan

  • Push to develop branch and verify the build → Trivy scan → push flow completes successfully
  • Verify SARIF results appear in the repo's Security tab (Code Scanning)
  • Verify Docker image is pushed to GHCR only after Trivy scan passes
  • Manually trigger trivy_periodic_scan.yml via workflow_dispatch and verify it scans the latest published image
  • Create a pre-release to verify the release Docker flow works end-to-end

🤖 Generated with Claude Code

BryanFauble and others added 4 commits March 23, 2026 19:47
Captures non-obvious conventions (async-to-sync decorator, protocol classes,
dataclass models with fill_from_dict, concrete Java types), architecture
overview, verified commands, constraints, and testing patterns that Claude
cannot infer from code alone.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each file documents non-obvious patterns specific to that module:
- models/: new model checklist, fill_from_dict pattern, _last_persistent_instance
  lifecycle, EnumCoercionMixin usage, standard field requirements
- api/: function signature conventions, REST call patterns, pagination helpers,
  new service file checklist
- core/: async_to_sync internals, retry strategies, credentials chain,
  upload/download resilience, concrete types registration
- tests/: async-only test convention, unit test socket blocking, integration
  test cleanup with schedule_for_cleanup(), fixture scoping

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrote 5 existing files with enhanced behavioral conventions
(reusable utilities, conditional behavior, concurrency patterns).
Added 11 new module-level files for full directory coverage:
operations, models/mixins, models/services, models/protocols,
core/upload, core/download, core/constants, core/credentials,
extensions/curator, synapseutils, and docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[SYNPY-1764] Add Trivy container vulnerability scanning to Docker build

Add Trivy scanning to gate Docker image publication on GHCR. Both release
and develop Docker jobs now follow a build→scan→push pattern where images
are only pushed if no Critical/High unfixed vulnerabilities are found.

New workflows:
- trivy.yml: reusable Trivy scanning workflow with SARIF upload to GitHub Security tab
- docker_build.yml: reusable build/scan/push workflow for image rebuilds
- trivy_periodic_scan.yml: daily rescan of latest published image with auto-remediation
@BryanFauble BryanFauble requested a review from a team as a code owner March 23, 2026 22:20
@BryanFauble BryanFauble changed the base branch from develop to add-claude-md March 23, 2026 22:21
Member Author

@BryanFauble BryanFauble left a comment

Pre-review: documentation comments on complex areas to help reviewers.

Note: These comments were generated with AI assistance to help reviewers understand complex areas.

Comment thread .github/workflows/build.yml
@@ -0,0 +1,91 @@
---
Member Author

This is the central reusable scanning workflow — called from both build.yml (pre-push scan) and trivy_periodic_scan.yml (post-publish rescan). It supports two modes:

| Mode | SOURCE_TYPE | How it gets the image | Used by |
| --- | --- | --- | --- |
| Pre-push | tar | Downloads artifact from calling workflow, loads via docker load | build.yml, docker_build.yml |
| Post-publish | image | Trivy pulls directly from GHCR | trivy_periodic_scan.yml |

The EXIT_CODE input controls whether findings fail the workflow (1) or just report (0). Both build.yml and the periodic scan use 1 so vulnerabilities are blocking.

The alternate Trivy DB repos (public.ecr.aws/aquasecurity/trivy-db:2) are important — the default GitHub-hosted DB gets rate-limited due to high download volume across the ecosystem.

SARIF results are uploaded even when Trivy finds vulnerabilities (the success() || steps.trivy.conclusion == 'failure' condition), so findings always land in the Security tab for triage regardless of whether the build passes.
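
To make the two modes and the upload condition concrete, here is a hedged skeleton of what such a reusable workflow could look like. `SOURCE_TYPE` and `EXIT_CODE` come from the description above; the `IMAGE_NAME` input, the artifact name, the tar path, and the `<pinned-commit-sha>` placeholders are assumptions for illustration only.

```yaml
# Hypothetical skeleton; not the PR's actual trivy.yml.
on:
  workflow_call:
    inputs:
      SOURCE_TYPE:
        description: "'tar' for a pre-push artifact, 'image' for a published image"
        type: string
        required: true
      IMAGE_NAME:            # assumed input name
        type: string
        required: true
      EXIT_CODE:
        type: number
        default: 1

jobs:
  trivy:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write   # required to upload SARIF
    steps:
      - name: Download image tarball
        if: ${{ inputs.SOURCE_TYPE == 'tar' }}
        uses: actions/download-artifact@<pinned-commit-sha>
        with:
          name: docker-image     # assumed artifact name
          path: /tmp

      - name: Load image into the local Docker daemon
        if: ${{ inputs.SOURCE_TYPE == 'tar' }}
        run: docker load --input /tmp/image.tar   # assumed tar path

      - name: Run Trivy
        id: trivy
        uses: aquasecurity/trivy-action@<pinned-commit-sha>
        with:
          image-ref: ${{ inputs.IMAGE_NAME }}
          format: sarif
          output: trivy-results.sarif
          exit-code: ${{ inputs.EXIT_CODE }}

      - name: Upload SARIF to the Security tab
        # Run on a clean scan and also when the Trivy step failed on findings.
        if: ${{ success() || steps.trivy.conclusion == 'failure' }}
        uses: github/codeql-action/upload-sarif@<pinned-commit-sha>
        with:
          sarif_file: trivy-results.sarif
```

A later commit in this PR switches the SARIF upload condition to `!cancelled()`, which would replace the `if:` expression on the last step.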

Note: This comment was drafted with AI assistance and reviewed by me for accuracy.

@@ -0,0 +1,89 @@
---
Member Author

This workflow rescans the latest published image daily to catch newly disclosed CVEs. The flow has a multi-job conditional chain that's worth understanding:

graph TD
    A[get-image-reference] -->|"latest tag from mathieudutour/github-tag-action"| B[periodic-scan]
    B -->|clean| C[Done — no action needed]
    B -->|"trivy_conclusion == 'failure'"| D[bump-tag]
    D -->|"new patch version tag"| E[update-image]
    E -->|"calls docker_build.yml"| F[Rebuild + Trivy scan + push]

The !cancelled() condition on bump-tag and update-image is important — without it, these jobs would be skipped when the scan fails (since needs.periodic-scan would have a failure result, and GitHub Actions skips downstream jobs by default on failure). The !cancelled() override lets them run, and the trivy_conclusion == 'failure' check ensures they only run when there are actual findings.
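
A minimal sketch of that conditional chain, assuming the reusable scan workflow exposes a `trivy_conclusion` output (the output wiring and the placeholder step are assumptions, and other inputs are omitted for brevity):

```yaml
# Hypothetical excerpt; job names follow the diagram above.
jobs:
  periodic-scan:
    uses: ./.github/workflows/trivy.yml   # assumed to expose a `trivy_conclusion` output
    with:
      SOURCE_TYPE: image
      EXIT_CODE: 1

  bump-tag:
    needs: periodic-scan
    # Without !cancelled(), this job would be skipped whenever periodic-scan fails.
    # The second clause restricts it to runs where Trivy actually reported findings.
    if: ${{ !cancelled() && needs.periodic-scan.outputs.trivy_conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Scan reported findings; computing the next patch version"

  update-image:
    needs: [periodic-scan, bump-tag]
    if: ${{ !cancelled() && needs.periodic-scan.outputs.trivy_conclusion == 'failure' }}
    uses: ./.github/workflows/docker_build.yml
```

The expression is wrapped in `${{ }}` because a bare `!` at the start of a YAML value would be parsed as a tag.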

The rebuild via docker_build.yml includes its own Trivy scan, so the patched image is only published if the rebuild actually remediates the vulnerabilities.

Note: This comment was drafted with AI assistance and reviewed by me for accuracy.

Comment thread .github/workflows/docker_build.yml
Contributor

@linglp linglp left a comment

See news about trivy scan: https://socket.dev/blog/trivy-under-attack-again-github-actions-compromise
Also, should we merge this branch to develop rather than add-claude-md?

Comment thread .github/workflows/trivy.yml Outdated
Contributor

@jaymedina jaymedina left a comment

Review: Trivy Container Vulnerability Scanning

Good overall structure — the reusable trivy.yml, docker_build.yml, and daily trivy_periodic_scan.yml follow the Sage Container Vulnerability Scanning guidelines cleanly. One critical correctness issue needs fixing before merge, plus a few smaller items.

Issues

  • 🔴 Critical: Push jobs in build.yml rebuild the image from scratch instead of loading and pushing the scanned tar. The scanned artifact is abandoned, so the image that reaches GHCR was never verified. docker_build.yml's push-image job shows the correct pattern: download → load → tag → push.
  • 🟡 Moderate: YAML \ line continuation in IMAGE_REFERENCES preserves newlines and leading spaces, producing a tag with a leading space that will fail on docker push.
  • 🔵 Minor: Third-party actions are pinned to floating tags (@v6.2, @v1) rather than commit SHAs.
  • 🔵 Minor: The get-image-reference job carries deployments: write and security-events: write permissions it does not need.
  • 🔵 Minor: ghcr-push-on-develop relies on an implicit skip from needs: rather than an explicit if: condition.
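
For reference, the download → load → tag → push pattern from the first item could look roughly like the job below. This is a sketch under assumptions, not the contents of docker_build.yml: the upstream job names, artifact name, tar path, and image tags are placeholders, and `<pinned-commit-sha>` stands in for a reviewed commit.

```yaml
# Hypothetical push job that publishes the exact image Trivy scanned.
push-image:
  needs: [build-image, trivy-scan]   # assumed upstream job names
  runs-on: ubuntu-latest
  permissions:
    contents: read
    packages: write                  # required to push to GHCR
  env:
    LOCAL_IMAGE_TAG: local/scanned-image:latest                   # assumed build-time tag
    REMOTE_IMAGE_TAG: ghcr.io/example-org/example-image:develop   # placeholder
  steps:
    - name: Download the scanned tarball
      uses: actions/download-artifact@<pinned-commit-sha>
      with:
        name: docker-image           # assumed artifact name
        path: /tmp
    - name: Load the scanned image
      run: docker load --input /tmp/image.tar   # assumed tar path
    - name: Log in to GHCR
      uses: docker/login-action@<pinned-commit-sha>
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
    - name: Tag and push without rebuilding
      run: |
        docker tag "$LOCAL_IMAGE_TAG" "$REMOTE_IMAGE_TAG"
        docker push "$REMOTE_IMAGE_TAG"
```

Because no `docker build` runs in this job, the bytes pushed to the registry are the same ones that passed the scan.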

Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Contributor

@jaymedina jaymedina left a comment

Great architectural direction — the build → scan → push pattern is exactly the right approach, and the reusable trivy.yml workflow is well designed. There are two bugs that need to be fixed before merging:

  1. Push jobs rebuild the image instead of loading the scanned artifact (both ghcr-push-on-release and ghcr-push-on-develop in build.yml): the image that gets pushed to GHCR is a fresh rebuild, not the artifact that Trivy scanned. docker_build.yml correctly solves this by loading from tar — the same pattern needs to be applied here.

  2. ${{ env.repo_name }} in the job outputs section won't capture the runtime env var (trivy_periodic_scan.yml:33): env vars set via $GITHUB_ENV are not accessible in the jobs.<job>.outputs expression context. This will produce an empty string, causing the periodic scan to target the wrong image.
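
For the second item, the fix listed later in this thread writes the value to `$GITHUB_OUTPUT`. A minimal sketch of that documented pattern follows; the step id, the output name, and the lowercasing of the repository name are assumptions for illustration.

```yaml
# Hypothetical excerpt; not the PR's actual get-image-reference job.
get-image-reference:
  runs-on: ubuntu-latest
  outputs:
    # Map the job output to a step output rather than to the env context.
    repo_name: ${{ steps.repo.outputs.repo_name }}
  steps:
    - name: Compute the repository name for the image reference
      id: repo
      # Registry image names must be lowercase, so lowercase the value here (assumed requirement).
      run: echo "repo_name=${GITHUB_REPOSITORY,,}" >> "$GITHUB_OUTPUT"
```

A downstream job would then read it as `needs.get-image-reference.outputs.repo_name`.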

Note: This comment has been generated with AI assistance and reviewed by the author.

Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Comment thread .github/workflows/trivy.yml Outdated
Comment thread .github/workflows/docker_build.yml Outdated
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Address PR review feedback from linglp and jaymedina

- Fix push jobs to load scanned tar instead of rebuilding (build.yml)
- Pin trivy-action to SHA for v0.35.0 to address supply chain attack
- Fix env.repo_name output using $GITHUB_OUTPUT (trivy_periodic_scan.yml)
- Pin all third-party actions to commit SHAs
- Remove unnecessary permissions on get-image-reference job
- Use !cancelled() for SARIF upload condition (trivy.yml)
- Use LOCAL_IMAGE_TAG env var instead of hardcoded string (docker_build.yml)
- Fix IMAGE_REFERENCES YAML line continuation
@BryanFauble
Member Author

Keeping add-claude-md as the base for now — this PR is part of a stacked set of changes on that branch. It'll merge to develop once the parent PR lands.

On the Trivy supply chain concern — updated to pin to the safe SHA (v0.35.0). All other third-party actions have been SHA-pinned as well.

Note: This comment was drafted with AI assistance and reviewed by me for accuracy.

@BryanFauble BryanFauble requested review from jaymedina and linglp April 1, 2026 00:19
Contributor

@linglp linglp left a comment

Just a comment about ignore-unfixed: true. I am a bit concerned that if there's a fix in a higher version of the dependency, the code would trigger a rebuild loop.

Comment thread .github/workflows/docker_build.yml
Comment thread .github/workflows/build.yml
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Prevent infinite rebuild loop in periodic Trivy scan

Restructure trivy_periodic_scan.yml so the git tag is only created
after a successful rebuild (not before). If the rebuild still has
vulnerabilities, a GitHub issue is opened for manual triage instead
of looping endlessly.

- Rename bump-tag → compute-next-version (dry_run: true)
- Add create-tag job gated on update-image success
- Add alert-on-failure job that opens a GitHub issue with
  duplicate prevention when remediation fails
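
The restructured remediation side could be sketched as follows. Job names follow the commit message above; the `IMAGE_TAG` input, the issue title and body, and the `<pinned-commit-sha>` placeholders are assumptions, and the upstream scan job that gates this chain is omitted.

```yaml
# Hypothetical excerpt; not the PR's actual trivy_periodic_scan.yml.
jobs:
  compute-next-version:
    runs-on: ubuntu-latest
    outputs:
      new_tag: ${{ steps.tag.outputs.new_tag }}
    steps:
      - uses: actions/checkout@<pinned-commit-sha>
      - name: Compute the next patch version without creating a tag
        id: tag
        uses: mathieudutour/github-tag-action@<pinned-commit-sha>
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          default_bump: patch
          dry_run: true

  update-image:
    needs: compute-next-version
    uses: ./.github/workflows/docker_build.yml
    with:
      IMAGE_TAG: ${{ needs.compute-next-version.outputs.new_tag }}   # assumed input name

  create-tag:
    # With no if: override, this runs only when the rebuild and its Trivy scan succeeded.
    needs: [compute-next-version, update-image]
    runs-on: ubuntu-latest
    permissions:
      contents: write
    env:
      NEW_TAG: ${{ needs.compute-next-version.outputs.new_tag }}
    steps:
      - uses: actions/checkout@<pinned-commit-sha>
      - run: |
          git tag "$NEW_TAG"
          git push origin "$NEW_TAG"

  alert-on-failure:
    needs: update-image
    if: ${{ !cancelled() && needs.update-image.result == 'failure' }}
    runs-on: ubuntu-latest
    permissions:
      issues: write
    env:
      GH_TOKEN: ${{ github.token }}
      GH_REPO: ${{ github.repository }}
      ISSUE_TITLE: "Trivy auto-remediation failed"   # assumed title
    steps:
      - name: Open an issue unless one is already open
        run: |
          existing=$(gh issue list --state open --search "in:title \"$ISSUE_TITLE\"" --json number --jq 'length')
          if [ "$existing" -eq 0 ]; then
            gh issue create --title "$ISSUE_TITLE" --body "Rebuild still has Critical/High findings; manual triage needed."
          fi
```

Since the tag is created only after a successful rebuild, a failed remediation leaves the latest tag unchanged and surfaces as a single open issue instead of a new version on every daily run.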
Base automatically changed from add-claude-md to develop April 3, 2026 21:09
Contributor

Copilot AI left a comment

Pull request overview

Adds Trivy container vulnerability scanning to the GitHub Actions CI/CD pipeline so Docker images are scanned before publication to GHCR, and introduces a daily rescan workflow that can auto-remediate by rebuilding with a patch bump when new vulnerabilities appear.

Changes:

  • Introduces a reusable Trivy scanning workflow that can scan either a local tarball image or a remote registry image and upload SARIF results.
  • Refactors the release and develop GHCR publishing flows into build → scan → push so publication is gated on the Trivy result.
  • Adds a scheduled “periodic scan” workflow that scans the latest image daily and attempts an automated rebuild + tag bump on failures.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| .github/workflows/trivy.yml | New reusable Trivy scan workflow producing SARIF output for GitHub code scanning. |
| .github/workflows/docker_build.yml | New reusable build/scan/push workflow used for automated rebuilds after periodic scans. |
| .github/workflows/trivy_periodic_scan.yml | New scheduled workflow that scans the latest image daily and triggers rebuild/tagging on findings. |
| .github/workflows/build.yml | Refactors existing release/develop Docker publishing into build → scan → push using the reusable Trivy workflow. |

Comment thread .github/workflows/trivy.yml
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Comment thread .github/workflows/trivy.yml Outdated
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Comment thread .github/workflows/trivy_periodic_scan.yml Outdated
Contributor

@linglp linglp left a comment

LGTM! Just some minor comments!

Address PR review feedback

- Pin codeql-action/upload-sarif to SHA and upgrade to v3.35.1
- Guard update-image job on compute-next-version success
- Use absolute URL for Security tab link in auto-created issues
@BryanFauble BryanFauble merged commit 0415279 into develop Apr 9, 2026
32 checks passed
@BryanFauble BryanFauble deleted the synpy-1764-trivy-scanning branch April 9, 2026 19:40
thomasyu888 pushed a commit that referenced this pull request Apr 15, 2026
* [SYNPY-1798]: updated black to 26.3.1 and reran pre-commit (#1341)

* update black to 26.3.1 and rerun pre-commit

* update the tutorial line

---------

Co-authored-by: Lingling Peng <lpeng@w290.local>
Co-authored-by: Lingling Peng <lpeng@Mac.SageCorpWiFi>

* [SYNPY-1764] Add Trivy container vulnerability scanning (#1346)

* [SYNPY-1764] Add Trivy container vulnerability scanning to Docker build

Add Trivy scanning to gate Docker image publication on GHCR. Both release
and develop Docker jobs now follow a build→scan→push pattern where images
are only pushed if no Critical/High unfixed vulnerabilities are found.

New workflows:
- trivy.yml: reusable Trivy scanning workflow with SARIF upload to GitHub Security tab
- docker_build.yml: reusable build/scan/push workflow for image rebuilds
- trivy_periodic_scan.yml: daily rescan of latest published image with auto-remediation

* Address PR review feedback from linglp and jaymedina

- Fix push jobs to load scanned tar instead of rebuilding (build.yml)
- Pin trivy-action to SHA for v0.35.0 to address supply chain attack
- Fix env.repo_name output using $GITHUB_OUTPUT (trivy_periodic_scan.yml)
- Pin all third-party actions to commit SHAs
- Remove unnecessary permissions on get-image-reference job
- Use !cancelled() for SARIF upload condition (trivy.yml)
- Use LOCAL_IMAGE_TAG env var instead of hardcoded string (docker_build.yml)
- Fix IMAGE_REFERENCES YAML line continuation

* Prevent infinite rebuild loop in periodic Trivy scan

Restructure trivy_periodic_scan.yml so the git tag is only created
after a successful rebuild (not before). If the rebuild still has
vulnerabilities, a GitHub issue is opened for manual triage instead
of looping endlessly.

- Rename bump-tag → compute-next-version (dry_run: true)
- Add create-tag job gated on update-image success
- Add alert-on-failure job that opens a GitHub issue with
  duplicate prevention when remediation fails

* pre-commit

* Update Trivy scan workflow to use previous tag and adjust image references

* Address PR review feedback

- Pin codeql-action/upload-sarif to SHA and upgrade to v3.35.1
- Guard update-image job on compute-next-version success
- Use absolute URL for Security tab link in auto-created issues

* Add actions read permission for Trivy scan job (#1355)

* Add optional ARTIFACT_NAME_SUFFIX input to Trivy workflow and update artifact naming (#1357)

* remove sort

---------

Co-authored-by: Lingling <55448354+linglp@users.noreply.github.com>
Co-authored-by: Lingling Peng <lpeng@w290.local>
Co-authored-by: Lingling Peng <lpeng@Mac.SageCorpWiFi>
Co-authored-by: BryanFauble <17128019+BryanFauble@users.noreply.github.com>
5 participants