Skip to content

EC-1706: Remove filesystem image layer cache#3217

Open
robnester-rh wants to merge 1 commit intoconforma:mainfrom
robnester-rh:EC-1706
Open

EC-1706: Remove filesystem image layer cache#3217
robnester-rh wants to merge 1 commit intoconforma:mainfrom
robnester-rh:EC-1706

Conversation

@robnester-rh
Copy link
Copy Markdown
Contributor

Summary

  • Remove the filesystem image layer cache entirely — it was not thread-safe (#1109) and provided no measurable performance gain
  • Remove EC_CACHE env var from Tekton task definitions where it was set to "false" to work around the thread-safety issue
  • Clean up related tests and benchmark code

The cache was originally intended to avoid re-fetching shared image layers from the registry, but caused random redhat_manifests.redhat_manifests_missing violations when validating multiple images with shared layers. Rather than merging the thread-safe cache fix (PR #3150), we opted to remove the cache entirely since benchmarks showed no meaningful gain.

Test plan

  • make test passes
  • cr --plain returns no findings
  • CI acceptance tests pass

Ref: EC-1706
Related: EC-1669, #1109

Made with Cursor

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 1, 2026

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. CodeRabbit does not review PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f99e7d57-bada-41b9-b344-3dcec9b65243

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Removed the filesystem-backed image cache and all explicit EC_CACHE usages; the OCI client now returns images directly via remote.Image(...). Tests, a benchmark, and Tekton task steps were updated to remove cache-related behavior, assertions, and environment settings.

Changes

Cohort / File(s) Summary
OCI client & tests
internal/utils/oci/client.go, internal/utils/oci/client_test.go
Removed global filesystem image cache and its init logic; defaultClient.Image now returns remote.Image(...) directly. Tests: deleted TestCacheInit, stopped capturing registry logs (use io.Discard), added helper to fully read/verify layers, simplified image/layer fetch assertions, and removed unused imports.
Benchmark
benchmark/simple/simple.go
Removed setting EC_CACHE="false" in the ec(dir) closure before invoking the validate-image command.
Tekton task specs
tasks/verify-conforma-konflux-ta/0.1/verify-conforma-konflux-ta.yaml, tasks/verify-conforma-konflux-vsa-ta/0.1/verify-conforma-konflux-vsa-ta.yaml, tasks/verify-enterprise-contract/0.1/verify-enterprise-contract.yaml
Removed EC_CACHE environment variable entries from validate steps (previously "false"). No other step commands, args, mounts, or envs were changed.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: removal of the filesystem image layer cache, which is the primary objective of the PR.
Description check ✅ Passed The description is directly related to the changeset, explaining the rationale for cache removal, thread-safety issues, and impact on test/benchmark code.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-code-review
Copy link
Copy Markdown
Contributor

Review Summary by Qodo

Remove filesystem image layer cache for thread-safety

🐞 Bug fix ✨ Enhancement

Grey Divider

Walkthroughs

Description
• Remove non-thread-safe filesystem image layer cache entirely
• Eliminate EC_CACHE environment variable from Tekton task definitions
• Simplify OCI client by removing cache initialization and wrapping logic
• Clean up related tests and benchmark code that relied on caching
Diagram
flowchart LR
  A["OCI Client with Cache"] -->|Remove cache logic| B["Simplified OCI Client"]
  C["Tekton Tasks with EC_CACHE"] -->|Remove env var| D["Tekton Tasks without EC_CACHE"]
  E["Cache-dependent Tests"] -->|Simplify/Remove| F["Streamlined Tests"]
  B --> G["Direct remote.Image calls"]
  D --> H["No workaround needed"]
Loading

Grey Divider

File Changes

1. internal/utils/oci/client.go 🐞 Bug fix +1/-39

Remove cache initialization and wrapping logic

• Remove imgCache variable and initCache() function that managed filesystem caching
• Remove imports for os, path, strconv, sync, and cache packages
• Simplify Image() method to directly return remote.Image() without cache wrapping
• Remove stale TODO comment about layer caching from Layer() method

internal/utils/oci/client.go


2. internal/utils/oci/client_test.go 🧪 Tests +11/-32

Remove cache-related tests and simplifications

• Remove entire TestCacheInit() test function that validated cache behavior
• Simplify TestImage() by removing cache verification logic and repeated fetch calls
• Replace logger buffer with io.Discard since blob download counting is no longer needed
• Remove assertions checking that layers are fetched only once due to caching

internal/utils/oci/client_test.go


3. benchmark/simple/simple.go ⚙️ Configuration changes +0/-3

Remove EC_CACHE workaround from benchmark

• Remove os.Setenv("EC_CACHE", "false") call from benchmark setup

benchmark/simple/simple.go


View more (3)
4. tasks/verify-conforma-konflux-ta/0.1/verify-conforma-konflux-ta.yaml ⚙️ Configuration changes +0/-5

Remove EC_CACHE env var from Tekton task

• Remove EC_CACHE environment variable definition with value "false"
• Remove associated comment explaining the thread-safety issue and workaround

tasks/verify-conforma-konflux-ta/0.1/verify-conforma-konflux-ta.yaml


5. tasks/verify-conforma-konflux-vsa-ta/0.1/verify-conforma-konflux-vsa-ta.yaml ⚙️ Configuration changes +0/-8

Remove EC_CACHE env vars from Tekton task

• Remove EC_CACHE environment variable definition from two separate task steps
• Remove associated comments explaining the thread-safety issue and workaround

tasks/verify-conforma-konflux-vsa-ta/0.1/verify-conforma-konflux-vsa-ta.yaml


6. tasks/verify-enterprise-contract/0.1/verify-enterprise-contract.yaml ⚙️ Configuration changes +0/-5

Remove EC_CACHE env var from Tekton task

• Remove EC_CACHE environment variable definition with value "false"
• Remove associated comment explaining the thread-safety issue and workaround

tasks/verify-enterprise-contract/0.1/verify-enterprise-contract.yaml


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown
Contributor

qodo-code-review bot commented Apr 1, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Remediation recommended

1. Unclosed layer readers 🐞 Bug ☼ Reliability
Description
TestImage reads from layer.Uncompressed() but never closes the returned stream, which can leak
resources and keep HTTP connections open across tests. This pattern can lead to flaky test runs when
executed in parallel or under tighter FD/connection limits.
Code

internal/utils/oci/client_test.go[R127-132]

+	for _, l := range layers {
+		r, err := l.Uncompressed()
		require.NoError(t, err)
-		layers, err := img.Layers()
+		_, err = io.ReadAll(r)
		require.NoError(t, err)
-		for _, l := range layers {
-			r, err := l.Uncompressed()
-			require.NoError(t, err)
-			_, err = io.ReadAll(r)
-			require.NoError(t, err)
-		}
	}
-
-	fetchFully()
-	fetchFully()
-	fetchFully()
-
-	blobDownloadCount := strings.Count(l.String(), "GET /v2/repository/image/blobs/sha256:")
-	assert.Equal(t, 5, blobDownloadCount) // three configs fetched each time and two layers fetched only once
Evidence
In TestImage, the Uncompressed() stream is read but not closed. Elsewhere in this repo,
Uncompressed()/signature streams are consistently closed via defer Close(), indicating the expected
usage pattern and avoiding leaks.

internal/utils/oci/client_test.go[119-133]
internal/attestation/attestation.go[62-67]
acceptance/registry/registry.go[271-276]
internal/tracker/oci_test.go[106-109]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`internal/utils/oci/client_test.go` reads from `Uncompressed()` streams but never closes them. These streams are `Close()`-able and should be closed to avoid leaking resources (e.g., HTTP connections / readers) during tests.

## Issue Context
Other parts of the repo consistently `defer reader.Close()` after calling `Uncompressed()`, indicating this is the intended usage pattern.

## Fix Focus Areas
- internal/utils/oci/client_test.go[127-132]
- internal/utils/oci/client_test.go[165-168]

## Expected change
After `r, err := ...Uncompressed()`, add `defer r.Close()` (or explicit `r.Close()` after `io.ReadAll`) inside the loop, ensuring each opened stream is closed.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@robnester-rh robnester-rh force-pushed the EC-1706 branch 2 times, most recently from 7122de9 to 022d176 Compare April 1, 2026 17:49
@github-actions github-actions bot added size: L and removed size: M labels Apr 1, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/utils/oci/client.go (1)

193-193: Consider request-scoped image memoization after cache removal.

Line 193 now guarantees a fresh remote.Image fetch per call. In internal/fetchers/oci/config/config.go (Lines 30-75), FetchConfig() and FetchParentImage() each call oci.NewClient(ctx).Image(ref) separately, which can duplicate registry round-trips for the same reference. Consider sharing one fetched image (or a short-lived request-scoped map keyed by digest) to reduce latency and 429 pressure without reintroducing global cache.

As per coding guidelines "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/utils/oci/client_test.go`:
- Around line 119-121: Tests construct a zero-value defaultClient which leaves
ctx nil, causing trace.StartRegion in defaultClient.Image and
defaultClient.Layer to receive a nil context; initialize the test clients with a
non-nil context (for example defaultClient{ctx: context.Background()} or
context.TODO()) wherever defaultClient{} is used in the tests so Image and Layer
call trace.StartRegion with a valid context.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2fa98bfa-8ae2-4a56-bbe0-bc90856e97e5

📥 Commits

Reviewing files that changed from the base of the PR and between 2ebb3f2 and 7122de9.

📒 Files selected for processing (6)
  • benchmark/simple/simple.go
  • internal/utils/oci/client.go
  • internal/utils/oci/client_test.go
  • tasks/verify-conforma-konflux-ta/0.1/verify-conforma-konflux-ta.yaml
  • tasks/verify-conforma-konflux-vsa-ta/0.1/verify-conforma-konflux-vsa-ta.yaml
  • tasks/verify-enterprise-contract/0.1/verify-enterprise-contract.yaml
💤 Files with no reviewable changes (4)
  • benchmark/simple/simple.go
  • tasks/verify-conforma-konflux-vsa-ta/0.1/verify-conforma-konflux-vsa-ta.yaml
  • tasks/verify-conforma-konflux-ta/0.1/verify-conforma-konflux-ta.yaml
  • tasks/verify-enterprise-contract/0.1/verify-enterprise-contract.yaml

The image layer cache was not thread-safe and caused random
redhat_manifests.redhat_manifests_missing violations when validating
multiple images sharing layers (conforma#1109). All Tekton tasks
already disabled it via EC_CACHE=false, and benchmarking showed no
measurable performance gain from caching.

Remove the cache entirely rather than fixing it:
- Remove imgCache/initCache and cache.Image wrapping from OCI client
- Remove EC_CACHE env var from Tekton task definitions
- Remove EC_CACHE=false from benchmark setup
- Remove related test (TestCacheInit) and simplify TestImage
- Remove stale caching TODO in Layer()

Ref: EC-1706
Ref: EC-1669

Signed-off-by: Rob Nester <rnester@redhat.com>
Assisted-by: Cursor
Made-with: Cursor
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag Coverage Δ
acceptance 55.16% <100.00%> (-0.02%) ⬇️
generative 17.92% <0.00%> (+0.02%) ⬆️
integration 26.67% <0.00%> (+0.04%) ⬆️
unit 69.02% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
benchmark/simple/simple.go 0.00% <ø> (ø)
internal/utils/oci/client.go 54.28% <100.00%> (-2.63%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant