CM-716: Add HTTP01 Challenge Proxy controller #398
sebrandon1 wants to merge 1 commit into openshift:master
Conversation
@sebrandon1: This pull request references CM-716, which is a valid Jira issue. Warning: The referenced Jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Walkthrough

Adds an HTTP01Proxy feature: API types and CRD, controller and reconciliation logic, embedded manifests (DaemonSet, RBAC, ServiceAccount, NetworkPolicy), generated client/informer/lister/applyconfigs, manager wiring, and new image/version env vars in the manager and Makefile.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sebrandon1
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Actionable comments posted: 8
🧹 Nitpick comments (1)
pkg/controller/http01proxy/controller.go (1)
40-42: Context should be passed through, not stored in struct.

The `ctx` field is initialized to `context.Background()` and stored in the Reconciler. Helper methods (e.g., `deleteDaemonSet`, `deleteServiceAccount`) use `r.ctx` instead of the context passed to `Reconcile()`. This means operations won't be cancelled if the reconcile request is cancelled, potentially causing reliability issues with long-running cleanup operations.

♻️ Recommended approach

Pass the context from `Reconcile(ctx, req)` down to helper methods rather than storing it:

```diff
 type Reconciler struct {
 	common.CtrlClient
-	ctx           context.Context
 	eventRecorder record.EventRecorder
 	log           logr.Logger
 	scheme        *runtime.Scheme
 }
```

Then update helper method signatures to accept context and pass it from Reconcile:

```go
func (r *Reconciler) deleteDaemonSet(ctx context.Context, proxy *v1alpha1.HTTP01Proxy) error {
	return r.deleteIfExists(ctx, &appsv1.DaemonSet{}, ...)
}
```

Also applies to: 62-62
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/http01proxy/controller.go` around lines 40 - 42, The struct field ctx should not be stored or initialized to context.Background(); remove the ctx field from Reconciler and stop using r.ctx. Update Reconcile(ctx, req) to pass its ctx into all helper calls (e.g., deleteDaemonSet, deleteServiceAccount and any other helpers) and change those helper signatures to accept ctx context.Context as their first parameter (for example func (r *Reconciler) deleteDaemonSet(ctx context.Context, proxy *v1alpha1.HTTP01Proxy) error) and ensure internal calls like deleteIfExists use the passed ctx instead of r.ctx. Search for uses of r.ctx across the file (and functions like deleteIfExists) and replace them to use the ctx parameter propagated from Reconcile.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml`:
- Around line 26-34: The ClusterRole block granting verbs ["create","update"] on
the cluster-scoped resource "machineconfigs" is too broad; remove the "create"
and "update" verbs from the cluster-wide ClusterRole in
cert-manager-http01-proxy-clusterrole.yaml and instead either (a) move necessary
write privileges into a namespaced Role bound to a dedicated controller
ServiceAccount, or (b) if cluster-scoped writes are unavoidable, replace the
ClusterRole entry with a more restrictive rule using "resourceNames" limited to
the specific deterministic MachineConfig names and bind it only to the dedicated
controller identity; update any RoleBinding/ClusterRoleBinding to target that
controller ServiceAccount accordingly.
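Option (b) above could look roughly like the following sketch. The role name and MachineConfig `resourceNames` are illustrative placeholders, not values from this PR. One real caveat worth noting: Kubernetes RBAC cannot restrict `create` (or `deletecollection`) by `resourceNames`, because the object name is not known at authorization time, so only the `update` verb can be name-scoped this way:

```yaml
# Hypothetical sketch: keep reads broad, but name-scope writes.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cert-manager-http01-proxy-writer   # illustrative name
rules:
- apiGroups: ["machineconfiguration.openshift.io"]
  resources: ["machineconfigs"]
  verbs: ["get", "list"]
- apiGroups: ["machineconfiguration.openshift.io"]
  resources: ["machineconfigs"]
  resourceNames: ["99-http01-proxy-master", "99-http01-proxy-worker"]  # illustrative names
  verbs: ["update"]
```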
In `@bindata/http01-proxy/cert-manager-http01-proxy-daemonset.yaml`:
- Around line 45-52: The securityContext for the proxy container currently
grants NET_ADMIN and runs as root but does not prevent privilege escalation;
update the proxy container's securityContext (the block containing capabilities:
add: - NET_ADMIN / drop: - ALL and runAsNonRoot: false) to include
allowPrivilegeEscalation: false so escalation is explicitly blocked while
keeping the existing capabilities intact.
In `@config/manager/manager.yaml`:
- Around line 89-90: The RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY env var is using
an unpinned image tag ("quay.io/bapalm/cert-mgr-http01-proxy:latest"); change it
to a fixed version tag or digest (for example
"quay.io/bapalm/cert-mgr-http01-proxy:v0.1.0" or a sha256@digest) to match other
operand pins and ensure reproducibility and verifiability—update the value for
RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY accordingly and confirm the chosen
tag/digest matches the tested/released HTTP01Proxy version.
In `@Makefile`:
- Around line 325-330: The RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY value
currently uses the mutable tag ":latest" while HTTP01PROXY_OPERAND_IMAGE_VERSION
is set to "0.1.0"; change RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY to use a fixed
tag or digest that matches HTTP01PROXY_OPERAND_IMAGE_VERSION (or derive the tag
from HTTP01PROXY_OPERAND_IMAGE_VERSION) so both variables are aligned and the
image is pinned for reproducible, reviewable runs; update any
documentation/comments to reflect the pinning.
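One way to keep the two variables aligned, as the prompt suggests, is to derive the tag from the version variable so they can never drift apart. A hypothetical Makefile fragment (the variable names follow this repo's Makefile; the `?=` defaults are an assumption):

```makefile
# Hypothetical sketch: derive the image tag from the pinned version.
HTTP01PROXY_OPERAND_IMAGE_VERSION ?= v0.1.0
RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY ?= quay.io/bapalm/cert-mgr-http01-proxy:$(HTTP01PROXY_OPERAND_IMAGE_VERSION)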
In `@pkg/controller/http01proxy/infrastructure.go`:
- Around line 67-68: The current check only compares info.apiVIPs[0] and
info.ingressVIPs[0]; update the validation to ensure no VIP appears in both
slices anywhere by checking the full lists (e.g., build a set from info.apiVIPs
and test each entry of info.ingressVIPs or vice versa) and if any overlap exists
return a clear error message referencing the duplicated VIP(s). Modify the
comparison around the block that currently uses info.apiVIPs[0] and
info.ingressVIPs[0] to perform this full-list intersection check and return the
same formatted message (including the conflicting VIP(s)) when overlaps are
found.
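The full-list intersection check described above, extracted into a small self-contained function. The `vipOverlap` name and slices-in/VIP-out shape are illustrative, not the operator's actual code:

```go
package main

import "fmt"

// vipOverlap reports the first VIP that appears in both slices,
// checking the full lists rather than only index 0.
func vipOverlap(apiVIPs, ingressVIPs []string) (string, bool) {
	seen := make(map[string]struct{}, len(apiVIPs))
	for _, vip := range apiVIPs {
		seen[vip] = struct{}{}
	}
	for _, vip := range ingressVIPs {
		if _, ok := seen[vip]; ok {
			return vip, true
		}
	}
	return "", false
}

func main() {
	// An index-0 comparison would miss this overlap on the second entries.
	api := []string{"10.0.0.5", "10.0.0.7"}
	ingress := []string{"10.0.0.6", "10.0.0.7"}
	if vip, ok := vipOverlap(api, ingress); ok {
		fmt.Printf("duplicated VIP: %s\n", vip) // duplicated VIP: 10.0.0.7
	}
}
```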
- Around line 33-44: The code is ignoring errors from unstructured.NestedString
and NestedStringSlice which can mask malformed Infrastructure status; update the
extraction around platformType (using unstructured.NestedString) and the VIP
slices (using unstructured.NestedStringSlice) to capture and check the returned
ok/err values, and handle failures explicitly (e.g., return an error from the
function or log and fail fast) instead of silently treating them as unsupported;
specifically modify the block that builds platformInfo (referencing variables
infra.Object, platformType, platformInfo, platformBareMetal, apiVIPs,
ingressVIPs) to validate the extraction results and propagate a clear error when
parsing fails.
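To make the failure modes concrete: the apimachinery helpers return three values — value, found, err — and discarding the last two is what masks malformed status. Below is a simplified, self-contained stand-in for `unstructured.NestedString` (a toy for illustration only, not the real library) showing why both `found` and `err` carry distinct information:

```go
package main

import "fmt"

// nestedString mimics unstructured.NestedString's (value, found, err) shape:
// err signals a type mismatch along the path, found signals a missing field.
func nestedString(obj map[string]interface{}, fields ...string) (string, bool, error) {
	var cur interface{} = obj
	for i, f := range fields {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return "", false, fmt.Errorf("value at %v is not a map", fields[:i])
		}
		cur, ok = m[f]
		if !ok {
			return "", false, nil
		}
	}
	s, ok := cur.(string)
	if !ok {
		return "", false, fmt.Errorf("value at %v is not a string", fields)
	}
	return s, true, nil
}

func main() {
	status := map[string]interface{}{
		"status": map[string]interface{}{
			"platformStatus": map[string]interface{}{"type": "BareMetal"},
		},
	}
	platformType, found, err := nestedString(status, "status", "platformStatus", "type")
	if err != nil {
		panic(fmt.Errorf("failed to parse platformStatus.type: %w", err))
	}
	if !found {
		panic("platformStatus.type not found")
	}
	fmt.Println(platformType) // BareMetal
}
```

Ignoring `err` here conflates "field absent" (possibly a legitimately unsupported platform) with "field malformed" (a real parsing failure that should surface).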
In `@pkg/controller/http01proxy/utils.go`:
- Around line 115-137: The daemonSetSpecModified function currently only checks
a few fields and misses many important pod and controller settings; update it to
detect any meaningful drift by comparing the full managed DaemonSet spec (or at
minimum the full PodTemplateSpec and controller-level fields) rather than only
selector/template labels and container[0] subsets—specifically include
Template.Annotations, Template.Spec (hostNetwork, tolerations, volumes,
securityContext, affinity, dnsPolicy, imagePullSecrets, init containers and all
containers with their names/images/env/ports/resources), and top-level Spec
fields like Strategy; easiest fix: replace the ad-hoc checks in
daemonSetSpecModified with a deep equality comparison between desired.Spec and
fetched.Spec (or between desired.Spec.Template and fetched.Spec.Template) or
switch to server-side apply (SSA) for this resource so any difference in those
fields triggers an update.
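The deep-equality fix can be illustrated with pared-down types (hypothetical structs standing in for the appsv1 types): comparing the whole spec catches drift that an ad-hoc per-field check would miss:

```go
package main

import (
	"fmt"
	"reflect"
)

// podTemplate and daemonSetSpec are pared-down stand-ins for the appsv1
// types; the point is the comparison strategy, not the exact fields.
type podTemplate struct {
	Labels      map[string]string
	Annotations map[string]string
	HostNetwork bool
	Tolerations []string
}

type daemonSetSpec struct {
	Selector map[string]string
	Template podTemplate
}

// specModified compares the whole desired spec against the fetched one,
// so drift in any field (tolerations, hostNetwork, annotations) is caught.
func specModified(desired, fetched daemonSetSpec) bool {
	return !reflect.DeepEqual(desired, fetched)
}

func main() {
	desired := daemonSetSpec{
		Selector: map[string]string{"app": "proxy"},
		Template: podTemplate{HostNetwork: true, Tolerations: []string{"node-role.kubernetes.io/master"}},
	}
	fetched := desired
	fetched.Template.HostNetwork = false // drift a field-by-field check could miss
	fmt.Println(specModified(desired, fetched)) // true
}
```

In a real controller, a naive `DeepEqual` against a server-fetched object tends to report spurious drift caused by API-server defaulting, which is one reason the prompt also offers server-side apply as the alternative.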
- Around line 141-145: The status-update error handling currently discards the
original reconcile error (prependErr) by aggregating only err and errUpdate;
change the aggregation inside the if block that follows r.updateStatus(r.ctx,
proxy) to include prependErr as well (use utilerrors.NewAggregate with
prependErr, err, and errUpdate or otherwise append prependErr into the errors
slice) so the returned aggregate preserves the original reconcile error and the
update failure; modify the code paths around r.updateStatus, prependErr,
errUpdate, and utilerrors.NewAggregate accordingly.
---
Nitpick comments:
In `@pkg/controller/http01proxy/controller.go`:
- Around line 40-42: The struct field ctx should not be stored or initialized to
context.Background(); remove the ctx field from Reconciler and stop using r.ctx.
Update Reconcile(ctx, req) to pass its ctx into all helper calls (e.g.,
deleteDaemonSet, deleteServiceAccount and any other helpers) and change those
helper signatures to accept ctx context.Context as their first parameter (for
example func (r *Reconciler) deleteDaemonSet(ctx context.Context, proxy
*v1alpha1.HTTP01Proxy) error) and ensure internal calls like deleteIfExists use
the passed ctx instead of r.ctx. Search for uses of r.ctx across the file (and
functions like deleteIfExists) and replace them to use the ctx parameter
propagated from Reconcile.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9c15de22-9f29-499f-b07d-8f12c902ce3e
📒 Files selected for processing (44)
Makefile
api/operator/v1alpha1/features.go
api/operator/v1alpha1/http01proxy_types.go
api/operator/v1alpha1/zz_generated.deepcopy.go
bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml
bindata/http01-proxy/cert-manager-http01-proxy-clusterrolebinding.yaml
bindata/http01-proxy/cert-manager-http01-proxy-daemonset.yaml
bindata/http01-proxy/cert-manager-http01-proxy-scc-rolebinding.yaml
bindata/http01-proxy/cert-manager-http01-proxy-serviceaccount.yaml
bindata/networkpolicies/http01-proxy-allow-egress-networkpolicy.yaml
bindata/networkpolicies/http01-proxy-deny-all-networkpolicy.yaml
config/crd/bases/operator.openshift.io_http01proxies.yaml
config/manager/manager.yaml
config/rbac/role.yaml
config/samples/operator.openshift.io_v1alpha1_http01proxy.yaml
pkg/controller/http01proxy/constants.go
pkg/controller/http01proxy/controller.go
pkg/controller/http01proxy/daemonsets.go
pkg/controller/http01proxy/infrastructure.go
pkg/controller/http01proxy/install_http01proxy.go
pkg/controller/http01proxy/networkpolicies.go
pkg/controller/http01proxy/rbacs.go
pkg/controller/http01proxy/serviceaccounts.go
pkg/controller/http01proxy/utils.go
pkg/features/features_test.go
pkg/operator/applyconfigurations/internal/internal.go
pkg/operator/applyconfigurations/operator/v1alpha1/http01proxy.go
pkg/operator/applyconfigurations/operator/v1alpha1/http01proxycustomdeploymentspec.go
pkg/operator/applyconfigurations/operator/v1alpha1/http01proxyspec.go
pkg/operator/applyconfigurations/operator/v1alpha1/http01proxystatus.go
pkg/operator/applyconfigurations/utils.go
pkg/operator/assets/bindata.go
pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_http01proxy.go
pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_operator_client.go
pkg/operator/clientset/versioned/typed/operator/v1alpha1/generated_expansion.go
pkg/operator/clientset/versioned/typed/operator/v1alpha1/http01proxy.go
pkg/operator/clientset/versioned/typed/operator/v1alpha1/operator_client.go
pkg/operator/informers/externalversions/generic.go
pkg/operator/informers/externalversions/operator/v1alpha1/http01proxy.go
pkg/operator/informers/externalversions/operator/v1alpha1/interface.go
pkg/operator/listers/operator/v1alpha1/expansion_generated.go
pkg/operator/listers/operator/v1alpha1/http01proxy.go
pkg/operator/setup_manager.go
pkg/operator/starter.go
```yaml
- apiGroups:
  - machineconfiguration.openshift.io
  resources:
  - machineconfigs
  verbs:
  - get
  - list
  - create
  - update
```
Tighten MachineConfig write permissions (least-privilege gap).
This grants cluster-wide create/update on machineconfigs, which is high-impact node-level access. Scope these verbs to only what is strictly required (for example, isolate write access to a dedicated controller identity and constrain by deterministic resource names where possible).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml` around lines
26 - 34, The ClusterRole block granting verbs ["create","update"] on the
cluster-scoped resource "machineconfigs" is too broad; remove the "create" and
"update" verbs from the cluster-wide ClusterRole in
cert-manager-http01-proxy-clusterrole.yaml and instead either (a) move necessary
write privileges into a namespaced Role bound to a dedicated controller
ServiceAccount, or (b) if cluster-scoped writes are unavoidable, replace the
ClusterRole entry with a more restrictive rule using "resourceNames" limited to
the specific deterministic MachineConfig names and bind it only to the dedicated
controller identity; update any RoleBinding/ClusterRoleBinding to target that
controller ServiceAccount accordingly.
```yaml
- name: RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY
  value: quay.io/bapalm/cert-mgr-http01-proxy:latest
```
Avoid using :latest tag for operand images.
The HTTP01Proxy image uses :latest while all other operand images use pinned version tags (e.g., v1.19.4, v0.16.0). Using :latest is problematic for:
- Reproducibility: Builds can change without notice.
- Auditability: No traceability to a specific image digest.
- Supply chain security: Breaks image signature verification workflows.
Consider pinning to a specific version tag (e.g., v0.1.0 to match the operand version) before merging.
Suggested fix
```diff
 - name: RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY
-  value: quay.io/bapalm/cert-mgr-http01-proxy:latest
+  value: quay.io/bapalm/cert-mgr-http01-proxy:v0.1.0
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
- name: RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY
  value: quay.io/bapalm/cert-mgr-http01-proxy:v0.1.0
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/manager/manager.yaml` around lines 89 - 90, The
RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY env var is using an unpinned image tag
("quay.io/bapalm/cert-mgr-http01-proxy:latest"); change it to a fixed version
tag or digest (for example "quay.io/bapalm/cert-mgr-http01-proxy:v0.1.0" or a
sha256@digest) to match other operand pins and ensure reproducibility and
verifiability—update the value for RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY
accordingly and confirm the chosen tag/digest matches the tested/released
HTTP01Proxy version.
```makefile
RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY=quay.io/bapalm/cert-mgr-http01-proxy:latest \
OPERATOR_NAME=cert-manager-operator \
OPERAND_IMAGE_VERSION=$(BUNDLE_VERSION) \
ISTIOCSR_OPERAND_IMAGE_VERSION=$(ISTIO_CSR_VERSION) \
TRUSTMANAGER_OPERAND_IMAGE_VERSION=$(TRUST_MANAGER_VERSION) \
HTTP01PROXY_OPERAND_IMAGE_VERSION=0.1.0 \
```
Avoid mutable latest for the HTTP01 proxy image.
Using :latest here makes local runs non-reproducible and can silently pull unreviewed bits; it also conflicts with the explicit HTTP01PROXY_OPERAND_IMAGE_VERSION=0.1.0. Pin this to a fixed tag (or digest) and keep both variables aligned.
Suggested fix
```diff
-RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY=quay.io/bapalm/cert-mgr-http01-proxy:latest \
+RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY=quay.io/bapalm/cert-mgr-http01-proxy:v0.1.0 \
 ...
-HTTP01PROXY_OPERAND_IMAGE_VERSION=0.1.0 \
+HTTP01PROXY_OPERAND_IMAGE_VERSION=v0.1.0 \
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```makefile
RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY=quay.io/bapalm/cert-mgr-http01-proxy:v0.1.0 \
OPERATOR_NAME=cert-manager-operator \
OPERAND_IMAGE_VERSION=$(BUNDLE_VERSION) \
ISTIOCSR_OPERAND_IMAGE_VERSION=$(ISTIO_CSR_VERSION) \
TRUSTMANAGER_OPERAND_IMAGE_VERSION=$(TRUST_MANAGER_VERSION) \
HTTP01PROXY_OPERAND_IMAGE_VERSION=v0.1.0 \
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Makefile` around lines 325 - 330, The RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY
value currently uses the mutable tag ":latest" while
HTTP01PROXY_OPERAND_IMAGE_VERSION is set to "0.1.0"; change
RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY to use a fixed tag or digest that matches
HTTP01PROXY_OPERAND_IMAGE_VERSION (or derive the tag from
HTTP01PROXY_OPERAND_IMAGE_VERSION) so both variables are aligned and the image
is pinned for reproducible, reviewable runs; update any documentation/comments
to reflect the pinning.
```go
platformType, _, _ := unstructured.NestedString(infra.Object, "status", "platformStatus", "type")

info := &platformInfo{
	platformType: platformType,
}

// Extract VIPs based on platform type
switch platformType {
case platformBareMetal:
	apiVIPs, _, _ := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "apiServerInternalIPs")
	ingressVIPs, _, _ := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "ingressIPs")
	info.apiVIPs = apiVIPs
```
Don’t ignore unstructured extraction errors.
Lines 33 and 42-43 ignore parse errors from NestedString/NestedStringSlice. A malformed or unexpected Infrastructure status can be silently treated as “unsupported,” which is the wrong failure mode.
Suggested patch
```diff
-	platformType, _, _ := unstructured.NestedString(infra.Object, "status", "platformStatus", "type")
+	platformType, found, err := unstructured.NestedString(infra.Object, "status", "platformStatus", "type")
+	if err != nil {
+		return nil, fmt.Errorf("failed to parse infrastructure status.platformStatus.type: %w", err)
+	}
+	if !found {
+		return nil, fmt.Errorf("infrastructure status.platformStatus.type not found")
+	}
@@
 	switch platformType {
 	case platformBareMetal:
-		apiVIPs, _, _ := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "apiServerInternalIPs")
-		ingressVIPs, _, _ := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "ingressIPs")
+		apiVIPs, _, err := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "apiServerInternalIPs")
+		if err != nil {
+			return nil, fmt.Errorf("failed to parse baremetal.apiServerInternalIPs: %w", err)
+		}
+		ingressVIPs, _, err := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "ingressIPs")
+		if err != nil {
+			return nil, fmt.Errorf("failed to parse baremetal.ingressIPs: %w", err)
+		}
 		info.apiVIPs = apiVIPs
 		info.ingressVIPs = ingressVIPs
 	}
```

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
platformType, found, err := unstructured.NestedString(infra.Object, "status", "platformStatus", "type")
if err != nil {
	return nil, fmt.Errorf("failed to parse infrastructure status.platformStatus.type: %w", err)
}
if !found {
	return nil, fmt.Errorf("infrastructure status.platformStatus.type not found")
}

info := &platformInfo{
	platformType: platformType,
}

// Extract VIPs based on platform type
switch platformType {
case platformBareMetal:
	apiVIPs, _, err := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "apiServerInternalIPs")
	if err != nil {
		return nil, fmt.Errorf("failed to parse baremetal.apiServerInternalIPs: %w", err)
	}
	ingressVIPs, _, err := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "ingressIPs")
	if err != nil {
		return nil, fmt.Errorf("failed to parse baremetal.ingressIPs: %w", err)
	}
	info.apiVIPs = apiVIPs
	info.ingressVIPs = ingressVIPs
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/http01proxy/infrastructure.go` around lines 33 - 44, The code
is ignoring errors from unstructured.NestedString and NestedStringSlice which
can mask malformed Infrastructure status; update the extraction around
platformType (using unstructured.NestedString) and the VIP slices (using
unstructured.NestedStringSlice) to capture and check the returned ok/err values,
and handle failures explicitly (e.g., return an error from the function or log
and fail fast) instead of silently treating them as unsupported; specifically
modify the block that builds platformInfo (referencing variables infra.Object,
platformType, platformInfo, platformBareMetal, apiVIPs, ingressVIPs) to validate
the extraction results and propagate a clear error when parsing fails.
```go
func daemonSetSpecModified(desired, fetched *appsv1.DaemonSet) bool {
	if !reflect.DeepEqual(desired.Spec.Selector.MatchLabels, fetched.Spec.Selector.MatchLabels) {
		return true
	}
	if !reflect.DeepEqual(desired.Spec.Template.Labels, fetched.Spec.Template.Labels) ||
		len(desired.Spec.Template.Spec.Containers) != len(fetched.Spec.Template.Spec.Containers) {
		return true
	}
	if len(desired.Spec.Template.Spec.Containers) > 0 && len(fetched.Spec.Template.Spec.Containers) > 0 {
		desiredContainer := desired.Spec.Template.Spec.Containers[0]
		fetchedContainer := fetched.Spec.Template.Spec.Containers[0]
		if desiredContainer.Image != fetchedContainer.Image ||
			desiredContainer.Name != fetchedContainer.Name ||
			!reflect.DeepEqual(desiredContainer.Env, fetchedContainer.Env) ||
			!reflect.DeepEqual(desiredContainer.Ports, fetchedContainer.Ports) {
			return true
		}
	}
	if desired.Spec.Template.Spec.ServiceAccountName != fetched.Spec.Template.Spec.ServiceAccountName ||
		!reflect.DeepEqual(desired.Spec.Template.Spec.NodeSelector, fetched.Spec.Template.Spec.NodeSelector) {
		return true
	}
	return false
}
```
The DaemonSet drift check misses critical pod spec fields.
This only compares selector labels, template labels, container[0] image/name/env/ports, serviceAccountName, and nodeSelector. Changes to hostNetwork, tolerations, volumes, security context/capabilities, template annotations, strategy, or additional containers will not trigger an update, so upgrades can leave the live proxy on stale networking or privilege settings. Please compare the full managed spec or move this resource to SSA.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/http01proxy/utils.go` around lines 115 - 137, The
daemonSetSpecModified function currently only checks a few fields and misses
many important pod and controller settings; update it to detect any meaningful
drift by comparing the full managed DaemonSet spec (or at minimum the full
PodTemplateSpec and controller-level fields) rather than only selector/template
labels and container[0] subsets—specifically include Template.Annotations,
Template.Spec (hostNetwork, tolerations, volumes, securityContext, affinity,
dnsPolicy, imagePullSecrets, init containers and all containers with their
names/images/env/ports/resources), and top-level Spec fields like Strategy;
easiest fix: replace the ad-hoc checks in daemonSetSpecModified with a deep
equality comparison between desired.Spec and fetched.Spec (or between
desired.Spec.Template and fetched.Spec.Template) or switch to server-side apply
(SSA) for this resource so any difference in those fields triggers an update.
Force-pushed from 28e765b to baa4972
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
bundle/manifests/cert-manager-operator.clusterserviceversion.yaml (1)
347-399: ⚠️ Potential issue | 🟠 Major

Declare `HTTP01Proxy` as an owned CRD in the CSV.

This PR now exposes `http01proxies` in CSV RBAC, but `spec.customresourcedefinitions.owned` still doesn't list `http01proxies.operator.openshift.io`. OLM metadata for the new API stays incomplete until that entry is added.

Suggested patch

```diff
   - description: CertManager is the Schema for the certmanagers API
     displayName: CertManager
     kind: CertManager
     name: certmanagers.operator.openshift.io
     version: v1alpha1
+  - description: HTTP01Proxy describes the configuration for the managed HTTP-01 challenge proxy.
+    displayName: HTTP01Proxy
+    kind: HTTP01Proxy
+    name: http01proxies.operator.openshift.io
+    version: v1alpha1
   - description: Challenge is a type to represent a Challenge request with an ACME server
```

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

Also applies to: 657-658
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bundle/manifests/cert-manager-operator.clusterserviceversion.yaml` around lines 347 - 399, The CSV is missing an owned CRD entry for the new HTTP01Proxy API: add an entry under spec.customresourcedefinitions.owned with displayName "HTTP01Proxy", kind "HTTP01Proxy", name "http01proxies.operator.openshift.io" and the appropriate version (e.g., v1alpha1) so OLM recognizes the CRD as owned; update the list near the other owned CRDs (similar to entries for IstioCSR/TrustManager) to match the RBAC exposure of http01proxies.
♻️ Duplicate comments (5)

bindata/http01-proxy/cert-manager-http01-proxy-daemonset.yaml (1)

45-51: ⚠️ Potential issue | 🟠 Major

Set `allowPrivilegeEscalation: false` on the proxy container.

This pod already runs with `hostNetwork`, root, and `NET_ADMIN`; leaving privilege escalation implicit keeps the attack surface wider than necessary for a very privileged workload.

Suggested patch

```diff
 securityContext:
+  allowPrivilegeEscalation: false
   capabilities:
     add:
     - NET_ADMIN
     drop:
     - ALL
   runAsNonRoot: false
```

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/http01-proxy/cert-manager-http01-proxy-daemonset.yaml` around lines 45 - 51, The container securityContext is missing allowPrivilegeEscalation explicitly and currently runAsNonRoot is false while adding NET_ADMIN and hostNetwork increases risk; update the container's securityContext for the proxy (the block where "securityContext:", "capabilities: add: - NET_ADMIN", and "runAsNonRoot: false" are defined) to set allowPrivilegeEscalation: false so privilege escalation is explicitly denied for the privileged workload, keeping other capability and runAsNonRoot settings as intended.

pkg/controller/http01proxy/infrastructure.go (2)
66-68: ⚠️ Potential issue | 🟠 Major

Check the full VIP sets for overlap, not just index 0.

This only compares the first API/ingress VIP. Multi-VIP baremetal clusters can still overlap on later entries, which would wrongly let the proxy deploy onto an unsupported topology.

Suggested patch

```diff
-	// If API VIP == Ingress VIP, proxy is not needed
-	if info.apiVIPs[0] == info.ingressVIPs[0] {
-		return fmt.Sprintf("API VIP (%s) and ingress VIP (%s) are the same; HTTP01 proxy is not needed", info.apiVIPs[0], info.ingressVIPs[0])
-	}
+	// If any API VIP overlaps any ingress VIP, the proxy is not needed.
+	apiVIPSet := make(map[string]struct{}, len(info.apiVIPs))
+	for _, vip := range info.apiVIPs {
+		apiVIPSet[vip] = struct{}{}
+	}
+	for _, vip := range info.ingressVIPs {
+		if _, found := apiVIPSet[vip]; found {
+			return fmt.Sprintf("API VIP (%s) and ingress VIP (%s) are the same; HTTP01 proxy is not needed", vip, vip)
+		}
+	}
```

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/http01proxy/infrastructure.go` around lines 66 - 68, The current check only compares info.apiVIPs[0] and info.ingressVIPs[0]; change it to detect any overlap between the two slices instead. Replace the single-index comparison with a proper intersection test (e.g., build a set from info.apiVIPs and check each entry of info.ingressVIPs for membership, or vice versa) and return the same message when any VIP exists in both sets; update the conditional that uses info.apiVIPs and info.ingressVIPs accordingly so multi-VIP clusters with any shared VIP are handled correctly.
33-43:⚠️ Potential issue | 🟠 MajorPropagate Infrastructure parse failures instead of masking them.
If any of these
unstructured.NestedString*calls fail,validatePlatformfalls back to generic "unsupported" messages and hides the real problem. Please checkfound/errand return a parse error here.Suggested patch
- platformType, _, _ := unstructured.NestedString(infra.Object, "status", "platformStatus", "type") + platformType, found, err := unstructured.NestedString(infra.Object, "status", "platformStatus", "type") + if err != nil { + return nil, fmt.Errorf("failed to parse infrastructure status.platformStatus.type: %w", err) + } + if !found { + return nil, fmt.Errorf("infrastructure status.platformStatus.type not found") + } @@ switch platformType { case platformBareMetal: - apiVIPs, _, _ := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "apiServerInternalIPs") - ingressVIPs, _, _ := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "ingressIPs") + apiVIPs, _, err := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "apiServerInternalIPs") + if err != nil { + return nil, fmt.Errorf("failed to parse baremetal.apiServerInternalIPs: %w", err) + } + ingressVIPs, _, err := unstructured.NestedStringSlice(infra.Object, "status", "platformStatus", "baremetal", "ingressIPs") + if err != nil { + return nil, fmt.Errorf("failed to parse baremetal.ingressIPs: %w", err) + } info.apiVIPs = apiVIPs info.ingressVIPs = ingressVIPs }As per coding guidelines, "-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/http01proxy/infrastructure.go` around lines 33 - 43, The current infrastructure parsing uses unstructured.NestedString and unstructured.NestedStringSlice but ignores the found and err values, which masks parse failures and causes validatePlatform to return generic "unsupported" messages; update the parsing in pkg/controller/http01proxy/infrastructure.go (around platformInfo construction and the switch handling platformBareMetal) to check the returned (found, err) for each unstructured.NestedString/NestedStringSlice call and, on error or missing expected fields, return a clear parse error (propagate instead of swallowing) so callers like validatePlatform receive and can report the real parsing failure; reference the platformInfo struct, platformBareMetal constant, and the unstructured.NestedString/NestedStringSlice calls when making the changes.pkg/controller/http01proxy/utils.go (2)
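The `(value, found, err)` contract being propagated rather than swallowed can be shown in a self-contained sketch. `nestedString` here is a simplified stand-in for `unstructured.NestedString` from k8s.io/apimachinery, not the real helper:

```go
package main

import (
	"errors"
	"fmt"
)

// nestedString mimics the (value, found, err) contract of
// unstructured.NestedString on a plain map.
func nestedString(obj map[string]interface{}, fields ...string) (string, bool, error) {
	var cur interface{} = obj
	for _, f := range fields {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return "", false, fmt.Errorf("field %q: parent is not a map", f)
		}
		cur, ok = m[f]
		if !ok {
			return "", false, nil
		}
	}
	s, ok := cur.(string)
	if !ok {
		return "", false, errors.New("value is not a string")
	}
	return s, true, nil
}

// platformType checks both found and err, so parse failures surface
// instead of collapsing into a generic "unsupported" message.
func platformType(obj map[string]interface{}) (string, error) {
	v, found, err := nestedString(obj, "status", "platformStatus", "type")
	if err != nil {
		return "", fmt.Errorf("failed to parse status.platformStatus.type: %w", err)
	}
	if !found {
		return "", errors.New("status.platformStatus.type not found")
	}
	return v, nil
}

func main() {
	good := map[string]interface{}{"status": map[string]interface{}{
		"platformStatus": map[string]interface{}{"type": "BareMetal"}}}
	v, err := platformType(good)
	fmt.Println(v, err) // BareMetal <nil>

	bad := map[string]interface{}{"status": "oops"}
	_, err = platformType(bad)
	fmt.Println(err != nil) // true: malformed status is reported, not masked
}
```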
115-137: ⚠️ Potential issue | 🟠 Major
DaemonSet drift detection is still too narrow.
This only checks a small subset of `DaemonSetSpec`, so changes to `hostNetwork`, tolerations, volumes, security context, template annotations, strategy, or any non-zero-index container will not trigger an update. That can leave the live proxy on stale networking or privilege settings after an upgrade. Please compare the full managed spec/template here or move this resource to SSA.
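Why full-spec comparison matters can be shown with plain stand-in structs (the real types come from k8s.io/api, and a real fix would use an apimachinery semantic-equality helper rather than bare `reflect.DeepEqual`):

```go
package main

import (
	"fmt"
	"reflect"
)

// podSpec is a simplified stand-in for the DaemonSet pod template spec.
type podSpec struct {
	Image       string
	HostNetwork bool
	Tolerations []string
}

// narrowModified mimics the current comparator: it only checks the image,
// so a hostNetwork or toleration change goes undetected.
func narrowModified(desired, fetched podSpec) bool {
	return desired.Image != fetched.Image
}

// fullModified compares the whole template spec, catching any drift.
func fullModified(desired, fetched podSpec) bool {
	return !reflect.DeepEqual(desired, fetched)
}

func main() {
	desired := podSpec{Image: "proxy:0.1.0", HostNetwork: true}
	fetched := podSpec{Image: "proxy:0.1.0", HostNetwork: false} // drifted field
	fmt.Println(narrowModified(desired, fetched)) // false: drift missed
	fmt.Println(fullModified(desired, fetched))   // true: drift caught
}
```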
141-145: ⚠️ Potential issue | 🟠 Major
Keep the original reconcile error in the aggregate.
This branch still returns the status-update failure twice and drops `prependErr`, which hides the root cause and can change requeue behavior.

Suggested fix:

```diff
 errUpdate := fmt.Errorf("failed to update %s/%s status: %w", proxy.GetNamespace(), proxy.GetName(), err)
 if prependErr != nil {
-	return utilerrors.NewAggregate([]error{err, errUpdate})
+	return utilerrors.NewAggregate([]error{prependErr, errUpdate})
 }
```
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/http01proxy/utils.go` around lines 141 - 145, The status-update error branch currently aggregates the update error and the update-creation error (err and errUpdate) and drops the original reconcile error stored in prependErr; change the logic in the block that calls r.updateStatus so that when prependErr != nil you return utilerrors.NewAggregate([]error{prependErr, errUpdate}) (i.e., include the original reconcile error), and when prependErr == nil return errUpdate (or the single update error), ensuring you reference the existing symbols updateStatus, prependErr, errUpdate and utilerrors.NewAggregate and avoid duplicating the same error twice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bundle/manifests/cert-manager-operator.clusterserviceversion.yaml`:
- Around line 797-798: The RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY entry is using
an immutably-tagged image (quay.io/...:latest) which should be pinned; replace
the :latest tag with the immutable release reference used by the operator (use
the same value as HTTP01PROXY_OPERAND_IMAGE_VERSION or hardcode that version,
e.g. :0.1.0) so the bundle is reproducible, and apply the same change to the
other RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY occurrences noted in the file (the
other RELATED_IMAGE entries mentioned). Ensure the RELATED_IMAGE value exactly
matches the operand version the operator publishes
(HTTP01PROXY_OPERAND_IMAGE_VERSION).
In `@pkg/controller/http01proxy/utils.go`:
- Around line 140-141: The updateCondition function currently uses the stored
background context r.ctx which can ignore reconcile cancellation; change the
signature of updateCondition to accept a context parameter (ctx
context.Context), propagate that ctx into the call to r.updateStatus (replace
r.ctx with ctx), and update all callers (notably Reconciler.Reconcile) to pass
the incoming request context instead of relying on r.ctx; ensure any other
internal uses of updateCondition are updated similarly and remove reliance on
r.ctx for status updates so status updates respect deadlines/cancellations.
In `@pkg/features/features_test.go`:
- Around line 114-117: The current assertion compares isPreGA to !spec.Default
(using assert.Equal), which forces GA features to be default-enabled; change the
test to only assert that pre-GA features are disabled by making the check
conditional: compute isPreGA (as you already do using spec.PreRelease,
featuregate.Alpha and featuregate.Beta) and then if isPreGA assert that
spec.Default is false (e.g., assert.False on spec.Default with the same message
referencing feat and spec.PreRelease); do not assert anything for non-pre-GA
features so GA features can be intentionally default-off.
---
Outside diff comments:
In `@bundle/manifests/cert-manager-operator.clusterserviceversion.yaml`:
- Around line 347-399: The CSV is missing an owned CRD entry for the new
HTTP01Proxy API: add an entry under spec.customresourcedefinitions.owned with
displayName "HTTP01Proxy", kind "HTTP01Proxy", name
"http01proxies.operator.openshift.io" and the appropriate version (e.g.,
v1alpha1) so OLM recognizes the CRD as owned; update the list near the other
owned CRDs (similar to entries for IstioCSR/TrustManager) to match the RBAC
exposure of http01proxies.
---
Duplicate comments:
In `@bindata/http01-proxy/cert-manager-http01-proxy-daemonset.yaml`:
- Around line 45-51: The container securityContext is missing
allowPrivilegeEscalation explicitly and currently runAsNonRoot is false while
adding NET_ADMIN and hostNetwork increases risk; update the container's
securityContext for the proxy (the block where "securityContext:",
"capabilities: add: - NET_ADMIN", and "runAsNonRoot: false" are defined) to set
allowPrivilegeEscalation: false so privilege escalation is explicitly denied for
the privileged workload, keeping other capability and runAsNonRoot settings as
intended.
In `@pkg/controller/http01proxy/infrastructure.go`:
- Around line 66-68: The current check only compares info.apiVIPs[0] and
info.ingressVIPs[0]; change it to detect any overlap between the two slices
instead. Replace the single-index comparison with a proper intersection test
(e.g., build a set from info.apiVIPs and check each entry of info.ingressVIPs
for membership, or vice versa) and return the same message when any VIP exists
in both sets; update the conditional that uses info.apiVIPs and info.ingressVIPs
accordingly so multi-VIP clusters with any shared VIP are handled correctly.
- Around line 33-43: The current infrastructure parsing uses
unstructured.NestedString and unstructured.NestedStringSlice but ignores the
found and err values, which masks parse failures and causes validatePlatform to
return generic "unsupported" messages; update the parsing in
pkg/controller/http01proxy/infrastructure.go (around platformInfo construction
and the switch handling platformBareMetal) to check the returned (found, err)
for each unstructured.NestedString/NestedStringSlice call and, on error or
missing expected fields, return a clear parse error (propagate instead of
swallowing) so callers like validatePlatform receive and can report the real
parsing failure; reference the platformInfo struct, platformBareMetal constant,
and the unstructured.NestedString/NestedStringSlice calls when making the
changes.
In `@pkg/controller/http01proxy/utils.go`:
- Around line 141-145: The status-update error branch currently aggregates the
update error and the update-creation error (err and errUpdate) and drops the
original reconcile error stored in prependErr; change the logic in the block
that calls r.updateStatus so that when prependErr != nil you return
utilerrors.NewAggregate([]error{prependErr, errUpdate}) (i.e., include the
original reconcile error), and when prependErr == nil return errUpdate (or the
single update error), ensuring you reference the existing symbols updateStatus,
prependErr, errUpdate and utilerrors.NewAggregate and avoid duplicating the same
error twice.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5f8af34a-6eb0-4edc-86ac-c64d16215328
📒 Files selected for processing (45)
- Makefile
- api/operator/v1alpha1/features.go
- api/operator/v1alpha1/http01proxy_types.go
- api/operator/v1alpha1/zz_generated.deepcopy.go
- bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-clusterrolebinding.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-daemonset.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-scc-rolebinding.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-serviceaccount.yaml
- bindata/networkpolicies/http01-proxy-allow-egress-networkpolicy.yaml
- bindata/networkpolicies/http01-proxy-deny-all-networkpolicy.yaml
- bundle/manifests/cert-manager-operator.clusterserviceversion.yaml
- config/crd/bases/operator.openshift.io_http01proxies.yaml
- config/manager/manager.yaml
- config/rbac/role.yaml
- config/samples/operator.openshift.io_v1alpha1_http01proxy.yaml
- pkg/controller/http01proxy/constants.go
- pkg/controller/http01proxy/controller.go
- pkg/controller/http01proxy/daemonsets.go
- pkg/controller/http01proxy/infrastructure.go
- pkg/controller/http01proxy/install_http01proxy.go
- pkg/controller/http01proxy/networkpolicies.go
- pkg/controller/http01proxy/rbacs.go
- pkg/controller/http01proxy/serviceaccounts.go
- pkg/controller/http01proxy/utils.go
- pkg/features/features_test.go
- pkg/operator/applyconfigurations/internal/internal.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxy.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxycustomdeploymentspec.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxyspec.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxystatus.go
- pkg/operator/applyconfigurations/utils.go
- pkg/operator/assets/bindata.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_http01proxy.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_operator_client.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/generated_expansion.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/http01proxy.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/operator_client.go
- pkg/operator/informers/externalversions/generic.go
- pkg/operator/informers/externalversions/operator/v1alpha1/http01proxy.go
- pkg/operator/informers/externalversions/operator/v1alpha1/interface.go
- pkg/operator/listers/operator/v1alpha1/expansion_generated.go
- pkg/operator/listers/operator/v1alpha1/http01proxy.go
- pkg/operator/setup_manager.go
- pkg/operator/starter.go
✅ Files skipped from review due to trivial changes (18)
- pkg/operator/applyconfigurations/utils.go
- bindata/http01-proxy/cert-manager-http01-proxy-serviceaccount.yaml
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/generated_expansion.go
- bindata/http01-proxy/cert-manager-http01-proxy-scc-rolebinding.yaml
- bindata/networkpolicies/http01-proxy-allow-egress-networkpolicy.yaml
- config/samples/operator.openshift.io_v1alpha1_http01proxy.yaml
- bindata/networkpolicies/http01-proxy-deny-all-networkpolicy.yaml
- pkg/operator/applyconfigurations/internal/internal.go
- pkg/operator/listers/operator/v1alpha1/expansion_generated.go
- pkg/operator/informers/externalversions/generic.go
- bindata/http01-proxy/cert-manager-http01-proxy-clusterrolebinding.yaml
- config/manager/manager.yaml
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxyspec.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_http01proxy.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxystatus.go
- config/crd/bases/operator.openshift.io_http01proxies.yaml
- pkg/controller/http01proxy/networkpolicies.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/http01proxy.go
🚧 Files skipped from review as they are similar to previous changes (16)
- api/operator/v1alpha1/features.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_operator_client.go
- bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml
- pkg/operator/starter.go
- Makefile
- pkg/controller/http01proxy/serviceaccounts.go
- config/rbac/role.yaml
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxycustomdeploymentspec.go
- pkg/operator/informers/externalversions/operator/v1alpha1/interface.go
- pkg/controller/http01proxy/rbacs.go
- pkg/controller/http01proxy/install_http01proxy.go
- pkg/operator/informers/externalversions/operator/v1alpha1/http01proxy.go
- api/operator/v1alpha1/http01proxy_types.go
- pkg/controller/http01proxy/daemonsets.go
- pkg/controller/http01proxy/controller.go
- pkg/operator/assets/bindata.go
```yaml
- name: RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY
  value: quay.io/bapalm/cert-mgr-http01-proxy:latest
```
Pin the HTTP01 proxy image to an immutable release reference.
Using :latest here makes the bundle non-reproducible, and it can drift away from HTTP01PROXY_OPERAND_IMAGE_VERSION=0.1.0. Because the operator consumes this env var and republishes it in relatedImages, a retag silently changes what clusters deploy and mirror.
As per coding guidelines, "-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
Also applies to: 805-806, 923-924
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bundle/manifests/cert-manager-operator.clusterserviceversion.yaml` around
lines 797 - 798, The RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY entry is using an
immutably-tagged image (quay.io/...:latest) which should be pinned; replace the
:latest tag with the immutable release reference used by the operator (use the
same value as HTTP01PROXY_OPERAND_IMAGE_VERSION or hardcode that version, e.g.
:0.1.0) so the bundle is reproducible, and apply the same change to the other
RELATED_IMAGE_CERT_MANAGER_HTTP01PROXY occurrences noted in the file (the other
RELATED_IMAGE entries mentioned). Ensure the RELATED_IMAGE value exactly matches
the operand version the operator publishes (HTTP01PROXY_OPERAND_IMAGE_VERSION).
Add the HTTP01 Challenge Proxy feature to cert-manager-operator, enabling cert-manager to complete HTTP01 ACME challenges for the API endpoint on baremetal platforms where the API VIP is not exposed via OpenShift Ingress. This follows the IstioCSR/TrustManager controller pattern and includes:

- HTTP01Proxy CRD with mode (DefaultDeployment/CustomDeployment) and optional customDeployment.internalPort configuration
- Full controller-runtime reconciler with finalizer handling, status conditions (Ready/Degraded), and managed resource watches
- Platform validation via Infrastructure CR — only deploys on BareMetal platforms with distinct API/Ingress VIPs, sets Degraded condition with descriptive message on unsupported platforms
- DaemonSet deployment on control plane nodes with hostNetwork, nftables NET_ADMIN capability, and privileged SCC
- RBAC for reading cluster config and managing MachineConfig resources
- NetworkPolicies for deny-all and allow-egress
- FeatureHTTP01Proxy feature gate (Alpha, default: false)
- Integrated with unified cache builder and common package utilities
- Controller-runtime metrics on :8085 to avoid library-go port conflict
- Generated clientset, informers, listers, and apply configurations

Ref: openshift/enhancements#1929
baa4972 to 8fc4aa5
Actionable comments posted: 3
♻️ Duplicate comments (1)
pkg/controller/http01proxy/utils.go (1)
115-137: ⚠️ Potential issue | 🟠 Major
DaemonSet drift detection still misses networking and privilege changes.
This comparator only checks labels, first-container image/env/ports, `serviceAccountName`, and `nodeSelector`. Hardening or rollout changes in the current manifest—`hostNetwork`, tolerations, `allowPrivilegeEscalation`, capabilities, resources, `priorityClassName`, and most of the pod template—won't update existing clusters.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/http01proxy/utils.go` around lines 115 - 137, The daemonSetSpecModified comparator currently only checks a few fields and misses important pod/podTemplate changes (networking, security, resources, scheduling) — update daemonSetSpecModified to compare the full pod template spec (or explicitly include the missing fields) instead of only labels/first container basics: ensure you compare desired.Spec.Template.Spec vs fetched.Spec.Template.Spec (or at minimum include hostNetwork, tolerations, securityContext (containers[].securityContext and podSecurityContext), allowPrivilegeEscalation, capabilities, resources, volume/volumeMounts, affinity/nodeAffinity, tolerations, priorityClassName, dnsPolicy, restartPolicy, and any other container-level fields) so drift detection triggers updates when those values diverge.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/controller/http01proxy/controller.go`:
- Around line 101-109: The controller currently never watches the cluster
Infrastructure resource, so platform changes discovered by discoverPlatform()
(in pkg/controller/http01proxy/infrastructure.go) won't trigger reconciles for
HTTP01Proxy; add a Watch for the cluster Infrastructure object
(configv1.Infrastructure) to the controller builder—e.g. add
Watches(&configv1.Infrastructure{}, handler.EnqueueRequestsFromMapFunc(mapFunc),
controllerManagedResourcePredicates) (and import configv1) so changes to the
Infrastructure/cluster resource enqueue the HTTP01Proxy reconcile via the
existing mapFunc and predicates.
In `@pkg/controller/http01proxy/install_http01proxy.go`:
- Around line 18-20: validatePlatform returning unsupported currently bails out
without cleaning up already-deployed proxy resources; change the branch so that
before returning the irrecoverable error you invoke the same teardown used in
the deletion path (e.g., call the deletion/cleanup routine used for HTTP01 proxy
such as r.deleteHTTP01ProxyResources or the function(s) that remove the
DaemonSet, ServiceAccount, RBAC, NetworkPolicies and any nftables redirection),
ensure the cleanup is idempotent and ignores NotFound errors, then log and
return common.NewIrrecoverableError(fmt.Errorf("platform validation failed"),
"%s", reason) as before; keep references to validatePlatform and the
NewIrrecoverableError return path.
In `@pkg/operator/assets/bindata.go`:
- Around line 2299-2325: The ClusterRole in
bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml grants
unnecessary MCO write permissions: remove the entire rule block that targets
apiGroups: ["operator.openshift.io"] with resources: ["machineconfigurations"]
and verbs: ["update"] and the rule block that targets apiGroups:
["machineconfiguration.openshift.io"] with resources: ["machineconfigs"] and
verbs: ["get","list","create","update"]; then regenerate the bindata so
pkg/operator/assets/bindata.go is updated accordingly and verify http01-proxy
DaemonSet/service account no longer receives machineconfig write rights.
---
Duplicate comments:
In `@pkg/controller/http01proxy/utils.go`:
- Around line 115-137: The daemonSetSpecModified comparator currently only
checks a few fields and misses important pod/podTemplate changes (networking,
security, resources, scheduling) — update daemonSetSpecModified to compare the
full pod template spec (or explicitly include the missing fields) instead of
only labels/first container basics: ensure you compare
desired.Spec.Template.Spec vs fetched.Spec.Template.Spec (or at minimum include
hostNetwork, tolerations, securityContext (containers[].securityContext and
podSecurityContext), allowPrivilegeEscalation, capabilities, resources,
volume/volumeMounts, affinity/nodeAffinity, tolerations, priorityClassName,
dnsPolicy, restartPolicy, and any other container-level fields) so drift
detection triggers updates when those values diverge.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e8402571-bb33-4ef5-9165-93da089dc08e
📒 Files selected for processing (45)
- Makefile
- api/operator/v1alpha1/features.go
- api/operator/v1alpha1/http01proxy_types.go
- api/operator/v1alpha1/zz_generated.deepcopy.go
- bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-clusterrolebinding.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-daemonset.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-scc-rolebinding.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-serviceaccount.yaml
- bindata/networkpolicies/http01-proxy-allow-egress-networkpolicy.yaml
- bindata/networkpolicies/http01-proxy-deny-all-networkpolicy.yaml
- bundle/manifests/cert-manager-operator.clusterserviceversion.yaml
- config/crd/bases/operator.openshift.io_http01proxies.yaml
- config/manager/manager.yaml
- config/rbac/role.yaml
- config/samples/operator.openshift.io_v1alpha1_http01proxy.yaml
- pkg/controller/http01proxy/constants.go
- pkg/controller/http01proxy/controller.go
- pkg/controller/http01proxy/daemonsets.go
- pkg/controller/http01proxy/infrastructure.go
- pkg/controller/http01proxy/install_http01proxy.go
- pkg/controller/http01proxy/networkpolicies.go
- pkg/controller/http01proxy/rbacs.go
- pkg/controller/http01proxy/serviceaccounts.go
- pkg/controller/http01proxy/utils.go
- pkg/features/features_test.go
- pkg/operator/applyconfigurations/internal/internal.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxy.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxycustomdeploymentspec.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxyspec.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxystatus.go
- pkg/operator/applyconfigurations/utils.go
- pkg/operator/assets/bindata.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_http01proxy.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_operator_client.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/generated_expansion.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/http01proxy.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/operator_client.go
- pkg/operator/informers/externalversions/generic.go
- pkg/operator/informers/externalversions/operator/v1alpha1/http01proxy.go
- pkg/operator/informers/externalversions/operator/v1alpha1/interface.go
- pkg/operator/listers/operator/v1alpha1/expansion_generated.go
- pkg/operator/listers/operator/v1alpha1/http01proxy.go
- pkg/operator/setup_manager.go
- pkg/operator/starter.go
✅ Files skipped from review due to trivial changes (21)
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/generated_expansion.go
- pkg/operator/applyconfigurations/internal/internal.go
- bindata/networkpolicies/http01-proxy-allow-egress-networkpolicy.yaml
- config/samples/operator.openshift.io_v1alpha1_http01proxy.yaml
- bindata/networkpolicies/http01-proxy-deny-all-networkpolicy.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-clusterrolebinding.yaml
- pkg/operator/informers/externalversions/generic.go
- Makefile
- bindata/http01-proxy/cert-manager-http01-proxy-serviceaccount.yaml
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxycustomdeploymentspec.go
- bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml
- bindata/http01-proxy/cert-manager-http01-proxy-scc-rolebinding.yaml
- pkg/operator/informers/externalversions/operator/v1alpha1/interface.go
- config/rbac/role.yaml
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxystatus.go
- pkg/controller/http01proxy/networkpolicies.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxyspec.go
- pkg/operator/listers/operator/v1alpha1/http01proxy.go
- pkg/operator/informers/externalversions/operator/v1alpha1/http01proxy.go
- pkg/controller/http01proxy/daemonsets.go
- pkg/operator/applyconfigurations/operator/v1alpha1/http01proxy.go
🚧 Files skipped from review as they are similar to previous changes (14)
- pkg/operator/applyconfigurations/utils.go
- pkg/operator/listers/operator/v1alpha1/expansion_generated.go
- pkg/operator/starter.go
- config/manager/manager.yaml
- api/operator/v1alpha1/features.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_operator_client.go
- pkg/features/features_test.go
- pkg/controller/http01proxy/serviceaccounts.go
- pkg/operator/setup_manager.go
- pkg/controller/http01proxy/constants.go
- pkg/operator/clientset/versioned/typed/operator/v1alpha1/fake/fake_http01proxy.go
- bundle/manifests/cert-manager-operator.clusterserviceversion.yaml
- api/operator/v1alpha1/http01proxy_types.go
- config/crd/bases/operator.openshift.io_http01proxies.yaml
```go
	return ctrl.NewControllerManagedBy(mgr).
		For(&v1alpha1.HTTP01Proxy{}, builder.WithPredicates(predicate.GenerationChangedPredicate{})).
		Named(ControllerName).
		Watches(&appsv1.DaemonSet{}, handler.EnqueueRequestsFromMapFunc(mapFunc), withIgnoreStatusUpdatePredicates).
		Watches(&rbacv1.ClusterRole{}, handler.EnqueueRequestsFromMapFunc(mapFunc), controllerManagedResourcePredicates).
		Watches(&rbacv1.ClusterRoleBinding{}, handler.EnqueueRequestsFromMapFunc(mapFunc), controllerManagedResourcePredicates).
		Watches(&corev1.ServiceAccount{}, handler.EnqueueRequestsFromMapFunc(mapFunc), controllerManagedResourcePredicates).
		Watches(&networkingv1.NetworkPolicy{}, handler.EnqueueRequestsFromMapFunc(mapFunc), controllerManagedResourcePredicates).
		Complete(r)
```
Unsupported states won't recover when Infrastructure changes.
Platform validation depends on Infrastructure/cluster (discoverPlatform() in pkg/controller/http01proxy/infrastructure.go), but this controller never watches that object. Because validation failures are treated as irrecoverable and no dependent resources exist yet to generate events, later VIP/platform fixes won't trigger another reconcile unless the HTTP01Proxy is touched manually.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/http01proxy/controller.go` around lines 101 - 109, The
controller currently never watches the cluster Infrastructure resource, so
platform changes discovered by discoverPlatform() (in
pkg/controller/http01proxy/infrastructure.go) won't trigger reconciles for
HTTP01Proxy; add a Watch for the cluster Infrastructure object
(configv1.Infrastructure) to the controller builder—e.g. add
Watches(&configv1.Infrastructure{}, handler.EnqueueRequestsFromMapFunc(mapFunc),
controllerManagedResourcePredicates) (and import configv1) so changes to the
Infrastructure/cluster resource enqueue the HTTP01Proxy reconcile via the
existing mapFunc and predicates.
```go
	if reason := validatePlatform(info); reason != "" {
		r.log.V(1).Info("platform not supported for HTTP01 proxy", "reason", reason, "platformType", info.platformType)
		return common.NewIrrecoverableError(fmt.Errorf("platform validation failed"), "%s", reason)
```
Unsupported-platform reconciliation leaves old workloads behind.
Returning here only blocks future applies. If the proxy was already deployed, the existing DaemonSet/RBAC/NetworkPolicies/ServiceAccount remain because cleanup only happens in the deletion path, which can leave nftables redirection active after the controller has declared the platform unsupported.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/http01proxy/install_http01proxy.go` around lines 18 - 20,
validatePlatform returning unsupported currently bails out without cleaning up
already-deployed proxy resources; change the branch so that before returning the
irrecoverable error you invoke the same teardown used in the deletion path
(e.g., call the deletion/cleanup routine used for HTTP01 proxy such as
r.deleteHTTP01ProxyResources or the function(s) that remove the DaemonSet,
ServiceAccount, RBAC, NetworkPolicies and any nftables redirection), ensure the
cleanup is idempotent and ignores NotFound errors, then log and return
common.NewIrrecoverableError(fmt.Errorf("platform validation failed"), "%s",
reason) as before; keep references to validatePlatform and the
NewIrrecoverableError return path.
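The idempotent, NotFound-tolerant teardown this prompt asks for can be sketched without any Kubernetes dependencies; `notFoundErr` and the in-memory `exists` map stand in for `apierrors.IsNotFound` and real API deletes:

```go
package main

import (
	"errors"
	"fmt"
)

// notFoundErr stands in for a Kubernetes NotFound API error.
var notFoundErr = errors.New("not found")

// deleteResource simulates an API delete against an in-memory cluster.
func deleteResource(name string, exists map[string]bool) error {
	if !exists[name] {
		return notFoundErr
	}
	delete(exists, name)
	return nil
}

// cleanup tears down all managed resources, treating NotFound as success,
// so it is safe to call both on CR deletion and when the platform turns
// unsupported mid-lifecycle.
func cleanup(exists map[string]bool) error {
	for _, name := range []string{"daemonset", "serviceaccount", "clusterrole", "networkpolicy"} {
		if err := deleteResource(name, exists); err != nil && !errors.Is(err, notFoundErr) {
			return err
		}
	}
	return nil
}

func main() {
	exists := map[string]bool{"daemonset": true, "networkpolicy": true}
	fmt.Println(cleanup(exists)) // <nil>: first pass removes what is there
	fmt.Println(cleanup(exists)) // <nil>: second pass is a harmless no-op
}
```

Because the second pass succeeds too, the same routine can run before returning the irrecoverable error without risking spurious failures on already-absent objects.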
```yaml
rules:
- apiGroups:
  - config.openshift.io
  resources:
  - clusterversions
  - infrastructures
  - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - operator.openshift.io
  resources:
  - machineconfigurations
  verbs:
  - update
- apiGroups:
  - machineconfiguration.openshift.io
  resources:
  - machineconfigs
  verbs:
  - get
  - list
  - create
  - update
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
echo "=== ClusterRole source asset ==="
sed -n '1,160p' bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml
echo
echo "=== Proxy service account bindings ==="
sed -n '14,42p' pkg/controller/http01proxy/rbacs.go
echo
echo "=== MCO API references in HTTP01Proxy code ==="
rg -n -C2 'machineconfigurations|machineconfigs|MachineConfig|MachineConfiguration' \
  bindata/http01-proxy \
  pkg/controller/http01proxy || true
```

Repository: openshift/cert-manager-operator
Length of output: 3155
Remove unnecessary MCO write permissions from the HTTP01Proxy ClusterRole.
The ClusterRole grants update on operator.openshift.io/machineconfigurations and create/update on machineconfiguration.openshift.io/machineconfigs, but these MCO APIs are not referenced anywhere in the HTTP01Proxy controller code. Since this role is bound to the proxy DaemonSet's service account, every proxy pod receives unnecessary cluster-wide node-config mutation rights. Remove these rules from bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml and regenerate bindata.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/assets/bindata.go` around lines 2299 - 2325, The ClusterRole in
bindata/http01-proxy/cert-manager-http01-proxy-clusterrole.yaml grants
unnecessary MCO write permissions: remove the entire rule block that targets
apiGroups: ["operator.openshift.io"] with resources: ["machineconfigurations"]
and verbs: ["update"] and the rule block that targets apiGroups:
["machineconfiguration.openshift.io"] with resources: ["machineconfigs"] and
verbs: ["get","list","create","update"]; then regenerate the bindata so
pkg/operator/assets/bindata.go is updated accordingly and verify http01-proxy
DaemonSet/service account no longer receives machineconfig write rights.
@sebrandon1: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
Adds the HTTP01 Challenge Proxy controller to cert-manager-operator, enabling cert-manager to complete HTTP01 ACME challenges for the API endpoint (`api.cluster.example.com`) on baremetal platforms where the API VIP is not exposed via OpenShift Ingress.

This follows the enhancement proposal in openshift/enhancements#1929 and the existing IstioCSR/TrustManager controller pattern.

What it does
When a user creates an `HTTP01Proxy` CR with `mode: DefaultDeployment`, the operator:

- `API_VIP:80` is redirected to a local proxy port (default 8888)
- only `/.well-known/acme-challenge/*` requests are forwarded to the Ingress VIP — all other requests are rejected
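The path-filtering rule described above can be sketched as a plain `net/http` handler. This is an illustration of the forwarding rule only, not the actual operand, which pairs the proxy with an nftables redirect from `API_VIP:80`:

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// isChallengePath reports whether a request path belongs to an ACME
// HTTP01 challenge; only such paths may be forwarded to the Ingress VIP.
func isChallengePath(p string) bool {
	return strings.HasPrefix(p, "/.well-known/acme-challenge/")
}

// newChallengeFilter wraps next so non-challenge requests are rejected
// with 403, mirroring the "all other requests are rejected" rule.
func newChallengeFilter(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isChallengePath(r.URL.Path) {
			next.ServeHTTP(w, r)
			return
		}
		http.Error(w, "forbidden", http.StatusForbidden)
	})
}

func main() {
	fmt.Println(isChallengePath("/.well-known/acme-challenge/tok1")) // true
	fmt.Println(isChallengePath("/metrics"))                         // false
}
```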
Degraded=Truewith a descriptive message and does not deploy any workloads.New CRD
Feature Gate
`HTTP01Proxy` — Alpha, disabled by default. Enable via `--unsupported-addon-features=HTTP01Proxy=true`.

Testing
- Sets `Degraded` with "platform type None is not supported" and creates no workloads

Tracking
`quay.io/bapalm/cert-mgr-http01-proxy:latest` (multi-arch: amd64/arm64)