Skip to content

CNF-23534: Add automated OpenAPI schema generation for PolicyGenerator list merging#787

Open
irinamihai wants to merge 3 commits into
openshift-kni:mainfrom
irinamihai:extend-openapi-schema
Open

CNF-23534: Add automated OpenAPI schema generation for PolicyGenerator list merging#787
irinamihai wants to merge 3 commits into
openshift-kni:mainfrom
irinamihai:extend-openapi-schema

Conversation

@irinamihai

Copy link
Copy Markdown
Contributor

Add automated OpenAPI schema generation for PolicyGenerator list merging

Two-step pipeline to dynamically generate schema.openapi files from upstream CRDs, replacing the previously hand-crafted schemas:

  1. generate-schema-config.py scans Subscription CRs to update crd-schema-config.json with current channels and git refs
  2. extract-schema.py downloads CRDs and extracts minimal merge directive schemas for ACM PolicyGenerator

Adds nightly GitHub Actions workflow to auto-PR when schemas drift.

CNF-23534: Add an openapi schema for all the lists in the reference configuration

Description:

Add automated OpenAPI schema generation for PolicyGenerator list merging

  Two-step pipeline to dynamically generate schema.openapi files from
  upstream CRDs, replacing the previously hand-crafted schemas:

  1. generate-schema-config.py scans Subscription CRs to update
     crd-schema-config.json with current channels and git refs
  2. extract-schema.py downloads CRDs and extracts minimal merge
     directive schemas for ACM PolicyGenerator

  Adds nightly GitHub Actions workflow to auto-PR when schemas drift.
@openshift-ci openshift-ci Bot requested review from sabbir-47 and yuvalk May 26, 2026 21:16
@openshift-ci

openshift-ci Bot commented May 26, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: irinamihai

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 26, 2026
@irinamihai irinamihai changed the title CNF-23534 CNF-23534: Add automated OpenAPI schema generation for PolicyGenerator list merging May 26, 2026
@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 55487316-923c-4f5a-a215-3a0796516dd3

📥 Commits

Reviewing files that changed from the base of the PR and between 21ddbd7 and b549584.

📒 Files selected for processing (2)
  • Makefile
  • telco-ran/configuration/argocd/example/acmpolicygenerator/hub-side-templating/schema.openapi

📝 Walkthrough

Walkthrough

Adds tooling and CI to generate ACM PolicyGenerator OpenAPI merge-directive schemas: a JSON CRD config, Python CLIs to extract and process CRD schemas, Makefile targets to orchestrate generation/checking, generated schema.openapi artifacts for telco components, documentation, and a scheduled workflow that opens automated PRs.

Changes

OpenAPI Schema Generation Infrastructure

Layer / File(s) Summary
Schema extraction and configuration generation tools
hack/extract-schema.py, hack/generate-schema-config.py
extract-schema.py loads CRDs from GitHub/cluster/file, strips selected x-kubernetes-* keys, converts list-type maps into patch-merge directives, injects configured merge keys, prunes to minimal directive-bearing subtrees, and emits stable JSON definitions. generate-schema-config.py scans Subscriptions, detects OCP version, and updates hack/crd-schema-config.json with subscription channels and derived GitHub refs.
Schema configuration and build target wiring
hack/crd-schema-config.json, Makefile
hack/crd-schema-config.json centralizes CRD source refs, merge key paths, and subscription associations. Makefile adds generate-schema-config, generate-openapi-schemas, and check-openapi-schemas targets to regenerate configs/schemas and enforce committed outputs.
Generated OpenAPI schema artifacts
telco-core/configuration/schema.openapi, telco-ran/configuration/argocd/example/acmpolicygenerator/schema.openapi, telco-ran/.../hub-side-templating/schema.openapi
Generated schema.openapi files contain OpenShift CRD definitions annotated with x-kubernetes-patch-strategy/x-kubernetes-patch-merge-key for PolicyGenerator strategic-merge behavior; includes updated PtpConfig, ClusterLogForwarder, PerformanceProfile, ImageDigestMirrorSet entries, and a new Tuned definition.
User-facing schema documentation
telco-ran/configuration/argocd/example/acmpolicygenerator/README.md
Explains default list replacement vs keyed strategic-merge when referencing schema.openapi, how merge directives are encoded, and documents commands/workflow for generating and updating schema artifacts across OpenShift/operator releases.
CI automation and version update procedures
.github/workflows/openapi-schema-check.yml, VERSION_UPDATE_GUIDE.md
Adds a scheduled/manual GitHub Actions workflow that runs make generate-openapi-schemas, installs pinned dependencies, and uses peter-evans/create-pull-request to open an automated PR to update-openapi-schemas. The version guide includes steps to regenerate schemas and guidance on ref derivation and mirror registry configuration.

Sequence Diagram(s)

sequenceDiagram
  actor Developer
  participant Make
  participant GenerateConfig
  participant ExtractSchema
  participant Git
  participant Workflow

  Developer->>Make: make generate-openapi-schemas
  Make->>GenerateConfig: run hack/generate-schema-config.py
  GenerateConfig->>GenerateConfig: scan subscriptions / detect OCP version
  GenerateConfig->>Make: write updated hack/crd-schema-config.json
  Make->>ExtractSchema: run hack/extract-schema.py (telco-ran)
  ExtractSchema->>ExtractSchema: fetch CRDs, convert, inject merge keys, prune
  ExtractSchema->>Git: write telco-ran schema.openapi
  Make->>ExtractSchema: run hack/extract-schema.py (telco-core)
  ExtractSchema->>Git: write telco-core schema.openapi

  Workflow->>Make: (daily) make generate-openapi-schemas
  Workflow->>Git: git diff --exit-code
  Workflow->>Workflow: create PR if schemas changed
Loading

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: automated OpenAPI schema generation for PolicyGenerator list merging, which is the core feature introduced across all modified files.
Description check ✅ Passed The description is directly related to the changeset, clearly explaining the two-step pipeline (generate-schema-config.py and extract-schema.py) and the nightly GitHub Actions workflow that align with the actual changes made.
Docstring Coverage ✅ Passed Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci-robot

openshift-ci-robot commented May 26, 2026

Copy link
Copy Markdown
Collaborator

@irinamihai: This pull request references CNF-23534 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Add automated OpenAPI schema generation for PolicyGenerator list merging

Two-step pipeline to dynamically generate schema.openapi files from upstream CRDs, replacing the previously hand-crafted schemas:

  1. generate-schema-config.py scans Subscription CRs to update crd-schema-config.json with current channels and git refs
  2. extract-schema.py downloads CRDs and extracts minimal merge directive schemas for ACM PolicyGenerator

Adds nightly GitHub Actions workflow to auto-PR when schemas drift.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@irinamihai

Copy link
Copy Markdown
Contributor Author

/hold for further testing

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 26, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
.github/workflows/openapi-schema-check.yml (1)

25-25: ⚡ Quick win

Pin PyYAML to a specific version for reproducibility.

Installing PyYAML without a version constraint means the workflow will fetch whatever version is latest at runtime. If PyYAML releases breaking changes, the workflow could fail unpredictably.

📌 Proposed fix to pin PyYAML version
-          pip install PyYAML
+          pip install PyYAML==6.0.1

Update the version number as needed to match the version tested with your scripts.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/openapi-schema-check.yml at line 25, Replace the unpinned
installer command "pip install PyYAML" in the GitHub Actions job with a pinned
version (e.g., "pip install PyYAML==<version>") to ensure reproducible runs;
pick and lock the exact PyYAML version you tested against and update that
version string in the workflow step that currently runs pip install PyYAML.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/openapi-schema-check.yml:
- Line 19: Replace semantic version tags for GitHub Actions with pinned commit
SHAs to satisfy supply-chain requirements: update usages of actions/checkout@v4
and peter-evans/create-pull-request@v7 to their corresponding full commit SHAs
(e.g., actions/checkout@<sha> and peter-evans/create-pull-request@<sha>), fetch
the latest stable commit SHAs from each action's repository first, and commit
the workflow changes so the workflow references the fixed SHAs instead of the
version tags.
- Around line 23-24: The workflow currently downloads yq_linux_amd64 and makes
it executable without integrity checks; update the steps around the curl and
chmod so you first fetch or define an expected checksum (e.g.,
YQ_CHECKSUM/YQ_VERSION), verify the downloaded file's SHA256 (or other selected
hash) against that expected value using a verification tool (sha256sum or shasum
-a 256) and only call chmod +x /usr/local/bin/yq and move it into the PATH if
the checksum matches; ensure the job fails if verification fails and document
updating YQ_VERSION and YQ_CHECKSUM to the intended release.

In `@hack/extract-schema.py`:
- Around line 236-240: The branch that handles "x-kubernetes-patch-strategy" can
KeyError on schema["x-kubernetes-patch-merge-key"]; update the logic in the
block that builds result (the code using the local variable schema and creating
result) to read the merge key via schema.get("x-kubernetes-patch-merge-key") and
only include the "x-kubernetes-patch-merge-key" entry in the result when that
get() returns a non-None value (keep the "x-kubernetes-patch-strategy" entry as
before and preserve the default for "type").

In `@hack/generate-schema-config.py`:
- Around line 66-70: The aggregation logic that populates results[pkg_name] with
"channel" and "components" silently ignores when the same spec.name (pkg_name)
appears with a different channel; update the block that checks pkg_name and
channel (where results is built and
results[pkg_name]["components"].add(component) is called) to detect if
results[pkg_name]["channel"] already exists and differs from the current
channel, and on that conflict raise an error (or exit non‑zero) with a clear
message referencing the spec.name/pkg_name and both channels; do not merge
components across differing channels.

---

Nitpick comments:
In @.github/workflows/openapi-schema-check.yml:
- Line 25: Replace the unpinned installer command "pip install PyYAML" in the
GitHub Actions job with a pinned version (e.g., "pip install PyYAML==<version>")
to ensure reproducible runs; pick and lock the exact PyYAML version you tested
against and update that version string in the workflow step that currently runs
pip install PyYAML.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 00891ba2-3175-4320-88cf-d34e51e42032

📥 Commits

Reviewing files that changed from the base of the PR and between abd1ee2 and ea70e61.

📒 Files selected for processing (11)
  • .github/workflows/openapi-schema-check.yml
  • Makefile
  • VERSION_UPDATE_GUIDE.md
  • hack/crd-schema-config.json
  • hack/extract-schema.py
  • hack/generate-schema-config.py
  • telco-core/configuration/schema.openapi
  • telco-ran/configuration/argocd/example/acmpolicygenerator/README.md
  • telco-ran/configuration/argocd/example/acmpolicygenerator/hub-side-templating/schema.openapi
  • telco-ran/configuration/argocd/example/acmpolicygenerator/hub-side-templating/schema.openapi
  • telco-ran/configuration/argocd/example/acmpolicygenerator/schema.openapi

Comment thread .github/workflows/openapi-schema-check.yml Outdated
Comment thread .github/workflows/openapi-schema-check.yml Outdated
Comment thread hack/extract-schema.py
Comment on lines +66 to +70
if pkg_name and channel:
if pkg_name not in results:
results[pkg_name] = {"channel": channel, "components": set()}
results[pkg_name]["components"].add(component)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Detect and fail on cross-component channel conflicts per package.

If the same spec.name is found with different channels, current logic silently keeps the first channel and merges components, which can generate wrong refs and schemas.

Proposed fix
             if pkg_name and channel:
                 if pkg_name not in results:
                     results[pkg_name] = {"channel": channel, "components": set()}
+                elif results[pkg_name]["channel"] != channel:
+                    raise RuntimeError(
+                        f"Conflicting channels for package '{pkg_name}': "
+                        f"'{results[pkg_name]['channel']}' vs '{channel}'"
+                    )
                 results[pkg_name]["components"].add(component)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hack/generate-schema-config.py` around lines 66 - 70, The aggregation logic
that populates results[pkg_name] with "channel" and "components" silently
ignores when the same spec.name (pkg_name) appears with a different channel;
update the block that checks pkg_name and channel (where results is built and
results[pkg_name]["components"].add(component) is called) to detect if
results[pkg_name]["channel"] already exists and differs from the current
channel, and on that conflict raise an error (or exit non‑zero) with a clear
message referencing the spec.name/pkg_name and both channels; do not merge
components across differing channels.

Description:
- check openapi schema workflow - security hardening
- warning for channel clash across core vs ran
- README updates
@irinamihai

Copy link
Copy Markdown
Contributor Author

/unhold

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 27, 2026
Comment on lines +10 to +11
contents: write
pull-requests: write

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there finer grained permissions that can be granted. These are pretty broad.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are already the minimum required permissions for peter-evans/create-pull-request - contents: write to push the commit and pull-requests: write to create the PR. This matches the permissions used by doc-updater.yml in this repo. There are no finer-grained alternatives for this use case where we want to automatically create PRs.

What restrictions/rules did you have in mind?

YQ_VERSION="v4.53.2"
YQ_SHA256="d56bf5c6819e8e696340c312bd70f849dc1678a7cda9c2ad63eebd906371d56b"
curl -sL "https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64" -o /usr/local/bin/yq
echo "${YQ_SHA256} /usr/local/bin/yq" | sha256sum -c -

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will PolicyGenerator follow the redirect path?
Also nit: newline

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PolicyGenerator rejects symlinks that resolve outside the directory tree. The file is kept as a regular copy of the parent schema.openapi. The following work to restructure the directories to resemble telco-core plus the removal of the non templated PolicyGenerator examples will consolidate this so there's a single schema.openapi co-located with the PolicyGenerator CRs.
For now, I'm going to leave the full content in this file and update the Makefile to copy the generated schema in here.

Comment on lines +4 to +7
"Static fields (source, merge_keys, version, package_name, ref_rule)",
"are manually maintained.",
"Dynamic fields (subscription_channel, components, source.github.ref)",
"are updated by: python3 hack/generate-schema-config.py",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to separate the static and dynamic/derived data into separate files? Perhaps yaml file listing the static (human generated) data.
Do we need persistence of the dynamic data, or is it for archival/informational purposes (ie a log of what was found)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dynamic fields (subscription_channel, components, source.github.ref) are needed by extract-schema.py to know which GitHub branch to fetch CRDs from - they're not just informational. generate-schema-config.py updates them in-place while preserving the static fields (source.github.{owner,repo,path},merge_keys, version, ref_rule).

Splitting into two files would add complexity to the merge/read logic, since the round-trip update preserves static fields and git diff shows exactly what changed. The _comment block at the top documents which fields are static vs dynamic.

We can revisit if the config grows significantly, but for 6 CRD entries the single file keeps things simple.
WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants