intelliaide: add skill metadata, OWNERS, config, and README #28

Open
sakshipatels98-byte wants to merge 2 commits into openshift:main from sakshipatels98-byte:spatidar/intelliaide/skill-config

sakshipatels98-byte commented May 28, 2026

Summary

Add IntelliAide as a new agentic skill for deep root-cause analysis of OpenShift
cluster issues using pre-mounted diagnostic data.

  • IntelliAide RCA pipeline: SKILL.md orchestrates a 4-step workflow
    (extract_cluster → select_files → analyze_data → perform_rca) with
    3-pass priority analysis (High → Medium → Low). Python scripts handle
    computation only (token estimation, chunking, file I/O); the orchestrating
    agent performs all LLM reasoning.
  • PVC-based data source: diagnostic data is read from /data/input,
    mounted read-only from the Proposal's spec.dataSource PVC. No manual
    oc cp or wait step; extract_cluster.py validates the pre-populated mount
    at pipeline start.
  • Vendored Python dependencies: requirements.txt deps (requests,
    PyYAML, google-auth, drain3, odfpy, python-docx) are vendored under
    intelliaide/vendor/ for Python 3.12 so the restricted sandbox can run
    scripts without pip install at runtime.
  • Supporting assets: must-gather routing documentation (DataSource/),
    ML log/YAML classifiers, and pipeline configuration (Config/).
  • Repo hygiene: intelliaide/README.md documents the pipeline and vendor
    regeneration; intelliaide/OWNERS for Prow review assignment; YAML
    frontmatter on SKILL.md for skill discovery; simplified config.json
    (removed obsolete oc cp-related keys).
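
To make the "computation only" split concrete, here is a minimal sketch of the kind of token estimation and chunking a helper script could do before handing text to the agent. It is not taken from the PR: the 4-characters-per-token heuristic and the names `estimate_tokens` and `chunk_lines` are assumptions for illustration only.

```python
from typing import Iterator

# Assumption: a rough heuristic, not IntelliAide's actual estimator.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length, rounded up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def chunk_lines(text: str, max_tokens: int) -> Iterator[str]:
    """Yield chunks of whole lines that fit the max_tokens budget.

    A single line larger than the budget is emitted as its own chunk
    rather than being split mid-line.
    """
    current: list[str] = []
    current_tokens = 0
    for line in text.splitlines(keepends=True):
        line_tokens = estimate_tokens(line)
        # Close the current chunk before it would exceed the budget.
        if current and current_tokens + line_tokens > max_tokens:
            yield "".join(current)
            current, current_tokens = [], 0
        current.append(line)
        current_tokens += line_tokens
    if current:
        yield "".join(current)
```

Splitting on line boundaries keeps each log entry intact, and no LLM call happens here; the script only decides how much text goes into each piece.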

Test plan

  • Build skills image (Containerfile) and confirm intelliaide/ is copied to /app/skills/
  • Verify skill discovery: agent selects intelliaide for RCA/deep-analysis requests via SKILL.md frontmatter
  • Create Proposal with spec.dataSource.claimName pointing to a PVC pre-populated with must-gather data
  • Confirm /data/input is mounted and extract_cluster.py validates data without oc cp
  • Run full pipeline end-to-end: extract → select → analyze → perform_rca produces structured diagnosis output
  • Confirm vendored deps load correctly (no runtime pip install) inside the sandbox
  • Verify config.json keys match what skill scripts consume (no references to removed must_gather_incoming_dir / must_gather_wait_seconds)
  • Confirm progress annotations appear on the Proposal during pipeline execution
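
The config.json check in the list above could be automated with a small helper like the one below. This is a sketch, not code from the PR: the two removed key names come from the test plan, while `find_removed_keys` and `check_config_text` are hypothetical names, and the real layout of config.json is not assumed beyond it being valid JSON.

```python
import json
from typing import Any

# Key names taken from the test plan; everything else here is illustrative.
REMOVED_KEYS = frozenset({"must_gather_incoming_dir", "must_gather_wait_seconds"})


def find_removed_keys(node: Any, path: str = "") -> list[str]:
    """Return dotted paths of removed keys found anywhere in a parsed JSON tree."""
    hits: list[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key in REMOVED_KEYS:
                hits.append(child)
            hits.extend(find_removed_keys(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_removed_keys(item, f"{path}[{index}]"))
    return hits


def check_config_text(config_text: str) -> list[str]:
    """Parse JSON text and list any removed keys it still contains."""
    return find_removed_keys(json.loads(config_text))
```

An empty result means the config no longer mentions the obsolete keys. The same walk could be pointed at any other JSON file the skill scripts read.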

openshift-ci Bot added the do-not-merge/work-in-progress label May 28, 2026

openshift-ci Bot commented May 28, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

sakshipatels98-byte force-pushed the spatidar/intelliaide/skill-config branch from 9b0d0fa to 46eba4f May 29, 2026 07:20

openshift-ci Bot commented May 29, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sakshipatels98-byte
Once this PR has been reviewed and has the lgtm label, please assign cali0707 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sakshipatels98-byte marked this pull request as ready for review May 29, 2026 09:25
openshift-ci Bot removed the do-not-merge/work-in-progress label May 29, 2026
openshift-ci Bot requested review from Cali0707 and harche May 29, 2026 09:27

openshift-ci Bot commented May 29, 2026

@sakshipatels98-byte: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/eval | 46eba4f | link | true | /test eval |


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

"""
extract_cluster.py — Step 1 of IntelliAide pure-skills pipeline.

Reads diagnostic data from /data/input (mounted from a PVC specified in the

The PVC-based data source approach is being questioned on the companion operator PR — the sandbox pods are ephemeral and the agent can collect must-gather data itself during its run using /tmp. See this comment.

If that feedback lands, this script would need to collect the data instead of assuming it's pre-mounted at /data/input.
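
One way the script could cope with either outcome is to test whether the mount is actually populated and report which path to take. The sketch below is hypothetical and not from the PR: `input_is_populated` and `choose_data_source` are illustrative names, and the collection step itself is left out because its design is still under discussion.

```python
from pathlib import Path


def input_is_populated(input_dir: Path) -> bool:
    """True if input_dir is a directory holding at least one regular file, at any depth."""
    if not input_dir.is_dir():
        return False
    return any(p.is_file() for p in input_dir.rglob("*"))


def choose_data_source(mount: Path = Path("/data/input")) -> str:
    """Return "mounted" when the pre-populated mount is usable, else "collect".

    "collect" only signals that the caller must gather the data itself
    (for example into a scratch directory); this sketch does not do that.
    """
    return "mounted" if input_is_populated(mount) else "collect"
```

Keeping the decision in one function would let the rest of the pipeline stay unchanged whichever data source the operator PR settles on.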

harche commented May 29, 2026

This PR doesn't include evals. The repo has an eval framework under evals/skills/ — each existing skill has a directory with test_cases.yaml and system_prompt.md. Could you add evals/skills/intelliaide/ with test cases covering the pipeline?

It would be especially valuable to include negative cases that demonstrate why this skill is needed — e.g., what happens when the agent tries to analyze a large must-gather bundle without the IntelliAide pipeline (buffer overflow from raw file reads, shallow diagnosis that misses root cause, etc.). Those cases help justify the skill's existence and guard against regressions if someone later simplifies the flow.

Comment thread intelliaide/OWNERS
@@ -0,0 +1,6 @@
# See the OWNERS docs: https://git.k8s.io/community/contributors/guide/owners.md

approvers:

Individual teams are responsible for the ownership of the skill.
