Commit 54e2e8b

Merge pull request #6429 from ggiguash/fix-bug-correlation
USHIFT-6764: Revert bug correlation to fuzzy logic
2 parents b99e0be + 06c1c4f commit 54e2e8b

1 file changed

Lines changed: 5 additions & 18 deletions

.claude/commands/analyze-ci/create-report.md

@@ -196,14 +196,8 @@ The HTML file must be a self-contained, single-file document with embedded CSS a
 <div class="collapsible-content">
 <div class="root-cause"><strong>Root Cause:</strong> Root cause description from summary</div>
 <!-- Bug links from bug mapping file (if available) -->
-<!-- Match issue title/error signature against ERROR_SIGNATURE values using token-overlap matching: -->
-<!-- 1. Normalize both strings: lowercase, strip punctuation, remove stopwords (a, an, the, is, in, of, to, for, and, or, with, that, this, from, on, at, by) -->
-<!-- 2. Simple stemming: strip common suffixes (-ing, -ed, -s, -tion, -ment, -ness, -ly, -er, -est) -->
-<!-- 3. Extract distinctive tokens (tool names, error codes, test IDs, paths, numeric identifiers) -->
-<!-- 4. Compute token overlap: intersection of token sets; coverage = overlap_count / min(set_a_size, set_b_size) -->
-<!-- 5. Match if: (a) exact substring match of either full string within the other (highest confidence=1.0), OR (b) ≥3 distinctive tokens overlap, OR (c) token coverage ≥60% -->
-<!-- 6. Confidence score = overlap_count / min(set_a_size, set_b_size); ties broken by highest confidence, then longest ERROR_SIGNATURE -->
-<!-- 7. Each issue matches at most one bug candidate (the highest-confidence match) -->
+<!-- Match by comparing the issue title/error signature against ERROR_SIGNATURE values in the bug mapping -->
+<!-- Use fuzzy matching: if significant keywords from the issue title appear in a bug candidate's ERROR_SIGNATURE, consider it a match -->
 <div class="bug-links">
 <span class="bug-links-label">JIRA Bugs:</span>
 <!-- For each matching JIRA duplicate (open bugs): -->
@@ -242,7 +236,8 @@ The HTML file must be a self-contained, single-file document with embedded CSS a
 <p><strong>Job:</strong> <span class="job-date">[YYYY-MM-DD]</span> <a href="JOB_URL">job-name</a></p>
 <div class="root-cause"><strong>Root Cause:</strong> Root cause from PR summary</div>
 <!-- Bug links from bug mapping file (if available for this rebase PR) -->
-<!-- Match job root cause/error description against ERROR_SIGNATURE values using the same token-overlap algorithm described in the Periodics section above -->
+<!-- Match by comparing the job's root cause/error description against ERROR_SIGNATURE values in the bug mapping -->
+<!-- Use the same fuzzy keyword matching as the Periodics tab: if significant keywords from the root cause/error description appear in a bug candidate's ERROR_SIGNATURE, consider it a match -->
 <div class="bug-links">
 <span class="bug-links-label">JIRA Bugs:</span>
 <a class="bug-tag bug-tag-open" href="https://issues.redhat.com/browse/USHIFT-XXXXX" title="Bug summary text [Status]">USHIFT-XXXXX</a>
@@ -296,15 +291,7 @@ document.querySelectorAll('.collapsible').forEach(function(el) {
 - Do NOT re-analyze or reinterpret the data — use summary file content as-is
 - Convert the plain text summary reports into HTML-formatted content, preserving all information
 - Ensure all Prow job URLs from the summaries remain clickable links in the HTML
-- **Bug Correlation**: For each issue in the TOP ISSUES section (Periodics tab) and each failed job entry (Pull Requests tab), attempt to match it against the bug candidates from the corresponding bug mapping file (`analyze-ci-bugs-<release>.txt` for releases, `analyze-ci-bugs-rebase-release-<version>.txt` for rebase PRs). Match by comparing the issue title/description or job root cause against the `ERROR_SIGNATURE` in each `--- BUG CANDIDATE ---` block using this token-overlap algorithm:
-1. **Normalize** both strings: lowercase, strip punctuation, remove stopwords (`a, an, the, is, in, of, to, for, and, or, with, that, this, from, on, at, by`)
-2. **Simple stemming**: strip common suffixes (`-ing, -ed, -s, -tion, -ment, -ness, -ly, -er, -est`)
-3. **Extract distinctive tokens**: tool names, error codes, test IDs, paths, numeric identifiers
-4. **Compute token overlap**: intersection of token sets; coverage = `overlap_count / min(set_a_size, set_b_size)`
-5. **Match criteria**: (a) exact substring match of either full string within the other (highest confidence=1.0), OR (b) ≥3 distinctive tokens overlap, OR (c) token coverage ≥60%
-6. **Confidence score** = `overlap_count / min(set_a_size, set_b_size)`; ties broken by highest confidence, then longest `ERROR_SIGNATURE`
-7. Each issue matches at most one bug candidate (the highest-confidence match)
-When a match is found:
+- **Bug Correlation**: For each issue in the TOP ISSUES section (Periodics tab) and each failed job entry (Pull Requests tab), attempt to match it against the bug candidates from the corresponding bug mapping file (`analyze-ci-bugs-<release>.txt` for releases, `analyze-ci-bugs-rebase-release-<version>.txt` for rebase PRs). Match by comparing the issue title/description or job root cause against the `ERROR_SIGNATURE` in each `--- BUG CANDIDATE ---` block — use fuzzy keyword matching (shared distinctive terms like tool names, test IDs, error codes). When a match is found:
 - Show `JIRA_DUPLICATES` as clickable links with `bug-tag-open` styling (linking to `https://issues.redhat.com/browse/<KEY>`) with the summary from `JIRA_DUPLICATE_DETAILS` as the title attribute
 - Show `JIRA_REGRESSIONS` as clickable links with `bug-tag-regression` styling (with ⟲ suffix) with the summary from `JIRA_REGRESSION_DETAILS` as the title attribute
 - If no bug mapping file exists for a release, or no candidates match an issue, show `<span class="no-bugs">No tracked bugs</span>`
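
The restored instruction leaves "fuzzy keyword matching" to the judgment of whoever (or whatever) generates the report, so there is no reference implementation in this commit. As a rough illustration only, one possible reading can be sketched in Python. The function names, the `MIN_SHARED` threshold, and the representation of each `--- BUG CANDIDATE ---` block as a dict are all assumptions made for this sketch; the stopword list is borrowed from the removed token-overlap description.

```python
import re

# Stopwords borrowed from the removed token-overlap description.
STOPWORDS = frozenset({
    "a", "an", "the", "is", "in", "of", "to", "for", "and", "or",
    "with", "that", "this", "from", "on", "at", "by",
})

# Hypothetical threshold: the prompt does not define how many shared
# keywords count as "significant".
MIN_SHARED = 2


def significant_keywords(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS and len(t) > 2}


def match_candidates(issue_title, candidates):
    """Return every candidate whose ERROR_SIGNATURE shares enough keywords.

    `candidates` is assumed to be a list of dicts, one per parsed
    `--- BUG CANDIDATE ---` block, each with an "ERROR_SIGNATURE" key.
    """
    issue_words = significant_keywords(issue_title)
    matches = []
    for candidate in candidates:
        shared = issue_words & significant_keywords(candidate["ERROR_SIGNATURE"])
        if len(shared) >= MIN_SHARED:
            matches.append(candidate)
    return matches
```

Unlike the removed algorithm, this sketch does no stemming or confidence scoring, and it does not limit an issue to a single candidate. That is looser behavior, in the spirit of the revert, though the actual matching is still left to the report generator.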
