feat(ci): smart PR labeler with type, size, module, complexity detection #148

Open
Yuvraj-Sarathe wants to merge 3 commits into KDM-cli:main from Yuvraj-Sarathe:PR-Labeler
@Yuvraj-Sarathe Yuvraj-Sarathe commented Jun 8, 2026

Member
  • Add prLabels config to kdm-automation.json with label definitions and module path mappings
  • Rewrite pr-labeler.cjs: title-based type detection, size calculation from additions/deletions, module detection from file paths, complexity scoring
  • Refactor pr-labeler.cjs to use shared helpers (buildBotContext, addLabels)
  • Add validatePrLabels() to config-loader.cjs for config validation
  • Fix helper require() calls to use .cjs extension (Node.js v24 compat)
  • Add LABELS.md documenting the complete label system

Summary by CodeRabbit

Release Notes

  • New Features

    • Implemented intelligent PR labeling that automatically categorizes pull requests by type (from title patterns), size (based on line changes), affected modules (from file paths), and complexity (using heuristics).
  • Documentation

    • Added comprehensive guide documenting the PR label taxonomy and labeling rules.
  • Chores

    • Updated GitHub automation configuration to support the new labeling system.

github-actions Bot added the documentation label (Improvements or additions to documentation) on Jun 8, 2026
coderabbitai Bot commented Jun 8, 2026

Contributor

Review Change Stack

Warning

Review limit reached

@Yuvraj-Sarathe, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 39 minutes and 17 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 73f46548-ee38-46e0-9af0-a0f473f0be25

📥 Commits

Reviewing files that changed from the base of the PR and between 5309ff8 and db3e932.

📒 Files selected for processing (8)
  • .github/LABELS.md
  • .github/scripts/helpers/api.cjs
  • .github/scripts/helpers/checks.cjs
  • .github/scripts/helpers/comments.cjs
  • .github/scripts/helpers/config-loader.cjs
  • .github/scripts/helpers/constants.cjs
  • .github/scripts/helpers/index.cjs
  • .github/scripts/pr-labeler.cjs

Warning

.coderabbit.yaml has a parsing error

The CodeRabbit configuration file in this repository has a parsing error and default settings were used instead. Please fix the error(s) in the configuration file. You can initialize chat with CodeRabbit to get help with the configuration file.

💥 Parsing errors (2)
Validation error: Invalid input: expected string, received undefined at "reviews.path_instructions[3].path"; Invalid input: expected string, received undefined at "reviews.path_instructions[3].instructions"
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
📝 Walkthrough

This PR implements a configurable pull-request auto-labeler that derives labels from PR title patterns, changed file modules, total line changes, and complexity heuristics. It replaces fixed path-based labeling with a configuration-driven system loaded from kdm-automation.json, adds validation for the label schema, updates helper module imports to use .cjs extensions, and rewrites the core labeler logic to detect and apply multi-factor labels.
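
To illustrate the title-based part of that flow, here is a minimal sketch of how a type label could be derived from a Conventional Commits style title such as this PR's own `feat(ci): ...`. The `TYPE_LABELS` table, the label strings, and `TITLE_PATTERN` are assumptions made for illustration; the actual patterns are defined in pr-labeler.cjs and kdm-automation.json and may differ.

```javascript
// Hypothetical prefix-to-label table; not the PR's actual configuration.
const TYPE_LABELS = {
  feat: 'type: feature',
  fix: 'type: bug',
  docs: 'type: docs',
  chore: 'type: chore',
};

// Matches "feat: ...", "fix(ci): ...", or "feat(api)!: ..." at the start of a title.
const TITLE_PATTERN = /^(\w+)(?:\([^)]*\))?!?:\s/;

// Returns the label for a recognized prefix, or null when nothing matches.
function detectType(title) {
  const match = TITLE_PATTERN.exec(String(title).trim());
  if (!match) return null;
  const prefix = match[1].toLowerCase();
  // Object.hasOwn avoids false hits on inherited keys such as "constructor".
  return Object.hasOwn(TYPE_LABELS, prefix) ? TYPE_LABELS[prefix] : null;
}
```

Returning `null` for unrecognized titles lets the caller skip the type label instead of guessing one.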

Changes

PR Auto-Labeler System

| Layer / File(s) | Summary |
|---|---|
| **Label taxonomy configuration and documentation**<br>.github/kdm-automation.json, .github/LABELS.md | The prLabels configuration section maps label values for type, size, module, and complexity categories, plus path-to-module glob assignments. LABELS.md documents how the labeler derives each label type using thresholds and heuristics for reviewer reference. |
| **Label configuration validation**<br>.github/scripts/helpers/config-loader.cjs | Adds validatePrLabels() to validate the optional prLabels config structure, including type/size/module/complexity shape checks, required string-prefix formats, and cross-referencing of modulePaths values against known modules. |
| **Helper module extension consistency**<br>.github/scripts/helpers/api.cjs, .github/scripts/helpers/checks.cjs, .github/scripts/helpers/comments.cjs, .github/scripts/helpers/constants.cjs, .github/scripts/helpers/index.cjs | All local module imports updated to explicitly require .cjs-suffixed file extensions instead of extensionless paths for consistent module resolution. |
| **PR labeler detection and determination functions**<br>.github/scripts/pr-labeler.cjs | Rewritten from fixed path-based rules into a configurable system. New functions: detectType() classifies PR type from title patterns; determineSize() maps total changes to size buckets; detectModules() matches changed files against configured glob patterns; matchGlobPattern() provides custom glob-to-regex matching; calculateComplexity() computes a heuristic score; determineComplexity() maps scores to complexity tiers. The main labelPR() function loads config, fetches PR details via GitHub API, applies all detection functions, collects labels, and applies them via addLabels(). |
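
The glob-to-regex matching and module detection summarized above could look roughly like the sketch below. It is an illustration only: `globToRegExp` is a hypothetical stand-in for the PR's matchGlobPattern(), it supports just `*` and `**`, and it assumes `modulePaths` maps a glob pattern to a module name and that each file object carries a `filename` field, as in the GitHub pull request files response.

```javascript
// Convert a simple glob ("src/**", "docs/*.md") into an anchored RegExp.
// Supports only "*" (within one path segment) and "**" (across segments).
function globToRegExp(glob) {
  let source = '';
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === '*') {
      if (glob[i + 1] === '*') {
        source += '.*';
        i++; // consume the second "*"
      } else {
        source += '[^/]*';
      }
    } else {
      // Escape characters that are special in a regular expression.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp(`^${source}$`);
}

// Return the deduplicated module names whose pattern matches any changed file.
function detectModules(files, modulePaths) {
  const matched = new Set();
  for (const [pattern, moduleName] of Object.entries(modulePaths)) {
    const regex = globToRegExp(pattern);
    if (files.some((file) => regex.test(file.filename))) matched.add(moduleName);
  }
  return [...matched];
}
```

With a mapping such as `{ 'src/**': 'cli', 'src/db/**': 'db' }` (module names invented here), a change to `src/db/pool.js` matches both patterns, which is the catch-all behavior discussed in the review comments.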

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • KDM-cli/kdm-cli#33: Updates workflow to run ./.github/scripts/pr-labeler.cjs directly, which is the labeler being rewritten in this PR.
  • KDM-cli/kdm-cli#32: Switches workflow to run a JS labeler script, setting up the execution context for the labeler logic implemented in this PR.

Suggested labels

ci/cd

Suggested reviewers

  • utkarsh232005
  • codescene-delta-analysis

Poem

🐰 A labeler rabbit hops through PR tides,
Sniffing titles, files, and code-change sizes,
Complexity scores and module guides,
Smart tags applied—automation surprises! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: a rewrite of pr-labeler.cjs implementing smart PR labeling with type, size, module, and complexity detection based on configuration. It is concise, clear, and directly reflects the substantial work in the PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@Yuvraj-Sarathe

Member Author

Solves Issue #148

codescene-delta-analysis[bot]

This comment was marked as outdated.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 7

🧹 Nitpick comments (2)
.github/scripts/helpers/config-loader.cjs (1)

318-320: 💤 Low value

Use the `in` operator for more robust property checking.

The condition !pr.module[moduleName] uses a truthy check, which could incorrectly flag keys with falsy values (though the label format validation prevents empty strings). Using !(moduleName in pr.module) is more explicit and idiomatic for checking property existence.

♻️ Proposed refactor
-      if (typeof moduleName === 'string' && pr.module && !pr.module[moduleName]) {
+      if (typeof moduleName === 'string' && pr.module && !(moduleName in pr.module)) {
         errors.push(`prLabels.modulePaths["${pattern}"] references unknown module "${moduleName}"`);
       }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/scripts/helpers/config-loader.cjs around lines 318 - 320, Replace
the truthy property check used in the condition that validates moduleName
existence and use the `in` operator instead: in the block where you check `if
(typeof moduleName === 'string' && pr.module && !pr.module[moduleName])` change
the existence test to `!(moduleName in pr.module)` so the condition becomes `if
(typeof moduleName === 'string' && pr.module && !(moduleName in pr.module))`;
keep the same error push using `pattern` and `moduleName`.
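
The difference between the two checks is easiest to see on a key that exists but holds a falsy value. The sketch below uses an invented `moduleLabels` object for illustration. It also shows `Object.hasOwn`, which is not part of the suggestion above but is worth knowing because `in` also reports inherited keys such as `toString`.

```javascript
// Hypothetical module map; "core" is present but has an empty (falsy) label.
const moduleLabels = { cli: 'module: cli', core: '' };

// Truthy check: treats a present key with a falsy value as missing.
const truthyKnows = (name) => Boolean(moduleLabels[name]);

// `in` check: true whenever the key exists, including inherited keys.
const inKnows = (name) => name in moduleLabels;

// Object.hasOwn: true only for the object's own keys.
const ownKnows = (name) => Object.hasOwn(moduleLabels, name);

const comparison = ['cli', 'core', 'toString', 'missing'].map((name) => ({
  name,
  truthy: truthyKnows(name),
  in: inKnows(name),
  own: ownKnows(name),
}));
```

For a config object parsed from JSON, a module named after an `Object.prototype` member is unlikely, so the choice between `in` and `Object.hasOwn` is mostly a matter of strictness.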
.github/kdm-automation.json (1)

127-127: src/** → module: cli fallback is intentional (and documented)

  • detectModules collects all matching prLabels.modulePaths entries (deduped), so any changed src/** file will add module: cli in addition to any more specific module matches.
  • This “fallback” is explicitly documented in .github/LABELS.md under module: cli (src/** (fallback)).
  • Because matchedModules.length drives both the complexity score and the multi-module indicator, the catch-all cli can increase those counts; consider making the fallback apply only when no other non-cli module matched if label noise becomes a concern.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/kdm-automation.json at line 127, detectModules currently adds the
catch-all "module: cli" for any src/** change which inflates matchedModules and
multi-module/complexity metrics; update the detection logic in detectModules
(which reads prLabels.modulePaths and builds matchedModules) to dedupe as it
does now but then, if matchedModules contains more than one entry and includes
the "cli" module, remove "cli" so it only remains when it is the sole match;
ensure the same adjusted matchedModules is used downstream for complexity
scoring and multi-module determination.
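
One way to apply the suggestion is a small post-processing step on the matched list. This is a sketch under assumptions: `dropFallback` is a hypothetical helper name, and it assumes the matched modules are plain module-name strings with `cli` as the catch-all.

```javascript
// Keep the catch-all module only when no more specific module matched.
function dropFallback(matchedModules, fallback = 'cli') {
  const unique = [...new Set(matchedModules)];
  const specific = unique.filter((name) => name !== fallback);
  return specific.length > 0 ? specific : unique;
}
```

Calling it once, right after module detection, means the complexity score and the multi-module indicator both see the same filtered list.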
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/LABELS.md:
- Line 4: The relative path to the pr-labeler script in LABELS.md is wrong;
update the two occurrences of "../.github/scripts/pr-labeler.cjs" to
"scripts/pr-labeler.cjs" so the link correctly points from LABELS.md to
pr-labeler.cjs under the same .github directory (apply the change at both
locations referenced in the diff).

In @.github/scripts/helpers/config-loader.cjs:
- Around line 297-310: The complexity validation loop currently checks only
label; extend it to validate entry.maxScore: for keys 'easy' and 'medium' ensure
entry.maxScore is a finite number (typeof === 'number' and isFinite) and for
'complex' allow either a finite number or null (explicitly check === null),
pushing descriptive errors to errors (e.g.
prLabels.complexity["${key}"].maxScore must be a number or null) when the check
fails; reference the pr.complexity object, the compKeys loop and the
determineComplexity() usage so the validation prevents runtime errors.
- Around line 297-310: The complexity validation currently checks only that each
complexity entry has a label but doesn't enforce ascending thresholds required
by determineComplexity(); update the validation in the pr.complexity block to
verify that entry.maxScore for 'easy' and 'medium' are numbers, that
easy.maxScore < medium.maxScore, and that complex.maxScore is null (or omitted)
— push corresponding error messages into errors when types or ordering violate
this (reference pr.complexity, compKeys, each entry, and determineComplexity()
to locate where to add the checks).
- Around line 273-286: The size label validation in the pr.size loop (inside
config-loader.cjs) currently only checks entry.label but not entry.maxChanges,
so add validation that entry.maxChanges is a finite number for keys
'xs','s','m','l' and is either a finite number or null for 'xl' (or whatever key
you expect to allow null), and push an error to errors with a clear message when
the value is invalid; update the loop that iterates sizeKeys (and reference
pr.size[key] / entry.maxChanges) to perform typeof/Number.isFinite checks (or
allow null for the XL case) so determineSize() numeric comparisons won't receive
non-numeric types.
- Around line 273-286: Add validation to the existing pr.size block to ensure
numeric thresholds are strictly ascending to match determineSize's assumptions:
after reading entries for sizeKeys (xs, s, m, l, xl) check that for keys
xs→s→m→l each entry.maxChanges is a finite number and strictly less than the
next key's maxChanges, and require xl.maxChanges to be null; push descriptive
errors to errors (e.g., `prLabels.size["xs"].maxChanges must be <
prLabels.size["s"].maxChanges`) when any check fails. Use the same symbols found
in the diff (pr.size, sizeKeys, entry, entry.maxChanges) so the validation sits
next to the existing label checks.
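
The two size findings above can be covered by one check. The sketch below is an illustration, not the code in config-loader.cjs: `validateSizeThresholds` is a hypothetical name, it assumes each tier looks like `{ label, maxChanges }`, and it follows the stricter reading in which `xl.maxChanges` must be `null`.

```javascript
const SIZE_KEYS = ['xs', 's', 'm', 'l', 'xl'];

// Returns a list of error strings; an empty list means the thresholds are valid.
function validateSizeThresholds(size) {
  const errors = [];
  const bounded = SIZE_KEYS.slice(0, -1); // every tier except the open-ended "xl"

  // Each bounded tier needs a finite numeric threshold.
  for (const key of bounded) {
    if (!Number.isFinite(size[key]?.maxChanges)) {
      errors.push(`prLabels.size["${key}"].maxChanges must be a finite number`);
    }
  }

  // The last tier is open-ended, so its threshold must be null.
  if (size.xl?.maxChanges !== null) {
    errors.push('prLabels.size["xl"].maxChanges must be null');
  }

  // Thresholds must be strictly ascending across adjacent bounded tiers.
  for (let i = 0; i < bounded.length - 1; i++) {
    const [lower, upper] = [bounded[i], bounded[i + 1]];
    const a = size[lower]?.maxChanges;
    const b = size[upper]?.maxChanges;
    if (Number.isFinite(a) && Number.isFinite(b) && a >= b) {
      errors.push(`prLabels.size["${lower}"].maxChanges must be < prLabels.size["${upper}"].maxChanges`);
    }
  }
  return errors;
}
```

The ordering check skips pairs where either value already failed the type check, so a single bad value produces one error instead of a cascade. The same pattern would apply to `complexity` with `maxScore`.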

In @.github/scripts/helpers/index.cjs:
- Around line 8-13: The helper index currently requires modules without the .cjs
extension (constants, logger, validation, api, checks, comments), which fails
under Node’s extensionless resolution for this folder; update each require call
in the helpers index (the statements that reference constants, logger,
validation, api, checks, comments) to append “.cjs” (e.g.,
require('./constants.cjs'), require('./logger.cjs'), etc.), and make equivalent
fixes in other scripts mentioned (bot-inactivity-comments.cjs,
bot-on-pr-merged.cjs, bot-inactivity.cjs) so every require('./helpers...') or
require('./helpers/<name>') uses the .cjs filename explicitly.
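
The underlying behavior can be reproduced in isolation. For an extensionless request, Node's CommonJS resolver tries the exact path and then the `.js`, `.json`, and `.node` extensions, so a file that exists only as `.cjs` is not found. The sketch below uses a throwaway temp directory; the file name and `tryRequire` helper are invented for the demonstration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Create a throwaway directory that contains only "constants.cjs".
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cjs-demo-'));
fs.writeFileSync(path.join(dir, 'constants.cjs'), 'module.exports = { ok: true };');

// Return the loaded module, or the error code if resolution fails.
function tryRequire(request) {
  try {
    return require(path.join(dir, request));
  } catch (err) {
    return err.code;
  }
}

const withoutExtension = tryRequire('constants');   // resolution fails
const withExtension = tryRequire('constants.cjs');  // loads the module
```

Here `withoutExtension` holds the `MODULE_NOT_FOUND` error code, while `withExtension` holds the exported object.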

In @.github/scripts/pr-labeler.cjs:
- Around line 224-230: The PR file listing only requests one page (per_page:
100) using github.rest.pulls.listFiles, so files is incomplete for large PRs and
breaks downstream detectModules(files, ...) and
calculateComplexity(files.length, ...); update the code to paginate through all
pages (e.g., use github.paginate or loop over pages calling
github.rest.pulls.listFiles with page) to accumulate all file entries into files
before calling detectModules and calculateComplexity, ensuring resp and files
are merged/concatenated across pages and handling the full result set.
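
A manual page loop is one way to implement this. The sketch below assumes `github` is the Octokit client that actions/github-script provides; `listAllFiles` is a hypothetical helper name. The `github.paginate` helper mentioned above would be a shorter alternative.

```javascript
// Collect every changed file by requesting pages until a short page is returned.
async function listAllFiles(github, { owner, repo, pull_number }) {
  const perPage = 100;
  const files = [];
  for (let page = 1; ; page++) {
    const { data } = await github.rest.pulls.listFiles({
      owner,
      repo,
      pull_number,
      per_page: perPage,
      page,
    });
    files.push(...data);
    // A page shorter than perPage (including an empty one) is the last page.
    if (data.length < perPage) break;
  }
  return files;
}
```

The accumulated array can then be passed to detectModules() and its length to calculateComplexity(), so both see the full file set.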

---

Nitpick comments:
In @.github/kdm-automation.json:
- Line 127: detectModules currently adds the catch-all "module: cli" for any
src/** change which inflates matchedModules and multi-module/complexity metrics;
update the detection logic in detectModules (which reads prLabels.modulePaths
and builds matchedModules) to dedupe as it does now but then, if matchedModules
contains more than one entry and includes the "cli" module, remove "cli" so it
only remains when it is the sole match; ensure the same adjusted matchedModules
is used downstream for complexity scoring and multi-module determination.

In @.github/scripts/helpers/config-loader.cjs:
- Around line 318-320: Replace the truthy property check used in the condition
that validates moduleName existence and use the `in` operator instead: in the
block where you check `if (typeof moduleName === 'string' && pr.module &&
!pr.module[moduleName])` change the existence test to `!(moduleName in
pr.module)` so the condition becomes `if (typeof moduleName === 'string' &&
pr.module && !(moduleName in pr.module))`; keep the same error push using
`pattern` and `moduleName`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 11dce551-bd95-47fd-90df-a7d67dc6927f

📥 Commits

Reviewing files that changed from the base of the PR and between 0624a8f and 5309ff8.

📒 Files selected for processing (9)
  • .github/LABELS.md
  • .github/kdm-automation.json
  • .github/scripts/helpers/api.cjs
  • .github/scripts/helpers/checks.cjs
  • .github/scripts/helpers/comments.cjs
  • .github/scripts/helpers/config-loader.cjs
  • .github/scripts/helpers/constants.cjs
  • .github/scripts/helpers/index.cjs
  • .github/scripts/pr-labeler.cjs

Comment thread .github/LABELS.md Outdated
Comment thread .github/scripts/helpers/config-loader.cjs Outdated
Comment thread .github/scripts/helpers/config-loader.cjs Outdated
Comment thread .github/scripts/helpers/index.cjs Outdated
Comment thread .github/scripts/pr-labeler.cjs Outdated
codescene-delta-analysis[bot]

This comment was marked as outdated.

@codescene-delta-analysis codescene-delta-analysis Bot left a comment

No application code in the PR — skipped Code Health checks.

Quality Gate Profile: The Bare Minimum

@Yuvraj-Sarathe

Member Author

All done @utkarsh232005

@utkarsh232005

Member

Can you share a screenshot of a test PR that the automation labeled?

@Yuvraj-Sarathe

Member Author

> Can you share a screenshot of a test PR that the automation labeled?

You'll have to wait until tomorrow.

Labels

documentation Improvements or additions to documentation


2 participants