pick the earliest-occurring level keyword in detect_log_level by HrachShah · Pull Request #23 · HrachShah/log-analyzer-cli

HrachShah · 2026-06-25T02:01:43Z

What

utils.detect_log_level returns the first regex match in a hardcoded
CRITICAL > ERROR > WARNING > INFO > DEBUG > TRACE order, regardless of
where the keyword actually appears in the line.
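
A minimal sketch of the behavior described above, for illustration only. The names `LEVEL_PATTERNS` and `detect_log_level_old` and the exact regexes are assumptions, not the project's actual code; only the severity order and the sample log line come from this PR.

```python
import re

# Hypothetical stand-in for the severity-ordered pattern list; the real
# regexes in utils.py may differ (for example, they may cover abbreviations).
LEVEL_PATTERNS = [
    (r"\bCRITICAL\b", "CRITICAL"),
    (r"\bERROR\b", "ERROR"),
    (r"\bWARNING\b", "WARNING"),
    (r"\bINFO\b", "INFO"),
    (r"\bDEBUG\b", "DEBUG"),
    (r"\bTRACE\b", "TRACE"),
]


def detect_log_level_old(line: str) -> str:
    """Return the first level whose pattern matches, in list order."""
    line_upper = line.upper()
    for pattern, level in LEVEL_PATTERNS:
        if re.search(pattern, line_upper):
            return level
    return "UNKNOWN"


line = "2024-01-15 10:23:45 WARNING cannot connect to CRITICAL service"
print(detect_log_level_old(line))  # CRITICAL, although WARNING appears first
```

Because the loop returns as soon as any pattern matches, the position of the keyword in the line never enters the decision.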

Repro

Summary by Sourcery

Update log level detection to choose the earliest-occurring level keyword in a log line and add unit tests to cover the new behavior and edge cases.

Bug Fixes:

Fix misclassification of log lines that contain multiple log level keywords by selecting the leftmost match instead of a fixed severity order.

Tests:

Add unit tests for detect_log_level covering earliest-match selection, case insensitivity, timestamped lines, non-level lines, and ignoring substring matches.

detect_log_level walked a hardcoded list of (regex, level) tuples in CRITICAL > ERROR > WARNING > INFO > DEBUG > TRACE order and returned the first pattern that matched. Position in the line should take precedence: whichever keyword appears first in the line is the intent of the line. Without the fix, '2024-01-15 10:23:45 WARNING cannot connect to CRITICAL service' was classified as CRITICAL even though the line is a WARNING.

The fix runs every level regex, finds the leftmost match across all of them, and returns the level that owns that match. A new tests/test_utils.py pins the new contract: the leftmost keyword wins, embedded levels in later words do not override, and words like 'ERROR_RATE' or 'CRITICAL_ERROR' (which contain the level as part of a longer identifier) are correctly ignored because the pattern uses \b boundaries.
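
The word-boundary point can be checked in isolation. The sketch below assumes a pattern of the form `\bERROR\b`; the project's actual regexes are not shown in this PR and may differ.

```python
import re

# Assumed pattern shape; only the use of \b boundaries comes from the PR.
pattern = re.compile(r"\bERROR\b")

# "_" counts as a word character, so there is no boundary between
# "ERROR" and "_RATE", or between "CRITICAL_" and "ERROR".
print(pattern.search("ERROR_RATE=0.02"))        # None
print(pattern.search("CRITICAL_ERROR raised"))  # None

# A standalone keyword is still found, and its start index is available.
print(pattern.search("10:23:45 ERROR disk full").start())  # 9
```

The start index returned by the last call is the value the fix compares across levels to find the leftmost keyword.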

sourcery-ai · 2026-06-25T02:01:51Z

Reviewer's Guide

Update log level detection to choose the earliest-occurring level keyword in a line instead of a fixed severity order, and add unit tests covering the new behavior and existing edge cases.

Flow diagram for updated detect_log_level logic

flowchart TD
    A[detect_log_level line] --> B[Convert line to uppercase line_upper]
    B --> C[Initialize earliest_level = None]
    C --> D[Initialize earliest_index = infinity]
    D --> E[Iterate level_patterns]
    E --> F[re.finditer pattern line_upper]
    F --> G{Any matches?}
    G -->|No| H{More patterns?}
    H -->|Yes| E
    H -->|No| I{earliest_level is not None?}
    G -->|Yes| J[Take first_match.start]
    J --> K{first_match.start < earliest_index?}
    K -->|Yes| L[Update earliest_index and earliest_level]
    K -->|No| H
    L --> H
    I -->|Yes| M[Return earliest_level]
    I -->|No| N[Return UNKNOWN]
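
The flowchart can be read as roughly the following Python. This is a sketch under assumptions rather than the merged code: the contents of `LEVEL_PATTERNS`, including the abbreviation alternatives, are hypothetical, while the variable names follow the diagram.

```python
import re

# Hypothetical pattern list; the abbreviations (CRIT, ERR, WARN) are an
# assumption based on the test descriptions, not the actual regexes.
LEVEL_PATTERNS = [
    (r"\b(?:CRITICAL|CRIT)\b", "CRITICAL"),
    (r"\b(?:ERROR|ERR)\b", "ERROR"),
    (r"\b(?:WARNING|WARN)\b", "WARNING"),
    (r"\bINFO\b", "INFO"),
    (r"\bDEBUG\b", "DEBUG"),
    (r"\bTRACE\b", "TRACE"),
]


def detect_log_level(line: str) -> str:
    """Return the level whose keyword occurs earliest in the line."""
    line_upper = line.upper()
    earliest_level = None
    earliest_index = float("inf")

    for pattern, level in LEVEL_PATTERNS:
        matches = list(re.finditer(pattern, line_upper))
        if not matches:
            continue  # this level does not occur; try the next pattern
        first_start = matches[0].start()
        if first_start < earliest_index:
            earliest_index = first_start
            earliest_level = level

    return earliest_level if earliest_level is not None else "UNKNOWN"


print(detect_log_level(
    "2024-01-15 10:23:45 WARNING cannot connect to CRITICAL service"
))  # WARNING
```

Every pattern is tried before a result is returned, so list order only matters if two levels could match at the same start index, which cannot happen with the distinct keywords assumed here.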

File-Level Changes

Change	Details	Files
Change log level detection to select the leftmost matching level keyword rather than the first match in a fixed severity-ordered pattern list.	Update detect_log_level docstring to describe earliest-occurring keyword behavior. Introduce tracking variables earliest_level and earliest_index initialized to None and infinity, respectively. Replace re.search with re.finditer to collect all matches for each level pattern in the input line. Choose the level whose first match has the smallest start index across all patterns and return it if any match was found. Preserve UNKNOWN as the default return value when no level keyword is present.	`src/log_analyzer_cli/utils.py`
Add unit tests to validate the new earliest-match behavior and guard existing behaviors like case-insensitivity and word-boundary matching.	Add tests for simple level keyword detection, including abbreviations like WARN, CRIT, and ERR. Add tests for timestamp-prefixed log lines to ensure detection after leading timestamps. Add tests verifying that when multiple level keywords are present, the earliest (leftmost) one is chosen, including cases where it is higher severity. Add tests ensuring lines without level keywords return UNKNOWN, including empty and whitespace-only strings. Add tests confirming that word-boundary-based patterns do not match level substrings inside other identifiers and that detection remains case-insensitive.	`tests/test_utils.py`


sourcery-ai

Hey - I've left some high level feedback:

You don't need to materialize all matches with list(re.finditer(...)); using next(re.finditer(...), None) and checking that single result would avoid unnecessary allocations while still letting you compare positions.
Consider avoiding float('inf') for earliest_index and instead initializing it to None and adjusting the comparison logic, which can make the intent clearer and removes reliance on magic sentinel values.
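
One way the two suggestions above might be applied together, as a sketch only. The shortened `LEVEL_PATTERNS` list is a hypothetical stand-in, and this is not necessarily how the PR author resolved the feedback.

```python
import re

# Hypothetical, shortened pattern list used only for this illustration.
LEVEL_PATTERNS = [
    (r"\bCRITICAL\b", "CRITICAL"),
    (r"\bERROR\b", "ERROR"),
    (r"\bWARNING\b", "WARNING"),
]


def detect_log_level(line: str) -> str:
    """Leftmost-keyword detection without a list or an infinity sentinel."""
    line_upper = line.upper()
    earliest_level = None
    earliest_index = None  # None means "no match seen yet"

    for pattern, level in LEVEL_PATTERNS:
        # Only the first match per pattern is needed, so pull one item
        # from the iterator instead of materializing every match.
        first_match = next(re.finditer(pattern, line_upper), None)
        if first_match is None:
            continue
        if earliest_index is None or first_match.start() < earliest_index:
            earliest_index = first_match.start()
            earliest_level = level

    return earliest_level or "UNKNOWN"
```

Since `re.search` also returns the leftmost match for a pattern, `re.search(pattern, line_upper)` would serve the same purpose as `next(re.finditer(...), None)` here.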


coderabbitai · 2026-06-25T02:04:48Z

Warning

Review limit reached

@HrachShah, we couldn't start this review because you've reached your PR review rate limit.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aceeb8e1-ca63-4cde-8c57-1a4816149589

📥 Commits

Reviewing files that changed from the base of the PR and between e93757f and 1cc6fa9.

📒 Files selected for processing (2)

src/log_analyzer_cli/utils.py
tests/test_utils.py


HrachShah wants to merge 1 commit into main from fix/detect-log-level-pick-earliest-match
