Skip to content

recognise fractional-second + offset ISO 8601 timestamps in utils and syslog parser#24

Open
HrachShah wants to merge 1 commit into
mainfrom
fix/iso8601-fractional-tz-timestamp-parsing
Open

recognise fractional-second + offset ISO 8601 timestamps in utils and syslog parser#24
HrachShah wants to merge 1 commit into
mainfrom
fix/iso8601-fractional-tz-timestamp-parsing

Conversation

@HrachShah

@HrachShah HrachShah commented Jun 26, 2026

Copy link
Copy Markdown
Owner

What this PR fixes

Two related gaps in the project's timestamp parsing:

  1. utils._try_parse_datetime paired %Y-%m-%dT%H:%M:%S.%f (fractional seconds, no timezone) with %Y-%m-%dT%H:%M:%S%z (whole seconds, with timezone) but never combined the two. Combined with the existing Z → +00:00 normalisation, a timestamp extracted by the upstream regex like 2025-03-20T10:15:32.123Z would normalise to 2025-03-20T10:15:32.123+00:00 and then fail every format in the list, returning None. Any generic / syslog / mixed-format log line that includes both fractional seconds and a timezone hit this path — including --start-time / --end-time filtering in the CLI, since cli._parse_file calls parse_timestamp to drive the time-window filter.

  2. parsers/syslog.SyslogParser._parse_timestamp had the same gap — no %Y-%m-%dT%H:%M:%S.%f%z — and also passed the raw Z suffix straight into strptime, so the timezone-aware formats it did have (%Y-%m-%dT%H:%M:%S%z) silently dropped RFC 5424 lines whose timestamps ended in Z. GenericParser and JSONLogParser already handled fractional + offset correctly; the inconsistency was inside the syslog path.

Reproduction

from log_analyzer_cli.utils import parse_timestamp
parse_timestamp("2025-03-20T10:15:32.123Z")  # -> None (before); tz-aware datetime (after)

from log_analyzer_cli.parsers import SyslogParser
SyslogParser().parse("2025-03-20T10:15:32.123Z host app[123]: msg")
# -> ParsedEntry(..., timestamp=None, ...) (before)
# -> ParsedEntry(..., timestamp=2025-03-20 10:15:32.123000+00:00, ...) (after)

Changes

  • src/log_analyzer_cli/utils.py — add "%Y-%m-%dT%H:%M:%S.%f%z" to the format list in _try_parse_datetime.
  • src/log_analyzer_cli/parsers/syslog.py — add "%Y-%m-%dT%H:%M:%S.%f%z" and apply the Z → +00:00 normalisation to a local copy of ts_str so strptime can match %z-bearing formats against Z-suffixed timestamps.
  • tests/test_parsers.py — add TestISOFractionalTimestampParsing with five tests:
    • test_utils_parses_fractional_z"2025-03-20T10:15:32.123Z" parses to 2025-03-20 10:15:32.123000+00:00.
    • test_utils_parses_fractional_with_offset"2025-03-20T10:15:32.123+05:30" parses to the correct Asia/Kolkata-style offset.
    • test_utils_parses_fractional_with_negative_offset"2025-03-20T10:15:32.5-08:00" parses correctly (single-digit fractional).
    • test_syslog_parses_fractional_z — full RFC 5424 syslog line, microseconds and UTC offset survive.
    • test_syslog_parses_fractional_with_offset — RFC 5424 syslog line with +05:30 offset, microseconds and offset both survive.

All five tests fail on the previous code and pass with the fix applied. Full local suite: 37 passed (32 baseline + 5 new).

$ python3 -m pytest tests/test_parsers.py tests/test_analyzer.py -q
============================== 37 passed in 0.26s ==============================

Summary by Sourcery

Improve timestamp parsing for fractional-second ISO 8601 strings with timezone offsets across utilities and the syslog parser.

Enhancements:

  • Extend datetime parsing formats to support fractional-second ISO 8601 timestamps with timezone offsets in core utilities.
  • Normalize Z-suffixed timestamps to +00:00 in the syslog parser so timezone-aware formats are consistently recognized.

Tests:

  • Add regression tests covering fractional-second ISO 8601 timestamps with UTC Z and positive/negative timezone offsets for both utils and syslog parsing.

Summary by CodeRabbit

  • Bug Fixes
    • Improved timestamp parsing for log entries with fractional seconds and timezone offsets, including Z, +05:30, and -08:00.
    • Syslog messages with these timestamp formats now parse correctly and preserve the right time zone information.
    • Reduced parsing failures for ISO-8601 and syslog-style timestamps, including entries without a year.

… syslog parser

utils._try_parse_datetime and SyslogParser._parse_timestamp both used a format
list that paired fractional seconds (.%f) with the bare 'T' separator but never
combined .%f with a %z timezone offset. Combined with the Z->+00:00 normalisation
already in utils, the result was that timestamps extracted by the upstream regex
- '2025-03-20T10:15:32.123Z' in utils, '2025-03-20T10:15:32.500+05:30' in syslog
RFC 5424 lines - fell through every format and returned None, even though the
other parsers (generic.py, json_log.py) already handled them. SyslogParser also
passed the raw 'Z' string straight into strptime, which then failed on the
timezone-aware formats it did have.

- utils: add %Y-%m-%dT%H:%M:%S.%f%z to the format list.
- syslog: add the same format, and apply Z->+00:00 to a local copy of ts_str
  so the '%z' formats can match Z-suffixed syslog timestamps.
- tests/test_parsers.py: add TestISOFractionalTimestampParsing covering utils
  (Z, +05:30, -08:00 with fractional) and SyslogParser (Z and +05:30 with
  fractional), asserting both the microsecond and the utc offset survive. All
  five tests fail on the previous code and pass on the fix. Local pytest run:
  37 passed (32 baseline + 5 new).
@sourcery-ai

sourcery-ai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Reviewer's Guide

Adds support for ISO 8601 timestamps that combine fractional seconds with timezone offsets (including Z-normalised UTC) in both the generic utils timestamp parser and the syslog parser, and pins the behaviour with regression tests.

Sequence diagram for utils fractional-offset ISO 8601 timestamp parsing

sequenceDiagram
    actor User
    participant Utils as parse_timestamp
    participant Parser as _try_parse_datetime
    participant DateTime as datetime_strptime

    User->>Utils: parse_timestamp(iso_string)
    Utils->>Parser: _try_parse_datetime(normalized_ts_str)
    opt Z suffix
        Parser->>Parser: [normalize Z to +00:00]
    end
    Parser->>DateTime: datetime_strptime("%Y-%m-%dT%H:%M:%S.%f%z")
    DateTime-->>Parser: aware_datetime
    Parser-->>Utils: aware_datetime
    Utils-->>User: aware_datetime
Loading

Sequence diagram for SyslogParser fractional-offset timestamp parsing

sequenceDiagram
    actor User
    participant Syslog as SyslogParser_parse
    participant TsParser as SyslogParser__parse_timestamp
    participant DateTime as datetime_strptime

    User->>Syslog: parse(syslog_line)
    Syslog->>TsParser: _parse_timestamp(ts_str)
    opt Z suffix
        TsParser->>TsParser: [normalize Z to +00:00]
    end
    TsParser->>DateTime: datetime_strptime("%Y-%m-%dT%H:%M:%S.%f%z")
    DateTime-->>TsParser: aware_datetime
    TsParser-->>Syslog: aware_datetime
    Syslog-->>User: ParsedEntry(timestamp=aware_datetime)
Loading

File-Level Changes

Change Details Files
Extend generic timestamp parsing to handle fractional-second + timezone ISO 8601 strings and ensure Z-suffixed timestamps are correctly normalised.
  • Add a strptime format that supports fractional seconds with timezone offset to the datetime format list.
  • Keep existing formats for non-timezone and non-fractional timestamps untouched to preserve backward compatibility.
src/log_analyzer_cli/utils.py
Update syslog timestamp parsing to support fractional-second + timezone ISO 8601 timestamps and correctly treat Z as UTC for all timezone-aware formats.
  • Add a strptime format that supports fractional seconds with timezone offset to the syslog parser format list.
  • Normalise a trailing Z to +00:00 on a local copy of the timestamp string before attempting strptime against timezone-aware formats.
  • Use the normalised timestamp string consistently in the parsing loop while preserving year inference for RFC 3164-style formats.
src/log_analyzer_cli/parsers/syslog.py
Add regression tests to verify parsing of fractional-second + timezone ISO 8601 timestamps via both the utils parser and the syslog parser.
  • Introduce a dedicated test class that documents the previous parsing gap and focuses on fractional-second + timezone combinations.
  • Add tests for Z-suffixed, positive offset, and negative offset ISO 8601 timestamps through the utils parse_timestamp entry point.
  • Add tests that exercise RFC 5424-style syslog lines with fractional seconds and both Z and explicit positive offsets, checking microseconds, timezone offsets, and message extraction.
tests/test_parsers.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Timestamp parsing now accepts ISO 8601 values with fractional seconds and timezone offsets. The syslog parser normalizes trailing Z before parsing, and regression tests cover Z, positive offsets, negative offsets, and syslog timestamp extraction.

Changes

ISO fractional timestamp parsing

Layer / File(s) Summary
Parsing formats and normalization
src/log_analyzer_cli/utils.py, src/log_analyzer_cli/parsers/syslog.py
parse_timestamp accepts ISO-like datetimes with fractional seconds and numeric offsets, and SyslogParser._parse_timestamp normalizes trailing Z before parsing.
Regression tests
tests/test_parsers.py
New tests cover fractional ISO timestamps with Z, positive and negative offsets, and syslog parsing with preserved microseconds and timezone offsets.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A rabbit hopped through clocks tonight,
With Z and offsets fitting right.
Tiny microseconds, neat and fine,
Now parse and purr in tidy time.
🐇⏱️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: adding ISO 8601 fractional-second and offset parsing in utils and the syslog parser.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/iso8601-fractional-tz-timestamp-parsing

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/log_analyzer_cli/utils.py (1)

39-45: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Keep the space-separated offset contract in sync with the regex.

parse_timestamp() still advertises [T ] timestamps with offsets on Line 22, but _try_parse_datetime() only accepts %z for the T variants. Inputs like 2025-03-20 10:15:32.123+05:30 will still match upstream and then fall through to None.

Proposed fix
     formats = [
         "%Y-%m-%d %H:%M:%S.%f",
+        "%Y-%m-%d %H:%M:%S.%f%z",
         "%Y-%m-%dT%H:%M:%S.%f",
         "%Y-%m-%d %H:%M:%S",
+        "%Y-%m-%d %H:%M:%S%z",
         "%Y-%m-%dT%H:%M:%S",
         "%Y-%m-%dT%H:%M:%S.%f%z",
         "%Y-%m-%dT%H:%M:%S%z",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/log_analyzer_cli/utils.py` around lines 39 - 45, Update parse_timestamp()
and the _try_parse_datetime() format list so the advertised [T ] offset
timestamps are actually parsed; currently only the T forms accept %z, causing
space-separated offset inputs to fail. Add the space-separated offset variants
alongside the existing datetime formats in _try_parse_datetime(), and keep the
regex contract in parse_timestamp() aligned with those accepted formats so
timestamps like 2025-03-20 10:15:32.123+05:30 are handled correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/log_analyzer_cli/utils.py`:
- Around line 39-45: Update parse_timestamp() and the _try_parse_datetime()
format list so the advertised [T ] offset timestamps are actually parsed;
currently only the T forms accept %z, causing space-separated offset inputs to
fail. Add the space-separated offset variants alongside the existing datetime
formats in _try_parse_datetime(), and keep the regex contract in
parse_timestamp() aligned with those accepted formats so timestamps like
2025-03-20 10:15:32.123+05:30 are handled correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dadd17c4-7cc3-452c-aa0d-33c5ed78afbe

📥 Commits

Reviewing files that changed from the base of the PR and between e93757f and fbe2a39.

📒 Files selected for processing (3)
  • src/log_analyzer_cli/parsers/syslog.py
  • src/log_analyzer_cli/utils.py
  • tests/test_parsers.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant