recognise fractional-second + offset ISO 8601 timestamps in utils and syslog parser #24
HrachShah wants to merge 1 commit into
… syslog parser

utils._try_parse_datetime and SyslogParser._parse_timestamp both used a format list that paired fractional seconds (.%f) with the bare 'T' separator but never combined .%f with a %z timezone offset. Combined with the Z->+00:00 normalisation already in utils, the result was that timestamps extracted by the upstream regex - '2025-03-20T10:15:32.123Z' in utils, '2025-03-20T10:15:32.500+05:30' in syslog RFC 5424 lines - fell through every format and returned None, even though the other parsers (generic.py, json_log.py) already handled them. SyslogParser also passed the raw 'Z' string straight into strptime, which then failed on the timezone-aware formats it did have.

- utils: add %Y-%m-%dT%H:%M:%S.%f%z to the format list.
- syslog: add the same format, and apply Z->+00:00 to a local copy of ts_str so the '%z' formats can match Z-suffixed syslog timestamps.
- tests/test_parsers.py: add TestISOFractionalTimestampParsing covering utils (Z, +05:30, -08:00 with fractional) and SyslogParser (Z and +05:30 with fractional), asserting both the microsecond and the utc offset survive.

All five tests fail on the previous code and pass on the fix. Local pytest run: 37 passed (32 baseline + 5 new).
Reviewer's Guide

Adds support for ISO 8601 timestamps that combine fractional seconds with timezone offsets (including Z-normalised UTC) in both the generic utils timestamp parser and the syslog parser, and pins the behaviour with regression tests.

Sequence diagram for utils fractional-offset ISO 8601 timestamp parsing

sequenceDiagram
actor User
participant Utils as parse_timestamp
participant Parser as _try_parse_datetime
participant DateTime as datetime_strptime
User->>Utils: parse_timestamp(iso_string)
Utils->>Parser: _try_parse_datetime(normalized_ts_str)
opt Z suffix
Parser->>Parser: [normalize Z to +00:00]
end
Parser->>DateTime: datetime_strptime("%Y-%m-%dT%H:%M:%S.%f%z")
DateTime-->>Parser: aware_datetime
Parser-->>Utils: aware_datetime
Utils-->>User: aware_datetime

Sequence diagram for SyslogParser fractional-offset timestamp parsing

sequenceDiagram
actor User
participant Syslog as SyslogParser_parse
participant TsParser as SyslogParser__parse_timestamp
participant DateTime as datetime_strptime
User->>Syslog: parse(syslog_line)
Syslog->>TsParser: _parse_timestamp(ts_str)
opt Z suffix
TsParser->>TsParser: [normalize Z to +00:00]
end
TsParser->>DateTime: datetime_strptime("%Y-%m-%dT%H:%M:%S.%f%z")
DateTime-->>TsParser: aware_datetime
TsParser-->>Syslog: aware_datetime
Syslog-->>User: ParsedEntry(timestamp=aware_datetime)
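
The syslog flow in the diagram can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `RFC5424_TS` is a hypothetical, simplified pattern for pulling the timestamp out of an RFC 5424 line, `FORMATS` is an assumed subset of the real format list, and `parse_syslog_timestamp` stands in for `SyslogParser._parse_timestamp`.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical, simplified RFC 5424 header pattern: <PRI>VERSION TIMESTAMP ...
RFC5424_TS = re.compile(r"^<\d{1,3}>\d{1,2} (?P<ts>\S+) ")

# Assumed subset of the parser's format list.
FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%f%z",  # the combined format this PR adds
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%dT%H:%M:%S.%f",
    "%Y-%m-%dT%H:%M:%S",
)


def parse_syslog_timestamp(ts_str: str) -> Optional[datetime]:
    """Stand-in for SyslogParser._parse_timestamp (assumption)."""
    # Normalise a trailing Z into a local value before trying the formats.
    normalized = ts_str[:-1] + "+00:00" if ts_str.endswith("Z") else ts_str
    for fmt in FORMATS:
        try:
            return datetime.strptime(normalized, fmt)
        except ValueError:
            continue
    return None


line = "<34>1 2025-03-20T10:15:32.500+05:30 host app 1234 ID47 - started"
match = RFC5424_TS.match(line)
ts = parse_syslog_timestamp(match.group("ts")) if match else None
print(ts)
```

The returned value is timezone-aware, so both the fractional part (`ts.microsecond`) and the offset (`ts.utcoffset()`) are available to the caller.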
📝 Walkthrough

Timestamp parsing now accepts ISO 8601 values with fractional seconds and timezone offsets. The syslog parser normalizes a trailing `Z` to `+00:00` so that the `%z` formats can match it.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/log_analyzer_cli/utils.py (1)
39-45: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Keep the space-separated offset contract in sync with the regex.

`parse_timestamp()` still advertises `[T ]` timestamps with offsets on Line 22, but `_try_parse_datetime()` only accepts `%z` for the `T` variants. Inputs like `2025-03-20 10:15:32.123+05:30` will still match upstream and then fall through to `None`.

Proposed fix

 formats = [
     "%Y-%m-%d %H:%M:%S.%f",
+    "%Y-%m-%d %H:%M:%S.%f%z",
     "%Y-%m-%dT%H:%M:%S.%f",
     "%Y-%m-%d %H:%M:%S",
+    "%Y-%m-%d %H:%M:%S%z",
     "%Y-%m-%dT%H:%M:%S",
     "%Y-%m-%dT%H:%M:%S.%f%z",
     "%Y-%m-%dT%H:%M:%S%z",

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/log_analyzer_cli/utils.py` around lines 39 - 45, Update parse_timestamp() and the _try_parse_datetime() format list so the advertised [T ] offset timestamps are actually parsed; currently only the T forms accept %z, causing space-separated offset inputs to fail. Add the space-separated offset variants alongside the existing datetime formats in _try_parse_datetime(), and keep the regex contract in parse_timestamp() aligned with those accepted formats so timestamps like 2025-03-20 10:15:32.123+05:30 are handled correctly.
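
A sketch of what the reviewer's suggestion would change, assuming the format strings shown in the proposed fix. `parse_with` is a hypothetical helper for illustration, not a function from the project.

```python
from datetime import datetime
from typing import Optional, Sequence

# T-separated offset formats available once this PR is applied.
T_OFFSET_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%f%z",
    "%Y-%m-%dT%H:%M:%S%z",
)
# Space-separated variants the reviewer proposes adding.
SPACE_OFFSET_FORMATS = (
    "%Y-%m-%d %H:%M:%S.%f%z",
    "%Y-%m-%d %H:%M:%S%z",
)


def parse_with(ts_str: str, formats: Sequence[str]) -> Optional[datetime]:
    """Hypothetical helper: return the first successful strptime result."""
    for fmt in formats:
        try:
            return datetime.strptime(ts_str, fmt)
        except ValueError:
            continue
    return None


sample = "2025-03-20 10:15:32.123+05:30"
without_fix = parse_with(sample, T_OFFSET_FORMATS)
with_fix = parse_with(sample, T_OFFSET_FORMATS + SPACE_OFFSET_FORMATS)
print(without_fix)  # None: the literal T in the format does not match a space
print(with_fix)     # 2025-03-20 10:15:32.123000+05:30
```

Since `strptime` rejects any input with unconverted trailing data, a given string matches at most one of these formats, so the position of the new entries in the list should not change which format wins.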
📒 Files selected for processing (3)
- src/log_analyzer_cli/parsers/syslog.py
- src/log_analyzer_cli/utils.py
- tests/test_parsers.py
What this PR fixes
Two related gaps in the project's timestamp parsing:
- `utils._try_parse_datetime` paired `%Y-%m-%dT%H:%M:%S.%f` (fractional seconds, no timezone) with `%Y-%m-%dT%H:%M:%S%z` (whole seconds, with timezone) but never combined the two. Combined with the existing `Z → +00:00` normalisation, a timestamp extracted by the upstream regex like `2025-03-20T10:15:32.123Z` would normalise to `2025-03-20T10:15:32.123+00:00` and then fail every format in the list, returning `None`. Any generic / syslog / mixed-format log line that includes both fractional seconds and a timezone hit this path, including `--start-time` / `--end-time` filtering in the CLI, since `cli._parse_file` calls `parse_timestamp` to drive the time-window filter.
- `parsers/syslog.SyslogParser._parse_timestamp` had the same gap (no `%Y-%m-%dT%H:%M:%S.%f%z`) and also passed the raw `Z` suffix straight into `strptime`, so the timezone-aware formats it did have (`%Y-%m-%dT%H:%M:%S%z`) silently dropped RFC 5424 lines whose timestamps ended in `Z`. `GenericParser` and `JSONLogParser` already handled fractional + offset correctly; the inconsistency was inside the syslog path.

Reproduction
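
A minimal sketch of the first gap, using only the standard library. The two format lists are assumptions reconstructed from the description above (the real list in `utils._try_parse_datetime` has more entries), and `try_parse` is a hypothetical stand-in for the project's helper.

```python
from datetime import datetime
from typing import Optional, Sequence

# Assumed subset of the pre-fix list: fractional seconds and offsets are
# each supported, but never within the same format string.
OLD_FORMATS = [
    "%Y-%m-%dT%H:%M:%S.%f",
    "%Y-%m-%dT%H:%M:%S%z",
]
# The fix adds the combined fractional-second + offset format.
NEW_FORMATS = OLD_FORMATS + ["%Y-%m-%dT%H:%M:%S.%f%z"]


def try_parse(ts_str: str, formats: Sequence[str]) -> Optional[datetime]:
    """Hypothetical stand-in: normalise a trailing Z, then try each format."""
    if ts_str.endswith("Z"):
        ts_str = ts_str[:-1] + "+00:00"
    for fmt in formats:
        try:
            return datetime.strptime(ts_str, fmt)
        except ValueError:
            continue
    return None


before = try_parse("2025-03-20T10:15:32.123Z", OLD_FORMATS)
after = try_parse("2025-03-20T10:15:32.123Z", NEW_FORMATS)
print(before)  # None: no format combines .%f with %z
print(after)   # 2025-03-20 10:15:32.123000+00:00
```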
Changes
- `src/log_analyzer_cli/utils.py`: add `"%Y-%m-%dT%H:%M:%S.%f%z"` to the format list in `_try_parse_datetime`.
- `src/log_analyzer_cli/parsers/syslog.py`: add `"%Y-%m-%dT%H:%M:%S.%f%z"` and apply the `Z → +00:00` normalisation to a local copy of `ts_str` so `strptime` can match `%z`-bearing formats against `Z`-suffixed timestamps.
- `tests/test_parsers.py`: add `TestISOFractionalTimestampParsing` with five tests:
  - `test_utils_parses_fractional_z`: `"2025-03-20T10:15:32.123Z"` parses to `2025-03-20 10:15:32.123000+00:00`.
  - `test_utils_parses_fractional_with_offset`: `"2025-03-20T10:15:32.123+05:30"` parses with a `+05:30` offset (the offset used by Asia/Kolkata).
  - `test_utils_parses_fractional_with_negative_offset`: `"2025-03-20T10:15:32.5-08:00"` parses correctly (single-digit fractional).
  - `test_syslog_parses_fractional_z`: full RFC 5424 syslog line; microseconds and UTC offset survive.
  - `test_syslog_parses_fractional_with_offset`: RFC 5424 syslog line with `+05:30` offset; microseconds and offset both survive.

All five tests fail on the previous code and pass with the fix applied. Full local suite: 37 passed (32 baseline + 5 new).
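
The assertions described above might look roughly like this in pytest style. The real tests exercise the project's `parse_timestamp` and `SyslogParser`; here a hypothetical `parse_iso` helper stands in so the sketch runs on its own, and the class name is likewise illustrative.

```python
from datetime import datetime, timedelta


def parse_iso(ts_str: str) -> datetime:
    """Hypothetical stand-in for the project's parser (assumption)."""
    if ts_str.endswith("Z"):
        ts_str = ts_str[:-1] + "+00:00"
    return datetime.strptime(ts_str, "%Y-%m-%dT%H:%M:%S.%f%z")


class TestISOFractionalSketch:
    def test_fractional_z(self):
        dt = parse_iso("2025-03-20T10:15:32.123Z")
        # Both the fractional part and the UTC offset must survive.
        assert dt.microsecond == 123000
        assert dt.utcoffset() == timedelta(0)

    def test_fractional_negative_offset(self):
        # A single fractional digit is read as tenths of a second.
        dt = parse_iso("2025-03-20T10:15:32.5-08:00")
        assert dt.microsecond == 500000
        assert dt.utcoffset() == timedelta(hours=-8)
```

Checking `microsecond` and `utcoffset()` separately, instead of comparing against one expected string, makes it clear which half of the timestamp was lost if a test fails.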
Summary by Sourcery

Improve timestamp parsing for fractional-second ISO 8601 strings with timezone offsets across utilities and the syslog parser.

Enhancements:
- Normalize `Z`-suffixed timestamps to `+00:00` in the syslog parser so timezone-aware formats are consistently recognized.

Tests:
- Add regression tests covering fractional-second timestamps with `Z` and positive/negative timezone offsets for both utils and syslog parsing.

Summary by CodeRabbit
- Tests cover fractional-second timestamps with `Z`, `+05:30`, and `-08:00`.