kafka: Add log directory offline status metric #23038

Open
piochelepiotr wants to merge 2 commits into master from piotr.wolski/kafka-log-dir-offline

@piochelepiotr
Contributor

Summary

  • Adds kafka.log.directory.offline metric from kafka.log:type=LogManager,name=LogDirectoryOffline,logDirectory=<path> JMX bean
  • Reports whether each log directory is offline (0 = healthy), tagged with log_directory path and kafka_cluster_id
  • Useful for monitoring disk health across Kafka brokers

Motivation

Kafka exposes per-log-directory health status via JMX. Monitoring this allows alerting when a log directory goes offline (e.g. disk failure), which can cause data loss if not addressed.

Test plan

  • Deploy against a Kafka cluster
  • Verify kafka.log.directory.offline is emitted with log_directory tag containing the path
  • Confirm value is 0 for healthy directories
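The checks in this test plan could be scripted roughly as follows. This is only a sketch: the sample format (dicts with `name`, `value`, and `key:value` tag strings) and the `check_samples` helper are hypothetical and are not part of this PR or of any Agent API.

```python
METRIC = "kafka.log.directory.offline"


def check_samples(samples, metric=METRIC):
    """Return a list of problems; an empty list means every check passed.

    `samples` is a hypothetical list of dicts such as
    {"name": ..., "value": ..., "tags": ["log_directory:/path", ...]}.
    """
    problems = []
    matching = [s for s in samples if s["name"] == metric]
    if not matching:
        problems.append(f"{metric} was not emitted")
    for sample in matching:
        # Split each "key:value" tag on the first colon only, so paths
        # containing colons stay intact.
        tags = {}
        for tag in sample["tags"]:
            key, _, value = tag.partition(":")
            tags[key] = value
        if not tags.get("log_directory"):
            problems.append("sample is missing a log_directory tag")
        elif sample["value"] != 0:
            problems.append(
                f"{tags['log_directory']} reports value {sample['value']}"
            )
    return problems
```

An empty result corresponds to the healthy case described above: the metric is present, every sample carries a `log_directory` tag, and every value is 0.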

🤖 Generated with Claude Code

Add kafka.log.directory.offline metric from the
kafka.log:type=LogManager,name=LogDirectoryOffline JMX bean.
Reports whether each log directory is offline (0 = healthy),
tagged with log_directory path and kafka_cluster_id.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov

codecov bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.03%. Comparing base (3a624a5) to head (e0ad480).
⚠️ Report is 1 commit behind head on master.



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5cf31a577d

#
- include:
    domain: 'kafka.log'
    bean_regex: 'kafka\.log:type=LogManager,name=LogDirectoryOffline,logDirectory="(.*)"'
P2: Accept unquoted logDirectory in bean regex

The new matcher requires logDirectory="...", but this JMX key is frequently exposed as an unquoted value (for example logDirectory=/var/lib/kafka) depending on Kafka/JMX object-name serialization. In that common case this include rule never matches, so kafka.log.directory.offline is not emitted at all despite being added to metadata/tests. Please make the regex tolerate both quoted and unquoted forms to avoid silently dropping the metric.
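One way to tolerate both forms is to make the surrounding quotes optional and keep a single capture group for the path. The sketch below exercises such a pattern in Python. The pattern itself is a suggestion, not the one in this PR, and `extract_log_directory` is a hypothetical helper used only to demonstrate it. The check actually evaluates `bean_regex` with Java's regex engine, so this assumes the constructs used here (optional quote, negated character class) behave the same way there.

```python
import re

# Hypothetical replacement pattern: the quotes around the path are optional,
# and the single capture group holds the path without quotes.
BEAN_PATTERN = re.compile(
    r'kafka\.log:type=LogManager,name=LogDirectoryOffline,'
    r'logDirectory="?([^"]*)"?'
)


def extract_log_directory(bean_name):
    """Return the log directory path from a bean name, or None if no match."""
    match = BEAN_PATTERN.fullmatch(bean_name)
    return match.group(1) if match else None


quoted = (
    'kafka.log:type=LogManager,name=LogDirectoryOffline,'
    'logDirectory="/var/lib/kafka"'
)
unquoted = (
    'kafka.log:type=LogManager,name=LogDirectoryOffline,'
    'logDirectory=/var/lib/kafka'
)

print(extract_log_directory(quoted))    # /var/lib/kafka
print(extract_log_directory(unquoted))  # /var/lib/kafka
```

Keeping one capture group means the same group reference can feed the log_directory tag for either form, and the quoted form no longer leaks quote characters into the tag value.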
