
fix(debug): explain restricted dmesg output #3890

Open
jneeee wants to merge 2 commits into NVIDIA:main from jneeee:fix/debug-dmesg-nonroot-hint-signed

Conversation

jneeee (Contributor) commented May 20, 2026

Summary

Improve nemoclaw debug --quick so that a dmesg failure caused by restricted access is reported as an intentionally skipped diagnostic instead of a raw "Operation not permitted" error. This makes debug reports clearer for non-root Linux users when kernel.dmesg_restrict=1 blocks access to the kernel ring buffer.

Related Issue

Fixes #3738

Changes

  • Detect restricted non-root dmesg failures in the debug diagnostics path.
  • Replace the raw dmesg: read kernel buffer failed: Operation not permitted output with an actionable skipped message that explains kernel.dmesg_restrict=1.
  • Add regression coverage for the restricted-dmesg behavior while preserving normal debug output behavior.
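
The detection step described in the list above could look roughly like the following. This is a minimal sketch, not the code from this PR: the name sketchIsDmesgRestricted and the DmesgProbe interface are hypothetical, and the sketch assumes that root (effective UID 0) is never blocked and that any read failure should fall back to running dmesg as before. It does not account for non-root processes that hold CAP_SYSLOG.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical dependency bundle so the check can be exercised without a real /proc.
interface DmesgProbe {
  readFile: (path: string) => string;
  effectiveUid: () => number | undefined;
}

const defaultProbe: DmesgProbe = {
  readFile: (path) => readFileSync(path, "utf8"),
  // process.geteuid is not available on every platform (for example Windows).
  effectiveUid: () =>
    typeof process.geteuid === "function" ? process.geteuid() : undefined,
};

// Returns true only when a non-root user is positively known to be blocked by
// kernel.dmesg_restrict=1. Any uncertainty returns false so dmesg still runs.
function sketchIsDmesgRestricted(probe: DmesgProbe = defaultProbe): boolean {
  const uid = probe.effectiveUid();
  if (uid === undefined || uid === 0) {
    return false; // unknown platform, or root (assumed unrestricted)
  }
  try {
    return probe.readFile("/proc/sys/kernel/dmesg_restrict").trim() === "1";
  } catch {
    return false; // file missing or unreadable, e.g. on non-Linux hosts
  }
}
```

Injecting the file reader and UID lookup keeps the check testable for the root, non-root, and missing-file cases without touching the host's real sysctl state.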

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Additional checks run:

  • npm run build:cli
  • npx vitest run src/lib/diagnostics/debug.test.ts
  • git diff --check upstream/main...HEAD

Notes:

  • The local pre-push hook, which runs the broad Vitest suite, was not used as the gating signal for this small debug fix because it failed on unrelated broad-suite checks in the local environment (7 failed | 310 passed | 1 skipped, 114 failed | 3760 passed | 8 skipped). The targeted debug test and the CLI build both pass on this branch.
  • Commit is GPG-signed and DCO signed off.

Signed-off-by: John Liu <lijohn@nvidia.com>

Summary by CodeRabbit

  • Bug Fixes

    • Improved diagnostics collection to respect system-level kernel message access restrictions. When restricted for non-root users, the diagnostic process now gracefully skips this step instead of failing.
  • Tests

    • Added comprehensive test coverage for detecting and handling kernel message access restrictions across different user privilege levels.


Signed-off-by: John Liu <lijohn@nvidia.com>
coderabbitai Bot (Contributor) commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 407031c3-bb8e-4843-b7dd-35b4fd9d331a

📥 Commits

Reviewing files that changed from the base of the PR and between 11b1937 and 12d5623.

📒 Files selected for processing (2)
  • src/lib/diagnostics/debug.test.ts
  • src/lib/diagnostics/debug.ts

📝 Walkthrough


Adds permission-aware gating for kernel message diagnostics. Detects when non-root users lack dmesg access via /proc/sys/kernel/dmesg_restrict and /proc/self/stat, then skips dmesg execution and logs a diagnostic skip message instead of surfacing raw permission errors.

Changes

Dmesg Permission-Aware Collection

| Layer / File(s) | Summary |
| --- | --- |
| Permission check helper functions / src/lib/diagnostics/debug.ts | Added isDmesgRestrictedForCurrentUser() to detect whether dmesg is restricted for non-root users by reading /proc/sys/kernel/dmesg_restrict and checking effective UID, plus writeSkippedDiagnostic() helper to record skip messages in the diagnostics output directory. |
| Conditional kernel message collection / src/lib/diagnostics/debug.ts | Updated collectKernelMessages() to check permissions before executing dmesg; when restricted, writes a skip diagnostic message instead of running dmesg. |
| Permission restriction test coverage / src/lib/diagnostics/debug.test.ts | Added test suite validating that isDmesgRestrictedForCurrentUser returns true for non-root users when dmesg is restricted, and false for root users or missing restriction state files. Updated test imports to include the new exported function. |
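
The gating flow summarized above (check first, then either write a skip note or run dmesg) can be sketched as follows. Everything here is an assumption made for illustration: the function name sketchCollectKernelMessages, the dmesg.txt file name, and the wording of the skip message are hypothetical and are not taken from the PR's diff.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical skip text; the PR's actual wording is not shown in this conversation.
const SKIP_MESSAGE =
  "dmesg skipped: kernel.dmesg_restrict=1 blocks non-root access to the " +
  "kernel ring buffer. Re-run with sudo to include kernel messages.\n";

// Writes either the dmesg output or an explanatory skip note into outDir.
// Both the permission check and the dmesg runner are injected so the flow
// can be exercised without spawning a process.
function sketchCollectKernelMessages(
  outDir: string,
  isRestricted: () => boolean,
  runDmesg: () => string,
): "skipped" | "collected" {
  const target = join(outDir, "dmesg.txt"); // hypothetical output file name
  if (isRestricted()) {
    writeFileSync(target, SKIP_MESSAGE);
    return "skipped"; // dmesg is never invoked on this path
  }
  writeFileSync(target, runDmesg());
  return "collected";
}

// Usage: simulate a restricted host; the runner throws if it is ever called.
const demoDir = mkdtempSync(join(tmpdir(), "debug-sketch-"));
const outcome = sketchCollectKernelMessages(
  demoDir,
  () => true,
  () => {
    throw new Error("dmesg must not run when access is restricted");
  },
);
console.log(outcome, readFileSync(join(demoDir, "dmesg.txt"), "utf8"));
```

Checking before running, rather than parsing the error text afterwards, means the report never contains the raw permission error in the first place.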

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 When dmesg denies, our bunny now explains,
No raw permission woes in debug's refrains,
A gentle skip message, kind and quite clear,
"Non-root needs sudo" rings loud for the ear!
Hop forward with grace, let confusion give way. 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(debug): explain restricted dmesg output' accurately summarizes the main change: handling restricted dmesg access by providing explanatory output instead of raw error messages. |
| Linked Issues check | ✅ Passed | The code changes implement the primary objective from #3738: detecting dmesg restrictions for non-root users and replacing raw 'Operation not permitted' errors with actionable skip messages mentioning kernel.dmesg_restrict. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to addressing dmesg restriction detection and conditional skipping. No unrelated modifications to unrelated systems or components are present. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.



wscurran added the fix and NemoClaw CLI labels May 20, 2026


Labels

  • fix
  • NemoClaw CLI: Use this label to identify issues with the NemoClaw command-line interface (CLI).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[All Platforms][CLI] nemoclaw debug --quick reports raw dmesg: ... not permitted without context

3 participants