docs: add how-to guide for debugging Kubernetes charms by tonyandrewmeyer · Pull Request #2498 · canonical/operator

tonyandrewmeyer · 2026-05-22T07:38:36Z

This PR adds a follow-on guide to the recent debugging how-to, focused specifically on K8s charms and on using Pebble.

At the recent sprints, we received a couple of comments that more information was needed on debugging in this specific case, so this PR addresses those comments.

The main focus is on Pebble, with a small amount on Kubernetes itself. It stops short of being a guide to debugging Kubernetes.

Preview

Fixes #2489

dwilding

Thanks for compiling this! I need to review in more detail, but have taken a first pass.

I think it would be easier for people to orient themselves if we move "Common failure modes" nearer the beginning of the doc - probably after "Know which container you’re looking at". I think that section is a great quick reference and should be (slightly) expanded by migrating other content from around the doc.

I've commented on the pieces I think we should move.

My thinking is that we should make a cleaner split between the why and the how. If you already know why you need to be reading a particular section, there should be minimal intro text. Get right into the how. But if you don't know which section you should be reading, "Common failure modes" points you in the right direction and helps you understand why.

Let me know if you'd like to discuss this suggestion together. I'm also very happy to experiment with different structures if that would help.

dwilding · 2026-05-26T00:37:16Z

+```{tip}
+If [`Container.can_connect()`](ops.Container.can_connect) returns `False` or your charm raises [`ops.pebble.ConnectionError`](ops.pebble.ConnectionError), the charm container cannot reach the workload's Pebble over that socket. This usually means the workload container hasn't started yet (no [`PebbleReadyEvent`](ops.PebbleReadyEvent) has fired) -- look at the pod first (see [](#k8s-inspect-the-pod)), not at your charm code.
+```


Suggest migrating this tip
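
For readers of this thread, the pattern the tip and the "Common failure modes" table point at (attempt the Pebble call and handle the connection failure, rather than pre-checking with `can_connect()`) might look roughly like the sketch below. To keep it runnable without `ops` installed, `PebbleUnavailable` is a hypothetical stand-in for `ops.pebble.ConnectionError`, and `ContainerLike`, `replan_or_wait`, and the two fake containers are illustrative names that are not part of the PR.

```python
from typing import Protocol


class PebbleUnavailable(Exception):
    """Hypothetical stand-in for ops.pebble.ConnectionError in this sketch."""


class ContainerLike(Protocol):
    """The one method of ops.Container that this sketch relies on."""

    def replan(self) -> None: ...


def replan_or_wait(container: ContainerLike) -> str:
    """Attempt the Pebble call and handle failure, instead of pre-checking."""
    try:
        container.replan()
    except PebbleUnavailable:
        # Pebble is unreachable: report a waiting state and let a later
        # event (for example pebble-ready) retry the work.
        return "waiting"
    return "active"


class UnreachableContainer:
    """Fake container whose Pebble socket cannot be reached."""

    def replan(self) -> None:
        raise PebbleUnavailable("socket not reachable")


class HealthyContainer:
    """Fake container whose Pebble answers normally."""

    def replan(self) -> None:
        return None


if __name__ == "__main__":
    print(replan_or_wait(UnreachableContainer()))
    print(replan_or_wait(HealthyContainer()))
```

In a real charm the `except` clause would name `ops.pebble.ConnectionError` and the return values would map to unit statuses; the strings here only keep the sketch self-contained.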

dwilding · 2026-05-26T00:38:32Z

+(k8s-debug-from-charm-container)=
+## Debug from the charm container
+
+Many production workload images are stripped down to just the application -- with no shell or utilities -- so `juju ssh --container` lands you nowhere useful. You can still run Pebble commands against that workload from the charm container, because the workload's socket is mounted there:


Suggest migrating most of this
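
As a rough illustration of the quoted paragraph, the helper below builds the command line for talking to a workload's Pebble from inside the charm container. It assumes the workload socket is mounted at `/charm/containers/<container>/pebble.socket` and that a `pebble` binary is available at `/charm/bin/pebble`; both paths are assumptions to verify on your own deployment, and `workload_pebble_command` is a hypothetical name.

```python
import shlex


def workload_pebble_command(container: str, *args: str) -> str:
    """Build a shell command that points the Pebble CLI at a workload socket.

    The socket and binary paths are assumptions about the charm container
    layout, not something this sketch can check.
    """
    socket = f"/charm/containers/{container}/pebble.socket"
    parts = [f"PEBBLE_SOCKET={shlex.quote(socket)}", "/charm/bin/pebble"]
    # Quote each argument so names with unusual characters stay intact.
    parts.extend(shlex.quote(arg) for arg in args)
    return " ".join(parts)


if __name__ == "__main__":
    # "web" is a hypothetical workload container name.
    print(workload_pebble_command("web", "services"))
```

The resulting string is meant to be pasted into a shell opened in the charm container; nothing here executes it.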

dwilding · 2026-05-26T00:38:56Z

+(k8s-inspect-the-pod)=
+## Inspect the pod at the Kubernetes layer
+
+When a unit is stuck before Pebble is even reachable -- the container is `waiting`, the image won't pull, or the pod won't schedule -- the answer is below Juju, at the Kubernetes layer. Juju puts each model in its own namespace, and names each unit's pod `<app>-<unit-number>`.


Suggest migrating the first sentence
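
The naming rule in the quoted paragraph can be sketched as a small helper that turns a Juju unit name into its pod name and a `kubectl describe pod` invocation. It assumes the namespace has the same name as the Juju model; `pod_for_unit` and `describe_pod_command` are hypothetical names used only for illustration.

```python
def pod_for_unit(unit: str) -> str:
    """Map a unit name such as 'myapp/0' to the pod name 'myapp-0'."""
    app, _, number = unit.rpartition("/")
    if not app or not number.isdigit():
        raise ValueError(f"not a unit name: {unit!r}")
    return f"{app}-{number}"


def describe_pod_command(model: str, unit: str) -> list[str]:
    """Build the kubectl argv for describing a unit's pod.

    Assumes the Kubernetes namespace is named after the Juju model.
    """
    return ["kubectl", "describe", "pod", pod_for_unit(unit), "--namespace", model]


if __name__ == "__main__":
    # "dev" and "myapp/0" are hypothetical model and unit names.
    print(" ".join(describe_pod_command("dev", "myapp/0")))
```

Returning an argv list rather than a string keeps the command safe to pass to `subprocess.run` without shell quoting.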

dwilding · 2026-05-26T00:39:17Z

+(k8s-common-failure-modes)=
+## Common failure modes
+
+| Symptom | Where to look |
+| --- | --- |
+| Charm stuck in `maintenance`/`waiting`; `can_connect()` is `False` | The workload container hasn't started -- `kubectl describe pod` for image-pull or scheduling errors ([](#k8s-inspect-the-pod)). |
+| Service shows `backoff` or `error` | `pebble logs` for the crash output, then `pebble changes` / `pebble tasks` for the start failure ([](#k8s-pebble-cli)). |
+| Config change has no effect on the running process | The charm added a layer but didn't [`replan`](#run-workloads-with-a-charm-kubernetes-replan); confirm with `pebble plan` and `pebble services`. |
+| Charm raises `ConnectionError` mid-handler | The workload's Pebble became unreachable -- guard Pebble calls with `try`/`except` rather than `can_connect()` ([](ops.Container.can_connect)). |
+| `pebble_custom_notice` never fires | Confirm the notice was recorded with `pebble notices`; check the `key` your handler matches on ([](#k8s-pebble-cli)). |
+| Workload won't go ready despite running | A health check is failing -- `pebble checks` and `pebble check <name> --refresh` ([](#k8s-pebble-cli)). |


Suggest migrating this

docs: add how-to guide for debugging Kubernetes charms

84e701f

tonyandrewmeyer requested review from dwilding and hpidcock May 22, 2026 07:38

Merge branch 'main' into docs-debug-k8s

bb71eb3

tonyandrewmeyer commented May 25, 2026

Comment thread docs/howto/debug-a-kubernetes-charm.md Outdated

Apply suggestion from @tonyandrewmeyer

7477801

dwilding reviewed May 26, 2026

docs: add how-to guide for debugging Kubernetes charms #2498
tonyandrewmeyer wants to merge 3 commits into canonical:main from tonyandrewmeyer:docs-debug-k8s

2 participants
