docs: sediment trap #4 image tag alias + V(1) default-off case docs#228

Merged
weicao merged 2 commits into main from helen/t6-sediment-trap4-v1-20260519
May 19, 2026

@weicao weicao commented May 18, 2026

Summary

  • Add an InstanceSet-specific troubleshooting guide for sideloaded image tag alias readiness failures.
  • Add a controller logger.V(N) log-level verification guide for diagnostic patch images.
  • Add two MariaDB case appendices with public artifact SHA anchors.
  • Update docs/test/README.md, docs/troubleshoot/README.md, and the MariaDB case count in docs/SKILL-INDEX.md.

Scope boundary

The image tag alias guide now defers the root mechanism and generic exits to the existing patch-image handoff guide. This PR only keeps the InstanceSet readiness symptom, diagnosis, and multi-node cleanup requirement.
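The InstanceSet readiness symptom can be sketched as follows. This is a minimal illustration, not the controller's actual `isImageMatched` code: the `containerView` type, the `nodeTags` map, and the example image names and digests are all hypothetical. It only assumes what this PR describes, namely that readiness compares image strings strictly while the node has one digest aliased to several tags.

```go
package main

import "fmt"

// containerView holds the three pod fields the diagnosis compares.
type containerView struct {
	SpecImage     string // .spec.containers[].image
	StatusImage   string // .status.containerStatuses[].image
	StatusImageID string // .status.containerStatuses[].imageID
}

// strictTagMatch models a text-equality readiness check (assumed behavior).
func strictTagMatch(c containerView) bool {
	return c.SpecImage == c.StatusImage
}

// isAliasTrap reports the trap signature: the image strings differ, yet the
// tag requested in the spec resolves on the node to the digest that is
// actually running. nodeTags is a hypothetical tag -> digest view of one node.
func isAliasTrap(c containerView, nodeTags map[string]string) bool {
	digest, ok := nodeTags[c.SpecImage]
	return !strictTagMatch(c) && ok && digest == c.StatusImageID
}

func main() {
	// Hypothetical node where two tags point at the same digest.
	nodeTags := map[string]string{
		"registry.example/mariadb:base":    "sha256:aaaa",
		"registry.example/mariadb:patch-1": "sha256:aaaa",
	}
	pod := containerView{
		SpecImage:     "registry.example/mariadb:patch-1",
		StatusImage:   "registry.example/mariadb:base",
		StatusImageID: "sha256:aaaa",
	}
	fmt.Println("strict match:", strictTagMatch(pod))
	fmt.Println("alias trap:", isAliasTrap(pod, nodeTags))
}
```

Because the alias lives in each node's image store, the same check has to be repeated for every node that holds the aliased digest, not only the node hosting the current pod.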

The V(N) guide uses calibrated version-skew wording: the default filtering pattern is stable, but the live flag name, controller-runtime mapping, and log output format must be verified on the target controller version.

Test plan

  • git diff --check origin/main...HEAD
  • Changed markdown links resolve.
  • Methodology docs remain under 150 lines; case docs remain under 120 lines.
  • PR body, commit messages, and changed docs public-hygiene grep clean.

@weicao weicao force-pushed the helen/t6-sediment-trap4-v1-20260519 branch from d031e47 to 0601fb4 Compare May 18, 2026 18:43
weicao commented May 19, 2026

Docs gate HOLD for now. The topics are useful, but this PR needs a public-packaging and rebase pass before merge.

Blocking items:

  1. Rebase on current main first.

    • main now already has the controller docs area and the sideload tag-aliasing section in docs/agent-collab/addon-patch-image-build-handoff-roles-guide.md from PR #225 (docs(agent-collab): document sideload tag aliasing trap and two exits).
    • After rebase, decide whether docs/troubleshoot/addon-instanceset-image-tag-alias-readiness-trap-guide.md is still needed as a narrower InstanceSet troubleshooting guide. If it stays, link to the existing handoff guide and keep only the troubleshooting delta; do not duplicate the general sideload handoff rule body.
  2. Remove private chat provenance from public docs and PR body.

    • The case files currently contain channel/message anchors such as #mariadb:... msg=... and named agent routing.
    • The PR body also says “Agent peer review by @jack”.
    • Public docs should use neutral wording such as “chat record”, public artifact paths / sha values, PR numbers, and role names. Avoid Slock channel IDs, msg IDs, and internal reviewer-routing history.
  3. Re-check version-skew wording.

    • addon-kb-controller-v1-log-level-verification-guide.md says “Affected by version skew: no”. The default may be stable, but controller-runtime logging flag support and log output format are still version-sensitive. Please make this a calibrated “yes” statement and tell readers what to verify live.
  4. Re-run the public hygiene gate after the rewrite.

    • git diff --check origin/main...HEAD
    • PR body clean of internal routing / msg IDs / generated-by or tool provenance
    • commit messages clean across origin/main..HEAD
    • changed docs clean of private channel/message IDs and named peer-review process

Once those are fixed, I can re-run the gate. The V(1) log-level guide looks like the stronger independent piece; the image-tag-alias part needs deduplication against the guide already merged in PR #225.

Two methodology guides + two MariaDB case appendices, written under
"one doc per topic", with artifact sha anchors so future readers can
verify without re-narrating recent lanes.

1. troubleshoot/addon-instanceset-image-tag-alias-readiness-trap-guide.md
   - InstanceSet isImageMatched strict tag equality fails when a node
     ctr has the same digest aliased to multiple tags; pod spec.image
     and status.image diverge in text but agree on imageID; cluster
     stuck in Updating.
   - Defers to agent-collab/addon-patch-image-build-handoff-roles-guide.md
     for the underlying mechanism + the two generic fix exits (delete
     base alias / chart digest pin). This guide only keeps the
     InstanceSet-specific symptom + multi-node coverage requirement.
   - Diagnosis (jsonpath spec vs status), narrow fix applied on every
     hit node (not just current pod's host), recurrence-prevention
     checklist.
   - Linked from troubleshoot/README.md.

2. test/addon-kb-controller-v1-log-level-verification-guide.md
   - logger.V(1).Info(...) is silently filtered at the default
     --zap-log-level=info. A diagnostic PR that adds V(1) lines will
     produce zero matches under default deployment.
   - Canonical enable: --zap-log-level=1 (numeric form). Version-skew
     note calibrated: V(N) filtering default is engine-neutral, but the
     actual enabling flag / controller-runtime mapping / log output
     format need live verification per controller version.
   - Workflow: read PR diff for V(N) vs Info(), check controller args,
     enable, restart, re-record imageID/startTime/restartCount so the
     rollout itself is in evidence, then re-grep.
   - Linked from test/README.md.

3. Two case appendices live under cases/mariadb/, documenting the
   actual T6 lane recurrences from 2026-05-18 / 19 with artifact
   tar shas backing each timeline row.

SKILL-INDEX cases/mariadb count bumped 13 -> 15.
@weicao weicao force-pushed the helen/t6-sediment-trap4-v1-20260519 branch from 0601fb4 to c35ac0f Compare May 19, 2026 05:57