docs: add cluster Running gate budget sizing guide by weicao · Pull Request #123 · apecloud/kubeblocks-addon-docs

weicao · 2026-05-13T11:56:32Z

Summary

New methodology doc addon-cluster-running-gate-budget-guide.md covering how to size test-runner T1 wait_cluster_running budget based on env P99 (not product normal start time), and how to classify the failure layer when the budget is exceeded.

Body (generic methodology, version-agnostic, no engine binding):

§1: Anti-patterns (using product normal time / using arbitrary huge number)
§2: 5 hard rules — env-variable budget, P99+50% default from env sample distribution, 4 observation outputs on timeout, generic exit reason for upper-layer classification, 3-level budget nesting (caller > helper > kubectl --request-timeout)
§2 Rule C: 7-row layer classification hint table
§2 Rule D: "exceeded budget is NOT product RED until classified"
§3: 6-point PR review checklist
§4: 2 anti-pattern vs correct-pattern pairs
§5: relation to addon-bounded-eventual-convergence-guide.md, addon-postready-bounded-timeout-failure-classification-guide.md, addon-test-acceptance-and-first-blocker-guide.md, addon-evidence-discipline-guide.md

Appendix A is OceanBase enterprise addon postreadyfix3 (1200s false-positive due to env-slow-start cold pull) → postreadyfix4 (1800s default after env P99+50% calibration) case. Explicit boundary: 1800s default only calibrated for idc4 vcluster + apecloud registry + apelocal-hostpath SC scope; other env must independently calibrate. 6 subsequent samples T1 <= 4min, NOT product validation / NOT release-ready.

SKILL-INDEX.md updated: added entry under ### 5. 改造 runner / 工具链.

Test plan

Manual: methodology body has zero OB-specific commands
Manual: appendix has explicit boundary
Manual: cross-doc references resolve

🤖 Generated with Claude Code

Methodology body covers: - Why test-runner T1 budget must be sized by env P99 (not product normal) - 5 hard rules: env-variable budget, P99+50% default, 4 observation outputs on timeout, generic exit reason for upper-layer classification, 3-level budget nesting (caller > helper > kubectl --request-timeout) - 7-row layer classification hint table (env scheduler / env image-pull / env storage / env node-pressure / control-plane / product / k8s API entry) - 6-point PR review checklist - 2 anti-pattern pairs Appendix A is OceanBase enterprise addon postreadyfix3 (1200s false-positive) vs postreadyfix4 (1800s default after env P99+50% calibration) case with explicit "do not extrapolate to other env" boundary; 6 subsequent samples under same env scope had T1 <= 4min, NOT product validation.

weicao · 2026-05-13T17:19:25Z

Blocking for merge:

PR body still contains 🤖 Generated with [Claude Code].... Public PR body must not include AI/tool attribution.
The new guide intro has only 4 standard fields. Please add > **Affected by version skew**: ... after Applies to KB version.
This PR links to addon-postready-bounded-timeout-failure-classification-guide.md from PR docs: add postReady bounded timeout + failure classification guide #122, which is not on main yet. Either merge/rebase after docs: add postReady bounded timeout + failure classification guide #122 lands, or mark the cross-link as pending and keep this PR blocked until the target exists.

Content framing is otherwise in the right shape: budget by env distribution, classify before product RED, and appendix scope stays narrow.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: add cluster Running gate budget sizing guide#123

docs: add cluster Running gate budget sizing guide#123
weicao wants to merge 1 commit into
mainfrom
feature/addon-cluster-running-gate-budget

weicao commented May 13, 2026

Uh oh!

weicao commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

weicao commented May 13, 2026

Summary

Test plan

Uh oh!

weicao commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant