docs: add cluster Running gate budget sizing guide #123
Open
weicao wants to merge 1 commit into
Methodology body covers:
- Why test-runner T1 budget must be sized by env P99 (not product normal)
- 5 hard rules: env-variable budget, P99+50% default, 4 observation outputs on timeout, generic exit reason for upper-layer classification, 3-level budget nesting (caller > helper > kubectl --request-timeout)
- 7-row layer classification hint table (env scheduler / env image-pull / env storage / env node-pressure / control-plane / product / k8s API entry)
- 6-point PR review checklist
- 2 anti-pattern pairs

Appendix A is the OceanBase enterprise addon postreadyfix3 (1200s false-positive) vs postreadyfix4 (1800s default after env P99+50% calibration) case, with an explicit "do not extrapolate to other env" boundary; 6 subsequent samples under the same env scope had T1 <= 4min, which is NOT product validation.
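For reviewers who want to see the "P99+50% default" rule as arithmetic, here is a minimal sketch. It assumes P99 is taken with the nearest-rank method over observed T1 durations for one env scope; the function names and the sample durations are hypothetical and are not taken from the guide or from any real run.

```python
import math


def p99_nearest_rank(samples):
    """Return the nearest-rank 99th percentile of a non-empty list of durations (seconds)."""
    if not samples:
        raise ValueError("need at least one observed duration")
    ordered = sorted(samples)
    # Integer form of ceil(0.99 * n), which avoids float rounding surprises.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def default_budget_seconds(samples, headroom=0.5):
    """Budget = env P99 plus headroom (50% by default), rounded up to whole seconds."""
    return math.ceil(p99_nearest_rank(samples) * (1 + headroom))


# Hypothetical observations for a single env scope, in seconds.
observed = [100, 200, 300, 400]
budget = default_budget_seconds(observed)
print(budget)
```

With so few samples the nearest-rank P99 is simply the slowest observation, so a budget derived this way should be recalibrated as more env samples arrive, and only for the env scope the samples came from.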
Blocking for merge:
Content framing is otherwise in the right shape: budget by env distribution, classify before product RED, and appendix scope stays narrow.
Summary
New methodology doc `addon-cluster-running-gate-budget-guide.md` covering how to size the test-runner T1 `wait_cluster_running` budget based on env P99 (not product normal start time), and how to classify the failure layer when the budget is exceeded.

Body (generic methodology, version-agnostic, no engine binding):

`--request-timeout`)

`addon-bounded-eventual-convergence-guide.md`, `addon-postready-bounded-timeout-failure-classification-guide.md`, `addon-test-acceptance-and-first-blocker-guide.md`, `addon-evidence-discipline-guide.md`

Appendix A is the OceanBase enterprise addon postreadyfix3 (1200s false-positive due to env-slow-start cold pull) → postreadyfix4 (1800s default after env P99+50% calibration) case. Explicit boundary: the 1800s default is only calibrated for the idc4 vcluster + apecloud registry + apelocal-hostpath SC scope; other envs must calibrate independently. 6 subsequent samples had T1 <= 4min, which is NOT product validation / NOT release-ready.
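The three-level budget nesting rule (caller > helper > kubectl `--request-timeout`) can be checked mechanically before a run. The sketch below reads the rule as "each outer budget is strictly larger than the one it wraps"; that reading, the function name, and the example values are assumptions for illustration, not the guide's own implementation.

```python
def check_budget_nesting(caller_s, helper_s, request_timeout_s):
    """Return a list of violations; an empty list means caller > helper > request timeout."""
    problems = []
    if request_timeout_s <= 0:
        problems.append("kubectl --request-timeout must be positive")
    if helper_s <= request_timeout_s:
        problems.append("helper budget must exceed kubectl --request-timeout")
    if caller_s <= helper_s:
        problems.append("caller budget must exceed helper budget")
    return problems


# Hypothetical values in seconds: the outer caller leaves room for the helper,
# and the helper leaves room for each individual kubectl request.
print(check_budget_nesting(caller_s=1800, helper_s=1740, request_timeout_s=30))
print(check_budget_nesting(caller_s=1800, helper_s=1800, request_timeout_s=30))
```

Keeping each inner budget strictly smaller means the innermost layer times out first, so the helper can still emit its observation outputs before the caller gives up.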
`SKILL-INDEX.md` updated: added entry under `### 5. 改造 runner / 工具链` (section 5, "Modify runner / toolchain").

Test plan
🤖 Generated with Claude Code