Revert "ci(images): extend base image build timeout (#3881)"#3928
Revert "ci(images): extend base image build timeout (#3881)"#3928jyaunches wants to merge 2 commits into
Conversation
This reverts commit 3a473cf. Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
E2E Advisor RecommendationRequired E2E: Dispatch hint: Auto-dispatched E2E: Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
Dispatch hint
|
PR Review AdvisorRecommendation: blocked This is an automated advisory review. A human maintainer must make the final merge decision. Limitations: Review used trusted supplied PR metadata, diff, and read-only inspection of .github/workflows/base-image.yaml and Dockerfile; no tests, Docker builds, workflow commands, package-manager commands, or PR scripts were executed.; Current CI was still pending/queued/in-progress for head 2ca60b1, so runtime validation of the restored timeout and Dockerfile Patch 5 behavior is unavailable.; Issue #3855 details were not included, so the likelihood of reintroducing the original base-image timeout failure cannot be fully assessed from supplied evidence.; The latest E2E Advisor check is still in progress for the current head; an older E2E Advisor comment exists but may not cover the newer Dockerfile commit.; CodeRabbit is still processing the latest changes and has selected Dockerfile for review, so automated review thread state may change after this advisory result. Full advisor summaryPR Review AdvisorBase: Blocked by pending CI/merge state and incomplete validation for a trusted workflow timeout revert plus Dockerfile runtime patch behavior change. Gate status
🔴 Blockers
🟡 Warnings
🔵 Suggestions
Acceptance coverage
Security review
Test / E2E status
✅ What looks good
Review completeness
|
📝 WalkthroughWalkthroughThis PR shortens the base-image workflow job timeouts from 45 to 15 minutes and makes the Dockerfile's Patch 5 replacement for DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS conditional so the image build doesn't fail if the constant is absent. ChangesBuild timeout configuration & Dockerfile patch
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
OpenClaw 2026.5.18 no longer exposes DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS, so the Dockerfile patch should skip Patch 5 instead of failing the sandbox image build.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Dockerfile`:
- Around line 222-228: The grep/sed patch for
DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS should remain but add an
environment-variable fallback for OpenClaw 2026.5.18+ by setting
OPENCLAW_HANDSHAKE_TIMEOUT_MS to 60000 in the Dockerfile; keep the existing
conditional that logs "handshake-timeout constant not found" and, when that
branch is hit, ensure the Dockerfile exports ENV
OPENCLAW_HANDSHAKE_TIMEOUT_MS=60000 so the 60s timeout is applied when the
hardcoded DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS symbol is absent. Also ensure you
validate the change with the nightly E2E workflow mentioned
(cloud-e2e,sandbox-survival-e2e,hermes-e2e,rebuild-openclaw-e2e).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b9b0e38b-51a4-4b50-9e75-fb3219a41205
📒 Files selected for processing (1)
Dockerfile
| hto_files="$(grep -RIlE --include='*.js' 'DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4' "$OC_DIST" || true)"; \ | ||
| if [ -n "$hto_files" ]; then \ | ||
| printf '%s\n' "$hto_files" | xargs sed -i -E 's|DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4|DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4|g'; \ | ||
| if grep -REq --include='*.js' 'DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4' "$OC_DIST"; then echo "ERROR: Patch 5 left a 1e4 constant" >&2; exit 1; fi; \ | ||
| else \ | ||
| echo "INFO: handshake-timeout constant not found; OpenClaw no longer needs Patch 5"; \ | ||
| fi |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Search for handshake timeout constants in OpenClaw to detect renames
# Check if the old constant name exists
rg -n 'DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS' --type js || echo "Old constant not found (expected)"
# Search for similar timeout constants that might be the renamed version
rg -n 'HANDSHAKE.*TIMEOUT|PREAUTH.*TIMEOUT|CONNECT.*TIMEOUT' --type js -A2 -B2 || echo "No related timeout constants found"
# Check OpenClaw version
openclaw --versionRepository: NVIDIA/NemoClaw
Length of output: 196
🌐 Web query:
OpenClaw 2026.5.18 release notes handshake timeout configuration
💡 Result:
In OpenClaw, the WebSocket handshake timeout is a configurable setting that governs how long the gateway waits for a client to complete the pre-authentication handshake [1][2]. As of the OpenClaw 2026.5.18 release, the default handshake timeout is 15,000 milliseconds (15 seconds) [1][2]. This setting was previously adjusted from a highly aggressive 3-second default (which caused spurious connection failures on slower hardware or during high event-loop load) to the current 15-second default to better accommodate cold-start scenarios [1][3][4]. To configure this value, you can add the handshakeTimeoutMs field to the gateway section of your OpenClaw configuration file (typically located at ~/.openclaw/openclaw.json) [2]: { "gateway": { "handshakeTimeoutMs": 30000 } } The accepted range for this configuration value is 1,000 to 120,000 milliseconds [1]. Additionally, the environment variable OPENCLAW_HANDSHAKE_TIMEOUT_MS can be used to override this setting for one-off service or shell sessions, as it takes precedence over the configuration file [1][2]. Note that the official documentation recommends addressing the underlying causes of startup or event-loop stalls, such as plugin loading times or hardware constraints, before relying on this timeout knob [2].
Citations:
- 1: gateway: add gateway.handshakeTimeoutMs config and raise default to 15s openclaw/openclaw#51503
- 2: https://docs.openclaw.ai/gateway/configuration
- 3: bug: DEFAULT_HANDSHAKE_TIMEOUT_MS too aggressive (3s) — causes spurious disconnects openclaw/openclaw#47931
- 4: [Bug]:
openclaw logs --followfails after 2026.3.12 upgrade with handshake timeout while gateway remains healthy openclaw/openclaw#44714
Update the patch strategy for OpenClaw 2026.5.18+: the hardcoded constant no longer exists; use environment variables instead.
The DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS constant has been removed from OpenClaw 2026.5.18, so the conditional logic correctly skips the patch when not found. However, OpenClaw changed its configuration model entirely—the timeout is now set via the OPENCLAW_HANDSHAKE_TIMEOUT_MS environment variable or config file, not a hardcoded constant.
The build will succeed but the 60-second timeout won't be applied with OpenClaw 2026.5.18+, reverting to the new default of 15 seconds. If the 60-second timeout is critical to prevent handshake failures under load (as noted in lines 208–212), set it via environment variable in the Dockerfile instead:
ENV OPENCLAW_HANDSHAKE_TIMEOUT_MS=60000Per coding guidelines, validate with E2E tests:
gh workflow run nightly-e2e.yaml --ref <branch> -f jobs=cloud-e2e,sandbox-survival-e2e,hermes-e2e,rebuild-openclaw-e2e
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Dockerfile` around lines 222 - 228, The grep/sed patch for
DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS should remain but add an
environment-variable fallback for OpenClaw 2026.5.18+ by setting
OPENCLAW_HANDSHAKE_TIMEOUT_MS to 60000 in the Dockerfile; keep the existing
conditional that logs "handshake-timeout constant not found" and, when that
branch is hit, ensure the Dockerfile exports ENV
OPENCLAW_HANDSHAKE_TIMEOUT_MS=60000 so the 60s timeout is applied when the
hardcoded DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS symbol is absent. Also ensure you
validate the change with the nightly E2E workflow mentioned
(cloud-e2e,sandbox-survival-e2e,hermes-e2e,rebuild-openclaw-e2e).
Selective E2E Results — ✅ All requested jobs passedRun: 26191022422
|
Summary
ci(images): extend base image build timeout).github/workflows/base-image.yamltimeout values to the pre-ci(images): extend base image build timeout #3881 stateVerification
git revert --signoff --no-edit 3a473cf06afc9baa88cc8c13195f8fc38784d8aegit diff --check HEAD^ HEADSummary by CodeRabbit
Note: No user-facing changes; updates are limited to build and infrastructure processes.