Revert "ci(images): extend base image build timeout (#3881)" #3928

Closed
jyaunches wants to merge 2 commits into main from revert/pr-3881-base-image-timeout

jyaunches commented May 20, 2026

Summary

Verification

  • git revert --signoff --no-edit 3a473cf06afc9baa88cc8c13195f8fc38784d8ae
  • git diff --check HEAD^ HEAD

Summary by CodeRabbit

  • Chores
    • Reduced CI build timeout limits for base-image builds to speed up workflow throughput.
    • Made image build patching more resilient: if certain optional build-time values are absent, the build logs and continues instead of failing.

Note: No user-facing changes; updates are limited to build and infrastructure processes.

This reverts commit 3a473cf.

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

github-actions Bot commented May 20, 2026

E2E Advisor Recommendation

Required E2E: test-e2e-sandbox, test-e2e-gateway-isolation, sandbox-operations-e2e, cloud-inference-e2e
Optional E2E: network-policy-e2e, messaging-compatible-endpoint-e2e, test-non-root-sandbox-smoke

Dispatch hint: sandbox-operations-e2e,cloud-inference-e2e

Auto-dispatched E2E: sandbox-operations-e2e, cloud-inference-e2e via nightly-e2e.yaml at 2ca60b1040b824568ec718a23fb998294dad38df (nightly run)

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • test-e2e-sandbox (medium): Builds the production sandbox image from Dockerfile through the self-hosted PR image pipeline and runs the in-container sandbox E2E smoke suite, covering OpenClaw CLI availability, plugin install, blueprint runner behavior, and basic image integrity after the Dockerfile patch change.
  • test-e2e-gateway-isolation (medium): Validates gateway/sandbox isolation and entrypoint hardening on the production image. This is merge-blocking because the modified Dockerfile patch block is adjacent to OpenClaw proxy/SSRF and install-path security patches in the sandbox image.
  • sandbox-operations-e2e (high): Exercises real sandbox lifecycle operations including list, connect, status, logs, destroy, gateway recovery, registry rebuild, process recovery, and cross-sandbox isolation. This directly covers the handshake/connect timeout behavior referenced by the modified Patch 5 block.
  • cloud-inference-e2e (high): Runs a real cloud-backed assistant flow through inference.local and the OpenClaw runtime. Required because making the OpenClaw handshake timeout patch optional could regress live assistant turns or gateway/agent connection setup under CI load.

Optional E2E

  • network-policy-e2e (high): Useful adjacent confidence for the Dockerfile OpenClaw patch block because nearby patches alter fetch/proxy SSRF handling and the L7 proxy trust boundary, although this PR does not directly edit those commands.
  • messaging-compatible-endpoint-e2e (high): Exercises openclaw agent --json through a compatible endpoint and proxy rewrite path, providing extra signal that OpenClaw runtime networking still works when the Dockerfile patch set is applied or skipped as expected.
  • test-non-root-sandbox-smoke (low): Additional image-level smoke coverage for the production Dockerfile to confirm the sandbox still works for the non-root runtime user after image patch changes.

New E2E recommendations

  • openclaw-handshake-timeout-patch (high): Existing E2E jobs can detect broad connect/inference regressions, but there is no focused E2E that proves Patch 5 is applied when OpenClaw still has DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS, intentionally skipped only when the constant is truly gone, and not silently skipped on an upstream rename while the 10s limit remains elsewhere.
    • Suggested test: Add an OpenClaw handshake-timeout patch E2E that builds the sandbox image, inspects the installed OpenClaw dist for the effective preauth handshake timeout behavior, and runs several concurrent openclaw agent/connect attempts against a deliberately slow startup path.
  • base-image-publish-time-budget (medium): The base-image workflow timeout was reduced from 45 to 15 minutes for multi-arch builds, but PR E2E does not dry-run the base-image publish workflow or verify that amd64+arm64 base image builds fit the new timeout budget.
    • Suggested test: Add a workflow-dispatchable base-image build validation job that performs a no-push Docker Buildx build for Dockerfile.base and agents/hermes/Dockerfile.base under the intended timeout budget.
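
The base-image time-budget check suggested above could be approximated locally with a small wrapper. This is a minimal sketch, not the project's workflow: `run_with_budget` is a hypothetical helper name, it assumes GNU coreutils `timeout` is available, and the build context in the commented example is an assumption.

```shell
# Hypothetical helper: run a command under a wall-clock budget (in seconds).
# Relies on GNU coreutils `timeout`, which exits with status 124 when the
# budget is exceeded.
run_with_budget() {
  local budget_s="$1"
  shift
  local rc=0
  timeout "$budget_s" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "ERROR: exceeded ${budget_s}s budget: $*" >&2
  fi
  return "$rc"
}

# Example (not executed here): a no-push multi-arch build under a 15-minute
# budget. The build context "." is an assumption.
# run_with_budget 900 docker buildx build \
#   --platform linux/amd64,linux/arm64 -f Dockerfile.base .
```

A non-zero status other than 124 is passed through unchanged, so an ordinary build failure remains distinguishable from a budget overrun.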

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: sandbox-operations-e2e,cloud-inference-e2e


github-actions Bot commented May 20, 2026

PR Review Advisor

Recommendation: blocked
Confidence: high
Analyzed HEAD: 2ca60b1040b824568ec718a23fb998294dad38df
Findings: 1 blocker(s), 7 warning(s), 1 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: Review used trusted supplied PR metadata, diff, and read-only inspection of .github/workflows/base-image.yaml and Dockerfile; no tests, Docker builds, workflow commands, package-manager commands, or PR scripts were executed. Current CI was still pending/queued/in-progress for head 2ca60b1, so runtime validation of the restored timeout and Dockerfile Patch 5 behavior is unavailable. Issue #3855 details were not included, so the likelihood of reintroducing the original base-image timeout failure cannot be fully assessed from supplied evidence. The latest E2E Advisor check is still in progress for the current head; an older E2E Advisor comment exists but may not cover the newer Dockerfile commit. CodeRabbit is still processing the latest changes and has selected Dockerfile for review, so automated review thread state may change after this advisory result.

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 2ca60b1040b824568ec718a23fb998294dad38df
Recommendation: blocked
Confidence: high

Blocked by pending CI/merge state and incomplete validation for a trusted workflow timeout revert plus Dockerfile runtime patch behavior change.

Gate status

  • CI: pending — Trusted GraphQL status rollup for 2ca60b1 shows multiple pending/in-progress/queued contexts, including E2E recommendation, PR review advisor, CodeQL, unit-vitest-linux, checks, ShellCheck SARIF, build-sandbox-images, build-sandbox-images-arm64, and CodeRabbit.
  • Mergeability: fail — Trusted PR metadata reports mergeStateStatus=BLOCKED and REST mergeable_state=blocked for head 2ca60b1.
  • Review threads: warning — GraphQL reviewThreads.nodes is empty and reviewDecision is APPROVED, but CodeRabbit is still processing the latest Dockerfile change and the CodeRabbit status context is PENDING.
  • Risky code tested: fail — Risky area workflow/enforcement was detected; changed files include .github/workflows/base-image.yaml and Dockerfile, but no test files changed and relevant current-head build/runtime checks are still queued or in progress.

🔴 Blockers

  • Required gates are not complete for the current head SHA: The PR cannot be considered merge-ready while CI and mergeability gates are unresolved for 2ca60b1.
    • Recommendation: Wait for all required checks, especially CodeQL, unit-vitest-linux, build-sandbox-images, build-sandbox-images-arm64, PR review advisor, E2E recommendation, and CodeRabbit, to complete successfully for the current head SHA and resolve the BLOCKED merge state.
    • Evidence: Trusted context: ci.status=pending with 10 pending contexts; GraphQL mergeStateStatus=BLOCKED; REST mergeable_state=blocked.

🟡 Warnings

  • Standard base-image timeout reverts to the shorter limit that PR #3881 ("ci(images): extend base image build timeout") was meant to fix (.github/workflows/base-image.yaml:45): The workflow reduces the multi-architecture build-and-push timeout from 45 minutes back to 15 minutes. Linked PR #3881 states the longer timeout was added so Docker builds have more time before GitHub Actions cancels them, and references fixing issue #3855 ([Multi Platforms][Onboard] openclaw tui rejects onboard-generated config: "channels.openclaw-weixin: unknown channel id").
  • Hermes base-image timeout also reverts to the shorter limit (.github/workflows/base-image.yaml:93): The Hermes base-image job also builds linux/amd64 and linux/arm64 and pushes to GHCR, but its timeout is restored from 45 minutes to 15 minutes.
    • Recommendation: Require current-head evidence that build-and-push-hermes completes within 15 minutes, preferably from build-sandbox-images-arm64/build-sandbox-images or an equivalent trusted dry-run/build workflow.
    • Evidence: Diff changes build-and-push-hermes timeout-minutes from 45 to 15; linked PR #3881 ("ci(images): extend base image build timeout") listed a separate change: "Increase the build-and-push-hermes job timeout from 15 minutes to 45 minutes."
  • Handshake timeout patch now skips silently when the exact OpenClaw constant is absent (Dockerfile:224): The Dockerfile changes Patch 5 from fail-close to tolerant: if grep does not find DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4, the build continues with an INFO message. That may be correct if OpenClaw removed the constant, but it can also hide a syntax-only drift where the timeout still exists and remains too short.
    • Recommendation: Add or require current-head build/runtime evidence proving the new OpenClaw bundle no longer needs Patch 5, or tighten the detection to distinguish a removed timeout from a renamed/refactored still-active 10s cap.
    • Evidence: Diff changes grep ... DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4 to grep ... || true and emits INFO: handshake-timeout constant not found; OpenClaw no longer needs Patch 5 instead of failing the build.
  • Risky workflow and Docker runtime changes lack completed validation for this head SHA: Unit tests cannot prove that multi-arch base-image publishing completes within restored timeouts or that OpenClaw runtime handshake behavior remains stable when Patch 5 is skipped.
    • Recommendation: Use current-head CI/build results as validation and consider adding a PR-safe base-image dry-run plus a runtime smoke test for OpenClaw connection startup when the timeout constant is absent.
    • Evidence: Trusted testDepth verdict=e2e_required; GraphQL shows E2E recommendation IN_PROGRESS and build-sandbox-images/build-sandbox-images-arm64 QUEUED for 2ca60b1.
  • Touched workflow is a trusted publish boundary with package write permission (.github/workflows/base-image.yaml:31): The workflow publishes sandbox base images to GHCR using secrets.GITHUB_TOKEN and packages: write. This PR does not expand the permission, trigger, or token usage, but workflow edits in this boundary require careful review.
    • Recommendation: Confirm the change remains limited to intended timeout values and that the workflow only runs in trusted contexts. Keep permissions minimal and avoid expanding publish conditions.
    • Evidence: Workflow permissions include contents: read and packages: write; login uses password: ${{ secrets.GITHUB_TOKEN }}; jobs are gated with if: github.repository == 'NVIDIA/NemoClaw'.
  • Dockerfile change expands the PR beyond the stated timeout-only revert (Dockerfile:224): The PR body says it reverts PR #3881 ("ci(images): extend base image build timeout") and restores .github/workflows/base-image.yaml timeout values, but the current head also changes Dockerfile OpenClaw patch behavior.
    • Recommendation: Update the PR description and verification notes to cover the Dockerfile Patch 5 behavior change, including why fail-close is no longer appropriate for this OpenClaw timeout patch.
    • Evidence: PR body summary only mentions reverting PR #3881 ("ci(images): extend base image build timeout") and restoring .github/workflows/base-image.yaml timeout values; changedFiles includes Dockerfile and diff changes the Patch 5 grep failure behavior.
  • Dockerfile overlaps many active PRs (Dockerfile): The Dockerfile is actively modified by several open PRs, including security/runtime-related work. This increases drift and conflict risk for OpenClaw patching and sandbox image behavior.
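
One way to act on the "tighten the detection" recommendation for Patch 5 is to separate three cases instead of two. The following is a rough sketch under stated assumptions: `classify_handshake_patch` is a hypothetical function, it assumes GNU grep, and it only catches drift where the constant name survives. A full rename that keeps a 10s cap elsewhere would still be reported as absent.

```shell
# Hypothetical classifier for the Patch 5 decision. Prints one of:
#   patch  - the exact "NAME = 1e4" assignment exists and can be rewritten
#   drift  - the constant name still appears, but not in the expected form
#   absent - the constant name does not appear in any .js file
classify_handshake_patch() {
  local dist="$1"
  local name='DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS'
  if grep -RIqE --include='*.js' "${name} = 1e4" "$dist"; then
    echo patch
  elif grep -RIqF --include='*.js' "$name" "$dist"; then
    echo drift
  else
    echo absent
  fi
}
```

In a build step, `drift` could be treated as a hard failure while `absent` logs the INFO message, which keeps the fail-closed behavior for syntax-only changes.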

🔵 Suggestions

  • Several third-party Docker actions remain tag-pinned rather than SHA-pinned (.github/workflows/base-image.yaml:51): actions/checkout is pinned to a full commit SHA, but Docker actions are referenced by version tags. This is not introduced by the PR, but the workflow is a trusted-code boundary.
    • Recommendation: Consider pinning docker/setup-qemu-action, docker/setup-buildx-action, docker/login-action, docker/metadata-action, and docker/build-push-action to full commit SHAs in a separate hardening change.
    • Evidence: Workflow uses docker/setup-qemu-action@v4, docker/setup-buildx-action@v4, docker/login-action@v3, docker/metadata-action@v6, and docker/build-push-action@v6.
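
A quick way to list the tag-pinned actions mentioned in this suggestion is to filter `uses:` references whose ref is not a full commit SHA. This is an illustrative sketch only: `list_unpinned_actions` is a hypothetical helper, and it assumes unquoted `uses:` values with optional trailing `# comment` text.

```shell
# Hypothetical check: print every `uses:` reference whose ref is not a
# 40-character hex commit SHA. Reads workflow YAML on stdin.
list_unpinned_actions() {
  grep -E '^[[:space:]]*(-[[:space:]]+)?uses:[[:space:]]*[^[:space:]]+@' \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?uses:[[:space:]]*//; s/[[:space:]]+#.*$//' \
    | grep -vE '@[0-9a-f]{40}$' || true
}

# Example (not executed here):
# list_unpinned_actions < .github/workflows/base-image.yaml
```

Local actions referenced by path (no `@ref`) are ignored by the first filter, so only remote action references are reported.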

Acceptance coverage

Security review

  • pass — 1. Secrets and Credentials: No hardcoded secrets, API keys, passwords, or credential material are added. The workflow continues to reference secrets.GITHUB_TOKEN through GitHub Actions secret handling for GHCR login.
  • pass — 2. Input Validation and Data Sanitization: No new untrusted input parser is added. Existing workflow_dispatch openclaw_version validation remains an allowlisted numeric dotted-version grep before being passed as a build arg.
  • pass — 3. Authentication and Authorization: No application endpoint, user authentication, or authorization logic is changed. Existing publish jobs remain gated to github.repository == 'NVIDIA/NemoClaw' and use GitHub workflow permissions.
  • warning — 4. Dependencies and Third-Party Libraries: No dependency version is changed by this PR, but the touched trusted workflow still uses several Docker actions pinned only to major tags rather than full commit SHAs.
  • warning — 5. Error Handling and Logging: The Dockerfile Patch 5 path changes from fail-close to informational logging when the OpenClaw timeout constant is absent. This may be intentional for a removed constant, but can hide unexpected upstream drift unless validated.
  • pass — 6. Cryptography and Data Protection: Not applicable — no cryptographic operations or data-protection code are changed.
  • warning — 7. Configuration and Security Headers: The workflow has package write permission and publishes sandbox base images; the PR changes operational configuration by reducing timeouts. No trigger or permission expansion is introduced, but the trusted publish workflow requires careful review because failed/stale base image builds can affect sandbox supply-chain reliability.
  • warning — 8. Security Testing: No security or workflow validation tests are added, and relevant current-head build/runtime checks have not completed. For sandbox base-image publishing and Dockerfile OpenClaw patching, real CI/build execution is needed to validate behavior.
  • warning — 9. Holistic Security Posture: No direct sandbox escape, SSRF bypass, policy bypass, credential leakage, or blueprint tampering is newly introduced by the diff. However, reducing base-image build timeouts and skipping a runtime timeout patch on pattern absence may reduce reliability of security-sensitive sandbox image rebuilds and OpenClaw startup behavior if not validated.
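
For readers unfamiliar with the "allowlisted numeric dotted-version" validation mentioned in item 2, the idea can be sketched as follows. The workflow's actual pattern is not shown in this review, so `is_dotted_numeric_version` and its rules are assumptions for illustration.

```shell
# Hypothetical allowlist check: accept only digit groups separated by single
# dots, e.g. "2026.5.18". The exact pattern used by the workflow is not shown
# in this review, so these rules are an assumption.
is_dotted_numeric_version() {
  case "$1" in
    '' | *[!0-9.]* | .* | *. | *..*) return 1 ;;  # empty, stray chars, bad dots
    *) return 0 ;;
  esac
}
```

Using `case` rather than a line-oriented `grep` means a value containing an embedded newline is rejected as a whole instead of being matched line by line.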

Test / E2E status

  • Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: .github/workflows/base-image.yaml changes trusted GHCR sandbox base-image build timeouts and Dockerfile changes OpenClaw patch application behavior. Unit tests cannot prove multi-architecture Docker builds complete within 15 minutes or that runtime handshake behavior remains correct when Patch 5 is skipped.
  • E2E Advisor: ambiguous
  • Required E2E jobs: E2E recommendation, build-sandbox-images, build-sandbox-images-arm64
  • Missing for analyzed SHA: E2E recommendation is IN_PROGRESS for 2ca60b1040b824568ec718a23fb998294dad38df, build-sandbox-images is QUEUED for 2ca60b1040b824568ec718a23fb998294dad38df, build-sandbox-images-arm64 is QUEUED for 2ca60b1040b824568ec718a23fb998294dad38df

✅ What looks good

  • The workflow diff is narrowly scoped to timeout-minutes values and does not expand triggers, repository guards, publish targets, or token expressions.
  • The base-image workflow retains contents: read and a repository guard before publishing images.
  • actions/checkout is pinned to a full commit SHA.
  • The Dockerfile Patch 5 change is localized and emits an INFO message when the expected OpenClaw constant is absent.
  • No hardcoded secret values or new credential files are introduced.
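
The "narrowly scoped to timeout-minutes" observation can be checked mechanically against a unified diff. This is a sketch under assumptions: `diff_only_touches_timeouts` is a hypothetical guard, and it would need to be fed only the workflow file's diff, since this PR also touches the Dockerfile.

```shell
# Hypothetical guard: read a unified diff on stdin and succeed only if every
# added or removed line (ignoring the ---/+++ file headers) sets
# `timeout-minutes` to an integer. An empty diff also succeeds.
diff_only_touches_timeouts() {
  ! grep -E '^[+-]' \
    | grep -vE '^(\+\+\+|---)( |$)' \
    | grep -qvE '^[+-][[:space:]]*timeout-minutes:[[:space:]]*[0-9]+[[:space:]]*$'
}

# Example (not executed here):
# git diff HEAD^ HEAD -- .github/workflows/base-image.yaml | diff_only_touches_timeouts
```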

Review completeness

  • Review used trusted supplied PR metadata, diff, and read-only inspection of .github/workflows/base-image.yaml and Dockerfile; no tests, Docker builds, workflow commands, package-manager commands, or PR scripts were executed.
  • Current CI was still pending/queued/in-progress for head 2ca60b1, so runtime validation of the restored timeout and Dockerfile Patch 5 behavior is unavailable.
  • Issue #3855 ([Multi Platforms][Onboard] openclaw tui rejects onboard-generated config: "channels.openclaw-weixin: unknown channel id") details were not included, so the likelihood of reintroducing the original base-image timeout failure cannot be fully assessed from supplied evidence.
  • The latest E2E Advisor check is still in progress for the current head; an older E2E Advisor comment exists but may not cover the newer Dockerfile commit.
  • CodeRabbit is still processing the latest changes and has selected Dockerfile for review, so automated review thread state may change after this advisory result.
  • Human maintainer review required: yes


coderabbitai Bot commented May 20, 2026

📝 Walkthrough

This PR shortens the base-image workflow job timeouts from 45 to 15 minutes and makes the Dockerfile's Patch 5 replacement for DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS conditional so the image build doesn't fail if the constant is absent.

Changes

Build timeout configuration & Dockerfile patch

| Layer / File(s) | Summary |
| --- | --- |
| Build and Hermes job timeout reduction (.github/workflows/base-image.yaml) | The timeout-minutes for build-and-push (line 47) and build-and-push-hermes (line 97) are reduced from 45 to 15. |
| Conditional handshake timeout patch (Dockerfile) | Patch 5 now runs a non-failing grep for DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4, applies the 1e4 -> 6e4 sed replacement only when matches exist, and logs/skips the replacement if the constant is absent instead of failing the build. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#3881: Prior PR that modified the same workflow timeout settings for the same jobs.

Suggested labels

bug

Poem

🐰 In CI fields the pipelines leap,
From forty-five to fifteen they keep,
A gentle patch that checks and skips,
So builds won't fall on missing bits,
Hop on—these tweaks keep things neat. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title states a revert of a previous PR, but the actual changes show a timeout reduction from 45 to 15 minutes (undoing the earlier 15→45 extension) plus a Dockerfile change for OpenClaw timeout handling. | Clarify whether this is purely a revert or includes the Dockerfile fix. Consider: 'Revert base image timeout extension and handle missing OpenClaw constant' if both changes are intentional. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


OpenClaw 2026.5.18 no longer exposes DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS, so the Dockerfile patch should skip Patch 5 instead of failing the sandbox image build.

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Dockerfile`:
- Around line 222-228: The grep/sed patch for
DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS should remain but add an
environment-variable fallback for OpenClaw 2026.5.18+ by setting
OPENCLAW_HANDSHAKE_TIMEOUT_MS to 60000 in the Dockerfile; keep the existing
conditional that logs "handshake-timeout constant not found" and, when that
branch is hit, ensure the Dockerfile exports ENV
OPENCLAW_HANDSHAKE_TIMEOUT_MS=60000 so the 60s timeout is applied when the
hardcoded DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS symbol is absent. Also ensure you
validate the change with the nightly E2E workflow mentioned
(cloud-e2e,sandbox-survival-e2e,hermes-e2e,rebuild-openclaw-e2e).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b9b0e38b-51a4-4b50-9e75-fb3219a41205

📥 Commits

Reviewing files that changed from the base of the PR and between e86399c and 2ca60b1.

📒 Files selected for processing (1)
  • Dockerfile

Comment thread Dockerfile
Comment on lines +222 to +228
hto_files="$(grep -RIlE --include='*.js' 'DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4' "$OC_DIST" || true)"; \
if [ -n "$hto_files" ]; then \
printf '%s\n' "$hto_files" | xargs sed -i -E 's|DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4|DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 6e4|g'; \
if grep -REq --include='*.js' 'DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS = 1e4' "$OC_DIST"; then echo "ERROR: Patch 5 left a 1e4 constant" >&2; exit 1; fi; \
else \
echo "INFO: handshake-timeout constant not found; OpenClaw no longer needs Patch 5"; \
fi
⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for handshake timeout constants in OpenClaw to detect renames

# Check if the old constant name exists
rg -n 'DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS' --type js || echo "Old constant not found (expected)"

# Search for similar timeout constants that might be the renamed version
rg -n 'HANDSHAKE.*TIMEOUT|PREAUTH.*TIMEOUT|CONNECT.*TIMEOUT' --type js -A2 -B2 || echo "No related timeout constants found"

# Check OpenClaw version
openclaw --version

Repository: NVIDIA/NemoClaw

Length of output: 196


🌐 Web query:

OpenClaw 2026.5.18 release notes handshake timeout configuration

💡 Result:

In OpenClaw, the WebSocket handshake timeout is a configurable setting that governs how long the gateway waits for a client to complete the pre-authentication handshake [1][2]. As of the OpenClaw 2026.5.18 release, the default handshake timeout is 15,000 milliseconds (15 seconds) [1][2]. This setting was previously adjusted from a highly aggressive 3-second default (which caused spurious connection failures on slower hardware or during high event-loop load) to the current 15-second default to better accommodate cold-start scenarios [1][3][4].

To configure this value, you can add the handshakeTimeoutMs field to the gateway section of your OpenClaw configuration file (typically located at ~/.openclaw/openclaw.json) [2]:

    { "gateway": { "handshakeTimeoutMs": 30000 } }

The accepted range for this configuration value is 1,000 to 120,000 milliseconds [1]. Additionally, the environment variable OPENCLAW_HANDSHAKE_TIMEOUT_MS can be used to override this setting for one-off service or shell sessions, as it takes precedence over the configuration file [1][2].

Note that the official documentation recommends addressing the underlying causes of startup or event-loop stalls, such as plugin loading times or hardware constraints, before relying on this timeout knob [2].

Citations:


Update the patch strategy for OpenClaw 2026.5.18+: the hardcoded constant no longer exists; use environment variables instead.

The DEFAULT_PREAUTH_HANDSHAKE_TIMEOUT_MS constant has been removed from OpenClaw 2026.5.18, so the conditional logic correctly skips the patch when not found. However, OpenClaw changed its configuration model entirely—the timeout is now set via the OPENCLAW_HANDSHAKE_TIMEOUT_MS environment variable or config file, not a hardcoded constant.

The build will succeed but the 60-second timeout won't be applied with OpenClaw 2026.5.18+, reverting to the new default of 15 seconds. If the 60-second timeout is critical to prevent handshake failures under load (as noted in lines 208–212), set it via environment variable in the Dockerfile instead:

ENV OPENCLAW_HANDSHAKE_TIMEOUT_MS=60000

Per coding guidelines, validate with E2E tests:

gh workflow run nightly-e2e.yaml --ref <branch> -f jobs=cloud-e2e,sandbox-survival-e2e,hermes-e2e,rebuild-openclaw-e2e
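
If the environment-variable route is taken, a small guard in the container entrypoint could make the effective value explicit and reject out-of-range settings early. This is a hedged sketch: `resolve_handshake_timeout_ms` is a hypothetical helper, and both the variable name and the 1000-120000 ms range come from the web query result quoted earlier in this comment, not from verified OpenClaw source.

```shell
# Hypothetical entrypoint helper: resolve the handshake timeout, defaulting
# to 60000 ms. The variable name and the 1000-120000 ms range are taken from
# the web query result quoted in this review and have not been verified
# against OpenClaw itself.
resolve_handshake_timeout_ms() {
  local value="${OPENCLAW_HANDSHAKE_TIMEOUT_MS:-60000}"
  case "$value" in
    '' | *[!0-9]*)
      echo "ERROR: not an integer: $value" >&2
      return 1
      ;;
  esac
  if [ "$value" -lt 1000 ] || [ "$value" -gt 120000 ]; then
    echo "ERROR: out of range (1000-120000): $value" >&2
    return 1
  fi
  echo "$value"
}
```

The default of 60000 mirrors the 6e4 value that Patch 5 writes when the hardcoded constant is still present.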

@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 26191022422
Target ref: 2ca60b1040b824568ec718a23fb998294dad38df
Workflow ref: main
Requested jobs: sandbox-operations-e2e,cloud-inference-e2e
Summary: 2 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| cloud-inference-e2e | ✅ success |
| sandbox-operations-e2e | ✅ success |

ericksoa closed this May 20, 2026