fix: fully detach manifest service starts so Daytona evals don't wedge (#676)#734
Open
ElegantLin wants to merge 1 commit into
Open
fix: fully detach manifest service starts so Daytona evals don't wedge (#676)#734ElegantLin wants to merge 1 commit into
ElegantLin wants to merge 1 commit into
Conversation
#676) ManifestEnvironment.provision() started declared services with `{cmd} > {log} 2>&1 &` — stdout/stderr redirected, but stdin left attached and no nohup. On Daytona, `exec` is a *session* command that only reports an exit code once nothing holds the session's streams, so a backgrounded service inheriting the session fds keeps the call "running" until the hard timeout cap (#670) — the manifest eval wedges (~1h) instead of starting. Docker is unaffected (`communicate()` returns when the foreground shell exits), so the bug is invisible locally and in CI. Same failure class as the service-hook wedge root-caused in smolclaws#85. Fix: start services with `nohup {cmd} </dev/null >{log} 2>&1 &` — all three fds redirected (the missing `</dev/null` is the operative change) plus nohup to detach from the session. No `disown`: it is a bash/zsh builtin and the Daytona DinD shell is `sh`/dash. The existing test only asserted the start command ends with `&`, which is why the missing stdin redirect slipped through. Added test_provision_fully_detaches_service_fds asserting nohup + all three fd redirections + no disown; verified to fail on the old `> {log} 2>&1 &` form (fail-then-pass). Closes #676 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ManifestEnvironment.provision()started declared services with{cmd} > {log} 2>&1 &— stdout/stderr redirected, but stdin left attached and nonohup.On Daytona,
execruns as a session command that only reports an exit code once nothing holds the session's streams (the failure class root-caused in smolclaws#85). A backgrounded service that inherits the session's stdin/stdout/stderr keeps the call "running" until the hard timeout cap (#670) → a manifest-driven eval wedges (~1h) instead of starting.Docker is unaffected —
communicate()returns when the foreground shell exits — so the bug is invisible in local runs and CI, and only bites on Daytona.Fix
Start services with full detachment:
nohup {cmd} </dev/null >/tmp/benchflow-env-<svc>.log 2>&1 &</dev/null(stdin) is the operative change; the old form only covered stdout/stderrnohupto detach from the session's SIGHUPdisown— it is a bash/zsh builtin and the Daytona DinD shell issh/dash/tmp/benchflow-env-<svc>.logMirrors the fix that resolved the identical wedge downstream.
Test plan
test_provision_fully_detaches_service_fds— assertsnohup+ all three fd redirections + backgrounded + nodisown; verified to fail on the old> {log} 2>&1 &form (fail-then-pass). The pre-existing test only checked the command ended with&, which is exactly why the missing stdin redirect slipped through.tests/environment/+ manifest inbound suite: 75 passedCloses #676
🤖 Generated with Claude Code