fix(test): isolate ALL eval commands from concurrent RUVOS_HOME access#2
fix(test): isolate ALL eval commands from concurrent RUVOS_HOME access#2pacphi wants to merge 8 commits into
Conversation
…nforce clippy --all-targets - Apply cargo fmt to all workspace crates (7 files) - Fix unused variable stored -> _stored in daemon.rs test that causes clippy --all-targets -D warnings to fail on CI
…v3 (Node.js 24 support)
The two skill_routing eval tests spawn subprocesses that share a global RUVOS_HOME directory. Under cargo test --jobs 4 in CI, parallel tests corrupt each others redb storage (redb assertion failure on truncated pages). Give each test its own temp directory backed RUVOS_HOME so the subprocess databases never collide. All 10 eval tests pass locally with this fix.
… with hard-coded path
Only skill_routing had RUVOS_HOME isolation; compress, orchestrate-handoff, swarm-recovery, and swarm-learning spawned subprocesses that collided on the shared skills.redb database. Under CI's --jobs 4 this causes redb panics on truncated pages because multiple processes write to the same concurrent-unsafe redb file. Give every eval test its own temp-directory-backed RUVOS_HOME so each test process gets isolated storage. Reuse a single isol_env() helper.
Review: Partial validity — core fix still needed, one hunk conflictsThanks for the thorough root-cause analysis and the ✅ Changes that are still needed and applicable
❌ One hunk conflicts with current codebase
Please drop the
|
Final review — PR superseded by direct commitsThanks again for the thorough root-cause work. After applying all the valid fixes directly to master ( ✅ Already landed in masterEvery substantive fix from this PR was applied directly to
❌ Three blocking issues remain in this branch1. The PR deletes 2. The branch contains a duplicate/dangling code block: + let all_ok = results
+ .iter()
+ .all(|r| r["status"].as_str() != Some("error"));
+ Ok(json!({ // ← dangling, never closed in this context
let all_ok = results.iter().all(|r| ...); // ← duplicate definition
3. The PR introduces a duplicate Recommendation: Close this PR as superseded. All the valuable changes are in master. If you'd like to land anything from this branch, rebase onto master, drop the |
|
Closing as superseded by d3a4017. The branch is on an external fork so we can't fix the three blocking issues (corrupt agent_exec.rs hunk, lib.rs removing active modules, duplicate routing_env helper) in place. All valuable changes are already in master. |
Summary
Fixes: CI Test (ubuntu-latest) failing with exit code 101 under parallel test execution.
All other jobs passed — Build (3 OS), Clippy, and Format were green. Only
cargo test --workspace --jobs 4on ubuntu-latest failed because multiple eval integration tests spawnruvosCLI subprocesses that share a common$RUVOS_HOMEdata directory. Under parallel execution, concurrent writes to the shared persistent store cause corruption (redb panics on truncated pages).Root Cause Analysis
The collision surface: shared
$RUVOS_HOMEunder CI's--jobs 4The integration tests in
crates/ruvos-cli/tests/eval.rsspawn six eval commands as separate CLI subprocesses (Command::new(env!("CARGO_BIN_EXE_ruvos")).args(["eval", "<name>"])). All of these inherit the same process-global$RUVOS_HOME, which defaults to./.ruvos. Multiple processes writing simultaneously corrupts the shared files.Detailed breakdown per eval command
eval skill-routingskills.redb(redb KV store) viaselect_skill_bundle()eval swarm-recoveryswarm-policy.json,swarm-history.json,swarm-learning.json(JSON store) viapaths::swarm_*_file()eval swarm-learningwith_isolated_root()) but subprocess doesn't inherit it — the eval uses thread-local override +std::env::set_var()which is invisible to child processeseval compress$RUVOS_HOMEbase directory resolution which triggersensure_root()→ensure_skills_pack()that can writeskills.redbeval orchestrate-handoffplan_archetypes()) — the eval does NOT use redb$RUVOS_HOMEdirectory resolutionThe redb collision (the most severe)
skills.redbis a redb key-value store used by the skills pack. Whenensure_root()or any tool accesses it under concurrent parallel subprocesses, redb panics on truncated pages because:Both
skill_routingandcompresstriggerensure_skills_pack()which writes to$RUVOS_HOME/skills.redbduring subprocess initialization. Under--jobs 4, multiple processes collide.Why it didn't fail locally but failed in CI
cargo test --workspace --jobs 4)--jobs 4, fast runner)What was attempted before
skill_routingtests via temp-directory RUVOS_HOME$RUVOS_HOMEcheckout@v4 → v6,action-gh-release@v2 → v3); replaced hardcoded/home/lyle/dev/ruvospath withstd::env::current_dir()$RUVOS_HOMEshared-state root cause for remaining eval commandsThe Fix
Extracted a single
isol_env()helper that creates an isolated temp-directory-backedRUVOS_HOMEand applies it to every eval test subprocess invocation — not just one:Applied to all 6 remaining un-isolated subprocess calls across 4 eval command types (12
.envs()additions total):eval_compress_emits_json_report— added tempdir + RUVOS_HOMEeval_orchestrate_handoff_emits_json_report— added tempdir + RUVOS_HOMEeval_orchestrate_handoff_write_and_compare— added.envs(isol_env(&dir))to both write and compare callseval_swarm_recovery_emits_json_report— added tempdir + RUVOS_HOMEeval_swarm_recovery_write_and_compare— added.envs(isol_env(&dir))to both write and compare callseval_swarm_learning_emits_json_report— added tempdir + RUVOS_HOMEeval_swarm_learning_write_and_compare— added.envs(isol_env(&dir))to both write and compare callsEach test now gets its own temp directory so no two subprocesses ever share
$RUVOS_HOME. Theensure_root()call in each process writesskills.redbto a private dir. JSON files (swarm-policy, etc.) are similarly isolated.Files changed
crates/ruvos-cli/tests/eval.rs— unified isolation for all eval commands.github/workflows/ci.yml— CI action upgrades (checkout@v6, release@v3) [in 42157a0]crates/ruvos-mcp/src/tools/agent_exec.rs— hardcoded path fix [in 42157a0]Verification
All results after fix:
cargo build --workspace --jobs 4cargo clippy --workspace --all-targets --jobs 4 -- -D warningscargo fmt -- --checkcargo test --workspace --jobs 4cargo test -p ruvos-cli --test evalcontracts check,doctor --strict, integration handshake)References