This workspace is intentionally multi-component. Tests should stay with the component they verify, while root orchestration stays in mn-system-tests/test_all.py for developer convenience.
Preferred layout for new tests:
- `tests/unit/`: pure logic, config parsing, command formatting, manifest validation, component-local behavior.
- `tests/integration/`: one component talking to another through a stable interface, such as CLI to gRPC or API to gRPC.
- `tests/regression/`: focused repros for previously fixed bugs.
- `tests/e2e/`: runtime workflows, live API/CLI flows, Docker or service-backed checks.
Current component mapping:
- `MirrorNeuron/tests/unit`: Elixir unit tests.
- `MirrorNeuron/tests/api`: core gRPC/API boundary tests.
- `MirrorNeuron/tests/e2e`: runtime and stream/live execution e2e tests.
- `MirrorNeuron/tests/regression`: script-style regression repros.
- `mn-api/tests`, `mn-cli/tests`, `mn-python-sdk/tests`, `mn-web-ui/src/test`: component unit tests today. Split these into `unit/`, `integration/`, `regression/`, and `e2e/` as each package grows.
- `mn-system-tests/contracts`: fast injected API/CLI/SDK contract tests that do not need Redis, Elixir, Docker, or gRPC.
- `mn-system-tests/integration` and `mn-system-tests/e2e`: live cross-component tests. These are opt-in because they need running services.
- `otterdesk-blueprints/tests` and `mn-skills/*/tests`: blueprint catalog and skill package tests.
Fast local signal:
.venv/bin/python mn-system-tests/test_all.py --fast

Injected cross-component contracts:
.venv/bin/python mn-system-tests/test_all.py --contracts

Component unit tests, including Node and core when dependencies are present:
.venv/bin/python mn-system-tests/test_all.py --unit

Blueprint quick checks without external APIs:
.venv/bin/python mn-system-tests/test_all.py --blueprints

Key interface performance benchmark:
.venv/bin/python mn-system-tests/test_all.py --performance

This records mn-system-tests/results/performance.txt and
mn-system-tests/results/performance.json with hardware, software, package,
git, latency, and throughput metadata.
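
To illustrate how warmup and measured iterations relate in a probe like this, here is a minimal sketch. It assumes the documented defaults for `MN_PERF_ITERATIONS` (30) and `MN_PERF_WARMUP` (5). The helper name `run_probe` and the keys of the returned summary are hypothetical; they are not the schema of performance.json or the runner's actual implementation.

```python
import os
import statistics
import time
from typing import Callable, Mapping


def run_probe(fn: Callable[[], object], env: Mapping[str, str] = os.environ) -> dict:
    """Time fn after discarding warmup calls (hypothetical helper)."""
    iterations = int(env.get("MN_PERF_ITERATIONS", "30"))
    warmup = int(env.get("MN_PERF_WARMUP", "5"))

    # Warmup calls run but are not measured.
    for _ in range(warmup):
        fn()

    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    total_s = sum(samples_ms) / 1000.0
    return {
        "iterations": iterations,
        "warmup": warmup,
        "latency_ms_median": statistics.median(samples_ms),
        "latency_ms_max": max(samples_ms),
        # Guard against a zero total on very coarse clocks.
        "throughput_per_s": iterations / total_s if total_s > 0 else float("inf"),
    }
```

Passing `env` explicitly keeps the sketch testable without mutating the process environment.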
Changed workspace checks:
.venv/bin/python mn-system-tests/test_all.py --changed

Core runtime e2e, including stream/live backpressure:
.venv/bin/python mn-system-tests/test_all.py --runtime-e2e

Security checks in reporting mode:
.venv/bin/python mn-system-tests/test_all.py --security --skip-core --skip-node --skip-blueprints

Strict security mode fails on dependency-audit or secret-scan findings:
.venv/bin/python mn-system-tests/test_all.py --security --strict-security --skip-core --skip-node --skip-blueprints

Live integration/e2e against running services:
.venv/bin/python mn-system-tests/test_all.py --integration --live
.venv/bin/python mn-system-tests/test_all.py --e2e --live

Full default non-live suite:
.venv/bin/python mn-system-tests/test_all.py

Offline pytest gate:
cd mn-system-tests
.venv/bin/python -m pytest contracts benchmarks installer -q

test_all.py records every runner-driven suite under
mn-system-tests/results/: system-tests.txt, system-tests.json,
pytest-*.json, and raw per-step logs in results/logs/. Use
--results-dir PATH or MN_SYSTEM_TEST_RESULTS_DIR to redirect artifacts in CI.
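
The redirect described above could be resolved along these lines. This is a sketch only: the precedence (flag over environment variable over default) is an assumption, and `resolve_results_dir` is a hypothetical name, not a function known to exist in test_all.py.

```python
import argparse
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence

# Default location named in this document.
DEFAULT_RESULTS_DIR = Path("mn-system-tests/results")


def resolve_results_dir(
    argv: Optional[Sequence[str]] = None,
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Pick the artifact directory (assumed order: flag, env var, default)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--results-dir", default=None)
    # parse_known_args ignores the runner's other flags such as --fast.
    args, _unknown = parser.parse_known_args(argv)

    if args.results_dir:
        return Path(args.results_dir)
    from_env = env.get("MN_SYSTEM_TEST_RESULTS_DIR")
    if from_env:
        return Path(from_env)
    return DEFAULT_RESULTS_DIR
```

In CI, setting `MN_SYSTEM_TEST_RESULTS_DIR` once per job avoids repeating the flag on every invocation.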
Use isolated ports and namespaces so tests do not collide with a developer instance:
cd MirrorNeuron
MN_GRPC_PORT=55200 \
MN_REDIS_NAMESPACE=mirror_neuron_manual_e2e \
mix run --no-halt

In another terminal:
cd mn-api
MN_API_PORT=4001 \
MN_GRPC_TARGET=localhost:55200 \
mn-api

Then run:
MN_GRPC_TARGET=localhost:55200 \
MN_API_BASE_URL=http://localhost:4001/api/v1 \
RUN_MN_SYSTEM_TESTS=1 \
.venv/bin/python -m pytest mn-system-tests/integration mn-system-tests/e2e

MirrorNeuron includes Redis Sentinel smoke tests for the runtime's durable state store.
Local Docker test:
cd MirrorNeuron
bash scripts/test_redis_sentinel_ha.sh

Two-box Docker test:
cd MirrorNeuron
bash scripts/test_redis_sentinel_two_box_ha.sh \
--remote-host 192.168.4.173 \
--local-ip 192.168.4.25 \
--remote-ip 192.168.4.173

Expected success markers:
two_box_initial_write_ok=...
two_box_post_failover_write_read_ok
The two-box test starts Redis and Sentinel on both machines, writes MirrorNeuron state, kills the initial Redis primary, waits for Sentinel failover, then writes and reads again through the promoted replica. If the remote box cannot route to the local Redis test port, the script automatically uses the remote Redis as the initial primary and tests failover back to the local replica.
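
If the script output is captured in CI, the success markers listed above can be checked mechanically. The sketch below assumes each marker appears somewhere in a line of output; `missing_markers` is a hypothetical helper, and the value after `two_box_initial_write_ok=` is deliberately not interpreted.

```python
from typing import Iterable, List

# Markers quoted from the expected output of the two-box script.
EXPECTED_MARKERS = (
    "two_box_initial_write_ok=",
    "two_box_post_failover_write_read_ok",
)


def missing_markers(output_lines: Iterable[str]) -> List[str]:
    """Return the expected markers that never appear in the output."""
    lines = list(output_lines)
    return [
        marker
        for marker in EXPECTED_MARKERS
        if not any(marker in line for line in lines)
    ]
```

An empty result means both the initial write and the post-failover write/read were reported.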
Run the same path through the workspace test runner:
.venv/bin/python mn-system-tests/test_all.py --redis-ha \
--redis-ha-remote-host 192.168.4.173 \
--redis-ha-local-ip 192.168.4.25 \
--redis-ha-remote-ip 192.168.4.173

Expected output:
All selected test suites passed.
The Nomad-inspired runtime features should be covered at three levels:
- component unit tests in `MirrorNeuron`, `mn-cli`, and `mn-python-sdk`
- live cross-component tests in `mn-system-tests`
- two-box joined-cluster smoke tests when placement, drain, recovery, or schedule uniqueness is involved
Core areas to keep covered:
| Feature | Test focus |
|---|---|
| Reconciliation | Node-loss recovery, live coordinator agent movement, whole-job recovery after lease loss, pause for unsafe work. |
| Job types | service, batch, system, and sysbatch lifecycle behavior. |
| Restart/reschedule policy | Sliding windows, delay functions, mode: fail, mode: delay, disabled and unlimited reschedule. |
| Drain and maintenance | Eligibility, dry run, service migration, batch waiting, system ignore, cancellation, undrain. |
| Services and checks | Manifest validation, service preflight, registry filtering, HTTP/TCP/script/gRPC checks. |
| Resources and devices | CUDA vs Metal, GPU memory, device ID exclusivity, explicit port conflicts, host volume placement. |
| Deployments | Rolling, canary isolation, promotion, rollback, service discovery roles. |
| Schedules and events | Cron parsing, timezone, delayed runs, overlap prevention, missed policies, event filters, idempotent dispatch. |
Two-box cluster checks should use a shared Redis namespace and sync the remote workspace through Git rather than editing files directly on the remote machine.
Typical joined-cluster verification:
mn runtime start
mn runtime start --worker-node
mn node join <worker-ip> --token <worker-token>
mn node list
mn resource list

Then run targeted system tests for:
- one due schedule dispatching exactly once across both boxes
- `system` jobs expanding across both eligible nodes
- service jobs moving from a drained node to the other node
- resource placement avoiding a node without the requested CUDA/Metal/device capability
- required service preflight blocking before agents launch
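
To make the first check concrete, the sketch below shows the shape of an exactly-once assertion: two simulated boxes race to claim the same due schedule, and only one may dispatch. The in-memory `ClaimStore` stands in for a shared store with atomic set-if-absent semantics. This is an assumption for illustration, not MirrorNeuron's actual dispatch mechanism, and every name here is hypothetical.

```python
import threading
from typing import Dict, List


class ClaimStore:
    """In-memory stand-in for a shared store with atomic set-if-absent."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claims: Dict[str, str] = {}

    def claim(self, key: str, owner: str) -> bool:
        # Only the first caller for a given key wins.
        with self._lock:
            if key in self._claims:
                return False
            self._claims[key] = owner
            return True


def dispatch_due_schedule(store: ClaimStore, schedule_id: str, due_at: str,
                          node: str, dispatched: List[str]) -> None:
    # The claim key combines schedule and due time, so each run is claimed once.
    if store.claim(f"{schedule_id}:{due_at}", node):
        dispatched.append(node)


def simulate_two_boxes(schedule_id: str = "nightly",
                       due_at: str = "2024-01-01T00:00:00Z") -> List[str]:
    store = ClaimStore()
    dispatched: List[str] = []
    threads = [
        threading.Thread(
            target=dispatch_due_schedule,
            args=(store, schedule_id, due_at, node, dispatched),
        )
        for node in ("box-a", "box-b")
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return dispatched
```

A live system test would replace the simulation with both boxes running and assert on the recorded dispatch count instead.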
All runtime/config overrides must use the `MN_` prefix.
Useful test isolation vars:
- `MN_GRPC_PORT`: use a non-default port for test core instances.
- `MN_GRPC_TARGET`: point CLI/API/SDK tests at the test core.
- `MN_API_BASE_URL`: point system e2e tests at a non-default API port.
- `MN_REDIS_NAMESPACE`: isolate Redis keys per test run.
- `MN_ENV=test`: use test-mode validation where appropriate.
- `MN_API_TOKEN`: test API bearer auth.
- `MN_BENCHMARK_WORKER_COUNT`: worker count for the generated live parallel-worker smoke test; defaults small for local runs and can be set to `100` for stress checks.
- `MN_PERF_ITERATIONS`: measured iterations per performance probe; defaults to `30`.
- `MN_PERF_WARMUP`: warmup iterations per performance probe; defaults to `5`.
- `RUN_MN_PERF_LIVE`: set to `1` to add live API, gRPC, LLM, or Web UI interface probes.
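
As a sketch of how these variables combine, the helper below builds an isolated environment for a test subprocess. The function name and the `mirror_neuron_test_` namespace pattern are hypothetical; the URL shape follows the manual example earlier in this document.

```python
import os
import uuid
from typing import Dict, Mapping


def isolated_test_env(
    grpc_port: int,
    api_port: int,
    base_env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Return a copy of base_env with MN_ isolation variables applied."""
    env = dict(base_env)
    env.update(
        {
            "MN_GRPC_PORT": str(grpc_port),
            "MN_GRPC_TARGET": f"localhost:{grpc_port}",
            "MN_API_BASE_URL": f"http://localhost:{api_port}/api/v1",
            # A random suffix keeps Redis keys separate per test run.
            "MN_REDIS_NAMESPACE": f"mirror_neuron_test_{uuid.uuid4().hex[:8]}",
            "MN_ENV": "test",
        }
    )
    return env
```

The result can be passed as `env=` to `subprocess.run` so the developer's own shell settings stay untouched.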
Live tests are opt-in. If RUN_MN_SYSTEM_TESTS=1 is not set, mn-system-tests marks live tests skipped.
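
A minimal sketch of this kind of opt-in gate, using only the standard library. How mn-system-tests actually implements the skip is not shown in this document, so `live_tests_enabled` and the test class are hypothetical.

```python
import os
import unittest
from typing import Mapping


def live_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Live tests run only when RUN_MN_SYSTEM_TESTS is exactly '1'."""
    return env.get("RUN_MN_SYSTEM_TESTS") == "1"


class LiveApiSmokeTest(unittest.TestCase):
    @unittest.skipUnless(
        live_tests_enabled(),
        "set RUN_MN_SYSTEM_TESTS=1 to run live tests",
    )
    def test_placeholder(self) -> None:
        # A real live test would talk to running services here.
        self.assertTrue(live_tests_enabled())
```

Under pytest, the same predicate could feed `pytest.mark.skipif(not live_tests_enabled(), reason=...)`.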
- Move flat Python tests into `tests/unit/` first, then add `tests/integration/` for API/CLI behavior that exercises the SDK boundary.
- Convert useful `MirrorNeuron/tests/regression/*.script` repros into ExUnit tests where possible.
- Add Web UI visual/layout regression tests when Playwright is introduced.
- Add a real secret scanner such as `gitleaks` or `detect-secrets` once the team chooses a tool.
- For cross-component regressions, prefer `mn-system-tests/contracts` with injected clients/openers before adding a live test.