Reflaxe.Elixir uses a dual-mode testing system with shared utilities to provide both sequential and parallel test execution with identical behavior and 100% reliability.
GitHub Actions runs a few higher-level “integration” checks in addition to snapshot tests:
- Examples (Elixir WAE): compiles each example's generated Elixir under `mix compile --warnings-as-errors`.
- WAE = "warnings as errors". This is a release-hygiene gate: warnings often indicate subtle codegen/stdlib issues (unused vars, undefined funcs, module conflicts) that would otherwise slip through.
- Keep example dependencies compatible with the CI Elixir version. Newer Elixir releases can introduce stricter warnings that older deps still emit (which can slow compiles and break WAE gates).
- `mix-01`, `mix-02`, … shards: the examples list is split into multiple matrix shards so the WAE job:
  - finishes faster (parallel execution),
  - avoids per-job timeouts,
  - and isolates failures (you get a single shard log instead of one huge run).
The same WAE philosophy is used by the todo-app QA sentinel when it runs `mix compile --warnings-as-errors`.
The WAE job is implemented by:
- CI workflow: `.github/workflows/ci.yml` (matrix shards `mix-01` … `mix-11`)
- Runner script: `scripts/test-examples-elixir.sh` (called by `npm run test:examples-elixir`)
Each shard sets `EXAMPLES_ELIXIR_WAE_ONLY=<example-name>` so we compile one example per job. This keeps wall-clock time predictable, and it makes the failure logs focused (CI uploads `_tmp/examples-elixir-wae/` as an artifact on failure).
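As an illustration of the sharding idea (the helper below is hypothetical, not the repo's actual script; real shards pin one example per job):

```python
# Illustrative sketch: split an example list into named CI matrix shards.
# The shard names (mix-01, mix-02, ...) mirror the CI matrix; the helper
# itself is hypothetical, not part of scripts/test-examples-elixir.sh.

def shard_examples(examples, num_shards):
    """Round-robin the examples across num_shards named shards."""
    shards = {f"mix-{i + 1:02d}": [] for i in range(num_shards)}
    names = sorted(shards)
    for idx, example in enumerate(examples):
        shards[names[idx % num_shards]].append(example)
    return shards

shards = shard_examples(["todo-app", "chat", "blog", "counter"], 2)
# Each CI job would then export EXAMPLES_ELIXIR_WAE_ONLY=<example> per entry.
```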
This repo uses GitHub Actions concurrency cancellation (`cancel-in-progress`). If you push multiple commits quickly, older runs will often show as cancelled even when nothing is wrong. Only the latest run for the current HEAD matters.
Quick no-auth sanity check:
- Use the GitHub API to query the CI workflow run for the current HEAD SHA and ensure all jobs conclude with `success`.
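As a sketch, the conclusion check can be expressed as follows. The payload shape follows the GitHub REST API jobs endpoint (`GET /repos/{owner}/{repo}/actions/runs/{run_id}/jobs`); the helper name and the sample data are illustrative, and fetching the payload over HTTP is elided:

```python
import json

# Sketch of the no-auth sanity check: given a parsed jobs payload from the
# GitHub REST API, verify every job in the run concluded with "success".

def all_jobs_succeeded(jobs_payload):
    """True only if the run has jobs and all of them concluded 'success'."""
    jobs = jobs_payload.get("jobs", [])
    return bool(jobs) and all(job.get("conclusion") == "success" for job in jobs)

sample = json.loads('{"jobs": [{"name": "mix-01", "conclusion": "success"},'
                    ' {"name": "mix-02", "conclusion": "cancelled"}]}')
print(all_jobs_succeeded(sample))  # a cancelled shard fails the check
```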
When a bug is found in a real app (especially examples/todo-app/), add a focused regression snapshot under:
test/snapshot/regression/**
These tests are intentionally small and deterministic. They lock in compiler behavior at the AST/lowering level and prevent “it works in the todo-app but regresses in a later refactor” outcomes.
The todo-app "online users" UI exposed a correctness bug in the `Enum.reduce_while` lowering pipeline.
- Symptom: Presence state existed, but derived UI lists stayed empty (`@online_user_count` rendered `0`, no avatars).
- Root cause: Haxe `for` loops lower into `Enum.reduce_while(...)`, and list "updates" like `Array.push`/`Array.concat` become `list ++ [value]` on the Elixir target. A bug in accumulator threading meant updates inside nested control flow (`if` branches in the reducer) did not escape the recursion, so the reducer returned the original accumulator.
- Fix: implemented in `src/reflaxe/elixir/ast/transformers/ReduceWhileAccumulatorTransform.hx` and supporting passes to ensure nested-branch updates become explicit accumulator rebindings that survive `try` wrappers and branch lowering.
- Regression snapshot: `test/snapshot/regression/ReduceWhileAccumulatorBranchUpdates/` guards:
  - nested `if` updates via `names = names.concat([k])`
  - statement-position `views.push({...})` lowering into explicit accumulator rebindings
This regression is intentionally generic: any code that builds lists inside loops benefits, not just Presence.
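The accumulator-threading pitfall can be illustrated outside Elixir. In this hypothetical Python rendition of a `reduce_while`-style fold, the buggy reducer makes its update inside a nested branch but never rebinds the accumulator, so the update is lost:

```python
from functools import reduce

# Hypothetical illustration of the accumulator-threading bug: a fold where
# the list update happens inside a nested branch of the reducer.

def buggy_reducer(acc, item):
    if item % 2 == 0:
        updated = acc + [item]  # update made in a nested branch...
    return acc                  # ...but the original accumulator is returned

def fixed_reducer(acc, item):
    if item % 2 == 0:
        acc = acc + [item]      # explicit rebinding threads the update out
    return acc

items = [1, 2, 3, 4]
print(reduce(buggy_reducer, items, []))  # [] - updates never escape
print(reduce(fixed_reducer, items, []))  # [2, 4]
```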
```
┌─────────────────┐      ┌────────────────────┐
│   TestRunner    │      │ ParallelTestRunner │
│  (Sequential)   │      │     (Parallel)     │
└────────┬────────┘      └─────────┬──────────┘
         │                         │
         │    ┌─────────────────┐  │
         └────┤   TestCommon    ├──┘
              │ (Shared Utils)  │
              └─────────────────┘
```
TestRunner:
- Purpose: Traditional sequential test execution
- Usage: Development debugging, CI fallback
- Behavior: Processes tests one at a time
- Output: Detailed differences for debugging

ParallelTestRunner:
- Purpose: High-performance parallel test execution
- Usage: Default test mode, development workflow
- Behavior: 16 workers with file-based locking
- Output: Boolean success/failure for speed

TestCommon:
- Purpose: Eliminate code duplication and ensure consistency
- Functions: File operations, content normalization, directory comparison
- Benefit: Single source of truth for test logic
Purpose: Recursively collect all files from a directory
```haxe
// Usage examples
TestCommon.getAllFiles("test/out")          // ["Main.ex", "User.ex"]
TestCommon.getAllFiles("test/out", "src/")  // ["src/Main.ex", "src/User.ex"]
```

Features:
- Handles non-existent directories gracefully (returns `[]`)
- Supports optional prefix for relative path construction
- Platform-agnostic file system operations
Purpose: Normalize file content for reliable comparison
```haxe
// Standard normalization
var normalized = TestCommon.normalizeContent(fileContent);

// Special handling for generated files
var normalized = TestCommon.normalizeContent(jsonContent, "_GeneratedFiles.json");
```

Features:
- Line ending normalization: `\r\n` → `\n`, `\r` → `\n`
- Whitespace cleanup: Trim trailing spaces, remove trailing empty lines
- Special file handling: Filters incremental ID fields from `_GeneratedFiles.json`
- Error resilience: Graceful fallback on parsing failures
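A Python sketch of these normalization rules, for illustration only (the real implementation is the Haxe `normalizeContent()` in `TestCommon.hx`):

```python
import re

# Illustrative Python rendition of the normalization rules described above:
# unify line endings, strip trailing whitespace, drop trailing blank lines,
# and filter the incrementing "id" field from _GeneratedFiles.json.

def normalize_content(content, file_name=""):
    text = content.replace("\r\n", "\n").replace("\r", "\n")  # line endings
    lines = [line.rstrip() for line in text.split("\n")]      # trailing spaces
    if file_name == "_GeneratedFiles.json":
        # Drop the incrementing "id" field so comparisons stay stable.
        lines = [l for l in lines if not re.match(r'^\s*"id"\s*:\s*\d+,?$', l)]
    while lines and lines[-1] == "":
        lines.pop()                                           # trailing blanks
    return "\n".join(lines)

print(normalize_content("a \r\nb\r\n\r\n"))  # 'a\nb'
```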
Purpose: Detailed comparison for TestRunner debugging
```haxe
var differences = TestCommon.compareDirectoriesDetailed("out/", "intended/");
// Returns: ["Missing file: Main.ex", "Content differs: User.ex", "Extra file: Debug.ex"]
```

Used by: TestRunner for detailed error reporting
Purpose: Fast boolean comparison for ParallelTestRunner
```haxe
var matches = TestCommon.compareDirectoriesSimple("out/", "intended/");
// Returns: true or false
```

Used by: ParallelTestRunner for performance
```haxe
// Both functions handle missing directories correctly
TestCommon.getAllFiles("/nonexistent/path")              // Returns []
TestCommon.compareDirectoriesSimple("out/", "intended/") // Handles missing 'out'
```

```
// _GeneratedFiles.json contains an incrementing ID field that must be ignored
{
    "filesGenerated": ["Main.ex", "User.ex"],
    "id": 123,  // ← This changes on each compilation
    "version": 1
}
// normalizeContent() filters out the "id" line for consistent comparison
```

- Detailed Pattern: Returns difference descriptions for debugging
- Simple Pattern: Returns boolean for performance
- Same Logic: Both use identical file processing and content normalization
Problem: Incremental ID field caused test failures
```
// Intended
{"id": 120, "files": ["Main.ex"]}
// Actual
{"id": 121, "files": ["Main.ex"]}
```

Solution: `normalizeContent()` filters ID lines with the regex `/^\s*"id"\s*:\s*\d+,?$/`
Problem: ParallelTestRunner failed when no output was generated

```haxe
// OLD ParallelTestRunner (FAILED)
if (!sys.FileSystem.exists(actualDir) || !sys.FileSystem.exists(intendedDir)) {
    return false; // ← Failed for missing actualDir
}

// NEW TestCommon approach (WORKS)
if (!sys.FileSystem.exists(intendedDir)) return false; // Only check intended
var intended = getAllFiles(intendedDir); // []
var actual = getAllFiles(actualDir);     // [] (handles missing dir)
return intended.length == actual.length; // 0 == 0 = true ✅
```

Problem: ~100 lines of duplicated logic between test runners
Solution: Centralized shared functions in `TestCommon.hx`
Benefit: Single point of maintenance, guaranteed consistency
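The "only check intended" existence logic can be sketched in Python (illustrative only; the real code is the Haxe `compareDirectoriesSimple()` in `TestCommon.hx`):

```python
import os
import tempfile

# Python sketch of the corrected existence logic: a missing actual directory
# is treated as an empty one, so "no intended output + no actual output" passes.

def get_all_files(directory):
    if not os.path.isdir(directory):
        return []  # missing directory is treated as empty, not an error
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), directory))
    return sorted(found)

def compare_directories_simple(actual, intended):
    if not os.path.isdir(intended):
        return False  # only the intended dir must exist
    return get_all_files(actual) == get_all_files(intended)

with tempfile.TemporaryDirectory() as intended:
    # Empty intended output + missing actual dir compares as equal.
    print(compare_directories_simple("/nonexistent/out", intended))  # True
```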
Before TestCommon (duplicated logic):
- Test Success: 57/57 (100%) - All tests passing with new file naming conventions
- Maintenance: Duplicated logic in both runners
- Reliability: Inconsistent behavior between sequential and parallel

After TestCommon (shared utilities):
- Test Success: 57/57 (100%) - Zero failures
- Maintenance: Single source of truth for test logic
- Reliability: Identical behavior guaranteed
- Performance: No impact on the 87-90% speed improvement
```haxe
import test.TestCommon;

// Use shared functions for consistent behavior
var files = TestCommon.getAllFiles(outputDir);
var content = TestCommon.normalizeContent(fileContent, fileName);
var matches = TestCommon.compareDirectoriesSimple(actual, intended);
```

```haxe
// Sequential runner - get detailed differences
var differences = TestCommon.compareDirectoriesDetailed(outPath, intendedPath);
if (differences.length > 0) {
    // Show detailed error messages
}

// Parallel runner - get fast boolean result
var success = TestCommon.compareDirectoriesSimple(outPath, intendedPath);
return success;
```

- Shared Utilities Prevent Divergence: Code duplication leads to inconsistent behavior
- Special Case Handling: Generated files need special normalization (ID filtering)
- Directory Existence Logic: Be careful about existence checks - intended vs actual
- Performance vs Detail Trade-off: Two comparison functions serve different needs
- Testing the Tests: Test infrastructure needs its own reliability validation
- Test Result Caching: Cache normalized content for unchanged files
- Parallel Directory Operations: Parallelize file reading within directory comparison
- Smart Content Filtering: Configurable filtering rules for other generated files
- Test Isolation: Enhanced directory sandboxing for even better isolation
- Metrics Collection: Track test timing and failure patterns
The TestCommon.hx architecture ensures that Reflaxe.Elixir's testing infrastructure is:
- Reliable: 100% test success rate
- Maintainable: Single source of truth for test logic
- Performant: No impact on parallel execution speed
- Consistent: Identical behavior between sequential and parallel modes
- Robust: Handles edge cases like missing directories and special file formats
This foundation supports the project's commitment to reliable, fast development workflows.