Testing Infrastructure Architecture

Overview

Reflaxe.Elixir uses a dual-mode testing system: a sequential runner and a parallel runner built on shared utilities, so both modes apply the same comparison logic and produce identical results.

CI lanes (why “WAE” and “mix-0x” exist)

GitHub Actions runs a few higher-level “integration” checks in addition to snapshot tests:

- Examples (Elixir WAE): compiles each example’s generated Elixir under mix compile --warnings-as-errors.
  - WAE = “warnings as errors”. This is a release-hygiene gate: warnings often indicate subtle codegen/stdlib issues (unused vars, undefined funcs, module conflicts) that would otherwise slip through.
  - Keep example dependencies compatible with the CI Elixir version. Newer Elixir releases can introduce stricter warnings that older deps still emit (which can slow compiles and break WAE gates).
- mix-01, mix-02, … shards: the examples list is split into multiple matrix shards so the WAE job:
  - finishes faster (parallel execution),
  - avoids per-job timeouts,
  - and isolates failures (you get a single shard log instead of one huge run).

The same WAE philosophy is used by the todo-app QA sentinel when it runs mix compile --warnings-as-errors.

How shards map to the repo

The WAE job is implemented by:

CI workflow: .github/workflows/ci.yml (matrix shards mix-01…mix-11)
Runner script: scripts/test-examples-elixir.sh (called by npm run test:examples-elixir)

Each shard sets EXAMPLES_ELIXIR_WAE_ONLY=<example-name> so we compile one example per job. This keeps wall-clock time predictable, and it makes the failure logs focused (CI uploads _tmp/examples-elixir-wae/ as an artifact on failure).

Why you sometimes see “cancelled” checks

This repo uses GitHub Actions concurrency cancellation (cancel-in-progress). If you push multiple commits quickly, older runs will often show cancelled even when nothing is wrong. Only the latest run for the current HEAD matters.

Quick no-auth sanity check:

Use the GitHub API to query the CI workflow run for the current HEAD SHA and ensure all jobs conclude success.

Regression Snapshots (Compiler Correctness)

When a bug is found in a real app (especially examples/todo-app/), add a focused regression snapshot under:

test/snapshot/regression/**

These tests are intentionally small and deterministic. They lock in compiler behavior at the AST/lowering level and prevent “it works in the todo-app but regresses in a later refactor” outcomes.

Case Study: reduce_while Accumulator Updates (Todo-App Presence)

The todo-app “online users” UI exposed a correctness bug in the Enum.reduce_while lowering pipeline.

Symptom: Presence state existed, but derived UI lists stayed empty (@online_user_count rendered 0, no avatars).
Root cause: Haxe for loops lower into Enum.reduce_while(...), and list “updates” like Array.push/Array.concat become list ++ [value] on the Elixir target. A bug in accumulator-threading meant updates inside nested control-flow (if branches in the reducer) did not escape the recursion, so the reducer returned the original accumulator.
Fix: implemented in src/reflaxe/elixir/ast/transformers/ReduceWhileAccumulatorTransform.hx and supporting passes to ensure nested-branch updates become explicit accumulator rebindings that survive try wrappers and branch lowering.
Regression snapshot: test/snapshot/regression/ReduceWhileAccumulatorBranchUpdates/ guards:
- nested if updates via names = names.concat([k])
- statement-position views.push({...}) lowering into explicit accumulator rebindings

This regression is intentionally generic: any code that builds lists inside loops benefits, not just Presence.
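The source shape that the case study describes can be sketched in plain Haxe as follows. This is an illustration only, not the contents of the actual regression snapshot: the `User` typedef, the class name, and both function names are hypothetical. It shows the two patterns the snapshot is said to guard, a reassignment via `concat` inside a nested `if`, and a statement-position `push`.

```haxe
// Hypothetical record type used only for this illustration.
typedef User = { name: String, online: Bool }

class ReduceWhileShape {
    // Nested-branch update: the accumulator is rebound inside an `if`.
    public static function onlineNames(users: Array<User>): Array<String> {
        var names: Array<String> = [];
        for (u in users) {
            if (u.online) {
                names = names.concat([u.name]);
            }
        }
        return names;
    }

    // Statement-position push: the result of `push` is discarded,
    // so the update only exists as a side effect on the accumulator.
    public static function buildViews(users: Array<User>): Array<{label: String}> {
        var views: Array<{label: String}> = [];
        for (u in users) {
            if (u.online) {
                views.push({label: u.name});
            }
        }
        return views;
    }

    static function main() {
        var users: Array<User> = [
            {name: "ann", online: true},
            {name: "bob", online: false},
            {name: "cy", online: true}
        ];
        trace(onlineNames(users).join(","));
        trace(buildViews(users).length);
    }
}
```

On a target with mutable arrays both functions return the two online users. A lowering that dropped the branch-local update would return empty lists instead, which matches the symptom described above.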

Architecture Components

┌────────────────────┐    ┌────────────────────┐
│     TestRunner     │    │ ParallelTestRunner │
│    (Sequential)    │    │     (Parallel)     │
└──────────┬─────────┘    └──────────┬─────────┘
           │                         │
           │  ┌───────────────────┐  │
           └──┤    TestCommon     ├──┘
              │  (Shared Utils)   │
              └───────────────────┘

TestRunner.hx (Sequential)

Purpose: Traditional sequential test execution
Usage: Development debugging, CI fallback
Behavior: Processes tests one at a time
Output: Detailed differences for debugging

ParallelTestRunner.hx (Parallel)

Purpose: High-performance parallel test execution
Usage: Default test mode, development workflow
Behavior: 16 workers with file-based locking
Output: Boolean success/failure for speed

TestCommon.hx (Shared Utilities)

Purpose: Eliminate code duplication and ensure consistency
Functions: File operations, content normalization, directory comparison
Benefit: Single source of truth for test logic

Shared Functions

`getAllFiles(dir: String, prefix: String = ""): Array<String>`

Purpose: Recursively collect all files from a directory

// Usage examples
TestCommon.getAllFiles("test/out")           // ["Main.ex", "User.ex"]
TestCommon.getAllFiles("test/out", "src/")   // ["src/Main.ex", "src/User.ex"]

Features:

Handles non-existent directories gracefully (returns [])
Supports optional prefix for relative path construction
Platform-agnostic file system operations
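A minimal sketch of how a function with this signature could behave, assuming a straightforward recursive walk with `sys.FileSystem`. It is not the project's actual implementation. The class name is hypothetical, and sorting the result is an assumption added here so the output is deterministic across platforms.

```haxe
import sys.FileSystem;
import haxe.io.Path;

class GetAllFilesSketch {
    public static function getAllFiles(dir: String, prefix: String = ""): Array<String> {
        var files: Array<String> = [];
        // A missing (or non-directory) path yields an empty list instead of an error.
        if (!FileSystem.exists(dir) || !FileSystem.isDirectory(dir)) return files;

        for (entry in FileSystem.readDirectory(dir)) {
            var fullPath = Path.join([dir, entry]);
            if (FileSystem.isDirectory(fullPath)) {
                // Recurse, extending the prefix so results stay relative to the root.
                files = files.concat(getAllFiles(fullPath, prefix + entry + "/"));
            } else {
                files.push(prefix + entry);
            }
        }
        // Assumption: readDirectory order is unspecified, so sort for stable output.
        files.sort(Reflect.compare);
        return files;
    }

    static function main() {
        trace(getAllFiles("/nonexistent/path").length);
    }
}
```

Returning an empty array for a missing directory is what lets callers treat "no output generated" and "empty output directory" the same way.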

`normalizeContent(content: String, fileName: String = ""): String`

Purpose: Normalize file content for reliable comparison

// Standard normalization
var normalized = TestCommon.normalizeContent(fileContent);

// Special handling for generated files
var normalized = TestCommon.normalizeContent(jsonContent, "_GeneratedFiles.json");

Features:

Line ending normalization: \r\n → \n, \r → \n
Whitespace cleanup: Trim trailing spaces, remove trailing empty lines
Special file handling: Filters incremental ID fields from _GeneratedFiles.json
Error resilience: Graceful fallback on parsing failures
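The listed normalization steps can be sketched as below. This is a simplified, line-based illustration under stated assumptions rather than the real function: the class name is hypothetical, the file-name check uses `endsWith` as a guess at how the special case is detected, and no JSON parsing (or its fallback path) is modeled.

```haxe
class NormalizeSketch {
    public static function normalizeContent(content: String, fileName: String = ""): String {
        // Line endings: convert \r\n first, then any remaining lone \r.
        var text = StringTools.replace(content, "\r\n", "\n");
        text = StringTools.replace(text, "\r", "\n");

        // Assumption: the special case is keyed on the file name suffix.
        var filterIds = StringTools.endsWith(fileName, "_GeneratedFiles.json");
        var idLine = ~/^\s*"id"\s*:\s*\d+,?$/;

        var lines: Array<String> = [];
        for (line in text.split("\n")) {
            var trimmed = StringTools.rtrim(line); // drop trailing spaces
            if (filterIds && idLine.match(trimmed)) continue; // skip the incrementing id
            lines.push(trimmed);
        }

        // Remove trailing empty lines.
        while (lines.length > 0 && lines[lines.length - 1] == "") lines.pop();
        return lines.join("\n");
    }

    static function main() {
        trace(normalizeContent("a  \r\nb\r\n\r\n") == "a\nb");
    }
}
```

Because the id filter only runs for the generated manifest, an ordinary source file that happens to contain an `"id"` line is compared unchanged.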

Directory Comparison Functions

`compareDirectoriesDetailed(actualDir, intendedDir): Array<String>`

Purpose: Detailed comparison for TestRunner debugging

var differences = TestCommon.compareDirectoriesDetailed("out/", "intended/");
// Returns: ["Missing file: Main.ex", "Content differs: User.ex", "Extra file: Debug.ex"]

Used by: TestRunner for detailed error reporting

`compareDirectoriesSimple(actualDir, intendedDir): Bool`

Purpose: Fast boolean comparison for ParallelTestRunner

var matches = TestCommon.compareDirectoriesSimple("out/", "intended/");
// Returns: true or false

Used by: ParallelTestRunner for performance

Key Design Patterns

1. Graceful Directory Handling

// Both functions handle missing directories correctly
TestCommon.getAllFiles("/nonexistent/path")  // Returns []
TestCommon.compareDirectoriesSimple("out/", "intended/")  // Handles missing 'out'

2. Special File Processing

// _GeneratedFiles.json contains incrementing ID field that must be ignored
{
  "filesGenerated": ["Main.ex", "User.ex"],
  "id": 123,  // ← This changes on each compilation
  "version": 1
}

// normalizeContent() filters out the "id" line for consistent comparison

3. Two-Pattern Comparison

Detailed Pattern: Returns difference descriptions for debugging
Simple Pattern: Returns boolean for performance
Same Logic: Both use identical file processing and content normalization
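To make the two patterns concrete, here is a sketch that compares two in-memory maps of relative path to already-normalized content, standing in for the two directories. The class and function names are hypothetical and the real functions take directory paths. The message wording follows the example output shown earlier for compareDirectoriesDetailed.

```haxe
class TwoPatternSketch {
    // Detailed pattern: collect a description of every difference.
    public static function compareDetailed(actual: Map<String, String>, intended: Map<String, String>): Array<String> {
        var diffs: Array<String> = [];
        var intendedKeys = [for (k in intended.keys()) k];
        intendedKeys.sort(Reflect.compare); // stable report order
        for (k in intendedKeys) {
            if (!actual.exists(k)) diffs.push('Missing file: $k');
            else if (actual.get(k) != intended.get(k)) diffs.push('Content differs: $k');
        }
        var actualKeys = [for (k in actual.keys()) k];
        actualKeys.sort(Reflect.compare);
        for (k in actualKeys) {
            if (!intended.exists(k)) diffs.push('Extra file: $k');
        }
        return diffs;
    }

    // Simple pattern: the same checks, but return at the first mismatch.
    public static function compareSimple(actual: Map<String, String>, intended: Map<String, String>): Bool {
        for (k in intended.keys()) {
            if (!actual.exists(k) || actual.get(k) != intended.get(k)) return false;
        }
        for (k in actual.keys()) {
            if (!intended.exists(k)) return false;
        }
        return true;
    }

    static function main() {
        var intended = ["Main.ex" => "a", "User.ex" => "b"];
        var actual = ["User.ex" => "changed", "Debug.ex" => "c"];
        trace(compareDetailed(actual, intended));
        trace(compareSimple(actual, intended));
    }
}
```

The invariant worth preserving is that the boolean result is true exactly when the detailed list is empty, which is what keeps the sequential and parallel runners in agreement.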

Critical Bug Fixes Resolved

Issue 1: _GeneratedFiles.json False Failures

Problem: Incremental ID field caused test failures

// Intended
{"id": 120, "files": ["Main.ex"]}
// Actual  
{"id": 121, "files": ["Main.ex"]}

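Solution: normalizeContent() drops any line that consists solely of an "id" field, matched by the regex /^\s*"id"\s*:\s*\d+,?$/. The pattern is anchored to a whole line, so it applies to the pretty-printed file where "id" sits on its own line; the one-line JSON above is condensed for illustration.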

Issue 2: Empty Directory Handling Divergence

Problem: ParallelTestRunner failed when no output generated

// OLD ParallelTestRunner (FAILED)
if (!sys.FileSystem.exists(actualDir) || !sys.FileSystem.exists(intendedDir)) {
    return false;  // ← Failed for missing actualDir
}

// NEW TestCommon approach (WORKS)
if (!sys.FileSystem.exists(intendedDir)) return false;  // Only check intended
var intended = getAllFiles(intendedDir);  // []
var actual = getAllFiles(actualDir);      // [] (handles missing dir)
return intended.length == actual.length;  // 0 == 0 = true ✅

Issue 3: Code Duplication Issues

Problem: ~100 lines of duplicated logic between test runners

Solution: Centralized shared functions in TestCommon.hx

Benefit: Single point of maintenance, guaranteed consistency

Performance Impact

Before TestCommon

Test Success: 57/57 (100%) - All tests passing with new file naming conventions
Maintenance: Duplicated logic in both runners
Reliability: Inconsistent behavior between sequential and parallel

After TestCommon

Test Success: 57/57 (100%) - Zero failures
Maintenance: Single source of truth for test logic
Reliability: Identical behavior guaranteed
Performance: The 87-90% speed improvement from parallel execution is unchanged

Usage Guidelines

For Test Development

import test.TestCommon;

// Use shared functions for consistent behavior
var files = TestCommon.getAllFiles(outputDir);
var content = TestCommon.normalizeContent(fileContent, fileName);
var matches = TestCommon.compareDirectoriesSimple(actual, intended);

For Test Runner Implementation

// Sequential runner - get detailed differences
var differences = TestCommon.compareDirectoriesDetailed(outPath, intendedPath);
if (differences.length > 0) {
    // Show detailed error messages
}

// Parallel runner - get fast boolean result  
var success = TestCommon.compareDirectoriesSimple(outPath, intendedPath);
return success;

Lessons Learned

Shared Utilities Prevent Divergence: Code duplication leads to inconsistent behavior
Special Case Handling: Generated files need special normalization (ID filtering)
Directory Existence Logic: Be careful about existence checks - intended vs actual
Performance vs Detail Trade-off: Two comparison functions serve different needs
Testing the Tests: Test infrastructure needs its own reliability validation

Future Enhancements

Test Result Caching: Cache normalized content for unchanged files
Parallel Directory Operations: Parallelize file reading within directory comparison
Smart Content Filtering: Configurable filtering rules for other generated files
Test Isolation: Enhanced directory sandboxing for even better isolation
Metrics Collection: Track test timing and failure patterns

Conclusion

The TestCommon.hx architecture ensures that Reflaxe.Elixir's testing infrastructure is:

Reliable: 100% test success rate
Maintainable: Single source of truth for test logic
Performant: No impact on parallel execution speed
Consistent: Identical behavior between sequential and parallel modes
Robust: Handles edge cases like missing directories and special file formats

This foundation supports the project's commitment to reliable, fast development workflows.
