Refactor E2E test suite to eliminate flakiness with deterministic retry and state isolation #672

Description

@Smartdevs17

Context


The SubTrackr Detox E2E test suite is unreliable: tests pass locally but fail in CI because of timing, network conditions, and state shared between test cases. Flaky tests reduce developer confidence, obscure real regressions, and increase CI cycle time.

Current Limitation

  • Tests depend on implicit timing (hardcoded wait values)
  • Shared app state between test cases causes cascading failures
  • Network-dependent tests fail unpredictably under varying conditions
  • No hermetic test data seeding per test case
  • Screenshot baseline comparisons are brittle (pixel-perfect mismatches)
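
As an illustration of the hermetic seeding that is currently missing, here is a minimal sketch that derives fixture data from the test name alone, so every test gets its own data set and two runs of the same test see identical values. The `SubscriptionFixture` shape and the `seedSubscriptions` helper are hypothetical and not taken from the SubTrackr codebase.

```typescript
// Hypothetical fixture shape; the real SubTrackr data model is not shown in this issue.
interface SubscriptionFixture {
  id: string;
  name: string;
  monthlyPriceCents: number;
}

// FNV-1a hash: turns a test name into a stable unsigned 32-bit seed.
function hashSeed(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// mulberry32-style seeded PRNG, used instead of Math.random() so that
// generated values depend only on the seed.
function createRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Builds `count` fixtures whose ids are namespaced by the test name,
// so no two test cases share records.
function seedSubscriptions(testName: string, count: number): SubscriptionFixture[] {
  const rng = createRng(hashSeed(testName));
  const fixtures: SubscriptionFixture[] = [];
  for (let i = 0; i < count; i++) {
    fixtures.push({
      id: `${testName}-${i}`,
      name: `Plan ${i + 1}`,
      monthlyPriceCents: 100 + Math.floor(rng() * 9900),
    });
  }
  return fixtures;
}
```

A test could call `seedSubscriptions` in its setup step and load the result into a freshly launched app, which keeps the data independent of whatever earlier tests did.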

Expected Outcome


A deterministic E2E test suite in which each test case is fully isolated, with hermetic data seeding, explicit wait strategies (expectation-based, not timer-based), a mock network layer for consistent responses, and tolerance thresholds for visual regression checks.
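
One way to picture the mock network layer is a strict route table: every request a test makes must have a canned response registered in advance, and anything else fails loudly instead of reaching a real server. The sketch below is an assumption about how such a layer could look; `MockApi`, its methods, and the example paths are hypothetical names, not part of Detox or SubTrackr.

```typescript
// A canned response returned for a registered route.
interface MockResponse {
  status: number;
  body: unknown;
}

// Hypothetical in-memory route table. A real setup would sit behind an
// HTTP mock server or a fetch interceptor; this only shows the lookup logic.
class MockApi {
  private routes = new Map<string, MockResponse>();

  // Every request seen, in order, so tests can assert on traffic.
  readonly calls: string[] = [];

  // Registers a deterministic response for METHOD + path.
  on(method: string, path: string, response: MockResponse): this {
    this.routes.set(`${method.toUpperCase()} ${path}`, response);
    return this;
  }

  // Returns the registered response, or throws for unregistered routes
  // so that a test can never silently depend on the real network.
  handle(method: string, path: string): MockResponse {
    const key = `${method.toUpperCase()} ${path}`;
    this.calls.push(key);
    const response = this.routes.get(key);
    if (response === undefined) {
      throw new Error(`Unmocked request: ${key}`);
    }
    return response;
  }
}
```

Creating a new `MockApi` per test case, rather than sharing one instance, keeps registered routes from leaking between tests.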

Acceptance Criteria

  • All test cases are fully isolated with independent app state initialization
  • Hardcoded wait() calls replaced with expectation-based waits
  • Mock network layer provides deterministic API responses for all test scenarios
  • Visual regression tests use configurable tolerance thresholds (not pixel-perfect)
  • Test suite runs with zero flaky failures across 5 consecutive CI runs
  • CI test execution time does not exceed current baseline
  • Documentation for writing deterministic E2E tests
  • Flaky test detection and automatic re-run with reporting
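
For the flaky-detection criterion, one possible reporting rule is sketched below: a test that both failed and passed across its attempts is flagged as flaky, while one that failed every attempt is a genuine failure. The type names and the `summarize` helper are hypothetical; how attempt outcomes are collected from the test runner is left open here.

```typescript
type Outcome = "pass" | "fail";
type Verdict = "stable" | "flaky" | "failing";

interface TestReport {
  name: string;
  attempts: number;
  verdict: Verdict;
}

// Classifies one test from the ordered outcomes of its attempts
// (the first run plus any automatic re-runs).
function classify(name: string, outcomes: Outcome[]): TestReport {
  if (outcomes.length === 0) {
    throw new Error(`No recorded attempts for test: ${name}`);
  }
  const passed = outcomes.some((o) => o === "pass");
  const failed = outcomes.some((o) => o === "fail");
  const verdict: Verdict = passed && failed ? "flaky" : passed ? "stable" : "failing";
  return { name, attempts: outcomes.length, verdict };
}

// Builds a report for every test in a run history keyed by test name.
function summarize(history: Record<string, Outcome[]>): TestReport[] {
  return Object.keys(history).map((name) => classify(name, history[name]));
}

// True only when no test needed a re-run to pass and none failed outright.
function isRunClean(reports: TestReport[]): boolean {
  return reports.every((r) => r.verdict === "stable");
}
```

Under this rule, the "zero flaky failures across 5 consecutive CI runs" criterion would correspond to `isRunClean` returning true for each of the five runs.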

Technical Scope

  • Files: e2e/ (all files), .detoxrc.js, .github/workflows/e2e-detox.yml, e2e/helpers/visualRegression.ts
  • APIs: Detox expect API, mock server for API responses, screenshot diff libraries (pixelmatch)
  • Edge cases: CI vs local environment differences (animations, keyboard, notifications), device simulator variations, JS timer precision, async data loading, gesture recording playback
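
To make the tolerance idea concrete, the sketch below compares two raw RGBA buffers with a per-channel tolerance and a maximum share of mismatched pixels. It is a simplified stand-in written for illustration, not pixelmatch itself and not the contents of e2e/helpers/visualRegression.ts; `compareRgba` and its option names are hypothetical.

```typescript
interface DiffOptions {
  // Largest per-channel difference (0-255) still treated as equal.
  channelTolerance: number;
  // Largest fraction of mismatched pixels (0-1) still accepted.
  maxMismatchRatio: number;
}

interface DiffResult {
  mismatchedPixels: number;
  mismatchRatio: number;
  withinTolerance: boolean;
}

// Compares two same-sized RGBA buffers (4 bytes per pixel).
function compareRgba(baseline: Uint8Array, actual: Uint8Array, options: DiffOptions): DiffResult {
  if (baseline.length !== actual.length || baseline.length % 4 !== 0) {
    throw new Error("Buffers must have equal length and contain whole RGBA pixels");
  }
  const totalPixels = baseline.length / 4;
  let mismatchedPixels = 0;
  for (let pixel = 0; pixel < totalPixels; pixel++) {
    const offset = pixel * 4;
    // A pixel mismatches if any one channel differs by more than the tolerance.
    for (let channel = 0; channel < 4; channel++) {
      const delta = Math.abs(baseline[offset + channel] - actual[offset + channel]);
      if (delta > options.channelTolerance) {
        mismatchedPixels++;
        break;
      }
    }
  }
  const mismatchRatio = totalPixels === 0 ? 0 : mismatchedPixels / totalPixels;
  return {
    mismatchedPixels,
    mismatchRatio,
    withinTolerance: mismatchRatio <= options.maxMismatchRatio,
  };
}
```

Keeping both thresholds configurable lets small anti-aliasing or font-rendering differences between simulators pass while a genuinely changed region still fails the check.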

Metadata

Labels

  • 200-points: 200 point issue
  • Stellar Wave: Issues in the Stellar wave program
  • drips-wave: Issues in the Drips Wave program
  • high: High complexity issue
