feat: deterministic E2E, distributed tracing, incremental export, Hermes lazy screens #691
Open
distributed-nerd wants to merge 5 commits into
- Hermetic per-test seeding via launch args (fixed clock/locale/timezone) and an in-app bootstrap that seeds storage and rehydrates the store.
- Replace ad-hoc waits with expectation-based helpers (no fixed sleeps).
- Deterministic mock network layer with named scenarios; the app installs a fetch interceptor under E2E so it never hits the wire.
- Tolerance-based visual regression using pixelmatch instead of exact hashing, with configurable per-snapshot thresholds and diff artifacts.
- Flaky-test detection: jest retries plus a reporter that records tests passing only after retry; optional fail-on-flaky gate.
- CI: artifact uploads and a 5-run stability matrix enforcing zero flakiness.
- Docs for writing deterministic E2E tests.
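The flaky-test detection described above can be illustrated with a minimal sketch. It assumes each test's attempt history is available as an ordered list of outcomes; the names `TestAttempts`, `classifyAttempts`, and `shouldFailRun` are hypothetical and are not taken from the reporter in this PR.

```typescript
// Hypothetical shape of one test's attempt history (not the PR's reporter API).
type Outcome = "passed" | "failed";

interface TestAttempts {
  name: string;
  attempts: Outcome[]; // in execution order; each retry appends one entry
}

interface FlakyReport {
  flaky: string[];   // passed, but only after at least one retry
  failing: string[]; // never passed (or never ran)
  stable: string[];  // passed on the first attempt
}

// Sort tests into stable / flaky / failing based on their attempt history.
function classifyAttempts(results: TestAttempts[]): FlakyReport {
  const report: FlakyReport = { flaky: [], failing: [], stable: [] };
  for (const { name, attempts } of results) {
    const last = attempts[attempts.length - 1];
    if (last !== "passed") {
      report.failing.push(name);
    } else if (attempts.length > 1) {
      // A retry only happens after a failure, so this test needed a retry to pass.
      report.flaky.push(name);
    } else {
      report.stable.push(name);
    }
  }
  return report;
}

// Optional gate: when enabled, a flaky test fails the run just like a hard failure.
function shouldFailRun(report: FlakyReport, failOnFlaky: boolean): boolean {
  return report.failing.length > 0 || (failOnFlaky && report.flaky.length > 0);
}
```

With the gate off, a run containing only stable and flaky tests still succeeds; with it on, the same run fails, which is the behavior a zero-flakiness stability matrix would rely on.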

- Dependency-free, OpenTelemetry-shaped tracer in backend/services/shared with W3C traceparent/tracestate propagation, span kinds/status/events, PII scrubbing and OTLP/HTTP export.
- Consistent sampler: rate-based, endpoint-based and error-based, with parent decisions honored so traces stay whole across hops.
- Backend instrumentation helpers for server, db, external-call and business-logic spans; webhook delivery now emits a producer span and propagates trace context to receivers.
- Mobile traced apiClient that injects traceparent and spans API calls.
- ML service (FastAPI) with OTel spans for model load, feature compute and inference, adopting the upstream context.
- OTel collector + Tempo + Grafana stack and docs for the propagation contract.
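As an illustration of the propagation and sampling rules above, here is a minimal sketch of W3C `traceparent` handling with a sampler that honors the parent decision. The function names and the trace-ID bucketing scheme are assumptions made for this example, not the contents of shared/tracing.ts, and only the version `00` header layout is handled.

```typescript
interface TraceContext {
  traceId: string;  // 32 lowercase hex characters
  spanId: string;   // 16 lowercase hex characters
  sampled: boolean; // bit 0x01 of the trace-flags field
}

const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Parse a traceparent header; returns null for anything malformed.
function parseTraceparent(header: string): TraceContext | null {
  const m = TRACEPARENT_RE.exec(header.trim());
  if (!m) return null;
  const [, version, traceId, spanId, flags] = m;
  if (version !== "00") return null; // simplification: only version 00 is accepted here
  // All-zero trace or span IDs are invalid per the W3C Trace Context spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

// Serialize a context back into a version-00 traceparent header.
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}

// Honor the upstream decision when there is one; otherwise decide deterministically
// from the trace ID so every service reaches the same verdict for the same trace.
// The bucketing (first 8 hex chars mapped to [0, 1)) is an assumption of this sketch.
function shouldSample(traceId: string, rate: number, parent: TraceContext | null): boolean {
  if (parent) return parent.sampled;
  const bucket = parseInt(traceId.slice(0, 8), 16) / 0x100000000;
  return bucket < rate;
}
```

Deriving the decision from the trace ID rather than a random draw is one way to keep traces whole across hops even when a hop receives no parent flags.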

- Append-only subscription change log with ordered LSNs, tombstones for deletes, per-entity versions and schema versioning.
- Watermark-based incremental export that ships only changes since the last checkpoint, checkpointing per batch for clean resume.
- Pluggable format adapters (CSV, JSON, Parquet) with schema evolution; pure and deterministic so re-running a window yields byte-identical output.
- Bidirectional conflict resolution (source/external/version/last-write wins).
- Delivery retries with exponential backoff; on exhaustion the watermark holds at the last good batch. Per-channel lock prevents concurrent runs.
- Export metrics (records, conflicts, batches, retries, bytes, latency) and a standard API response envelope.
- Integration tests against a mock external sink; docs.
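The watermark and per-batch checkpoint behavior described above could look roughly like the following sketch. It is synchronous and omits retries and backoff for brevity; `ChangeRecord`, `exportSince`, and the boolean `deliver` callback are hypothetical names for this example, not the exportService API.

```typescript
interface ChangeRecord {
  lsn: number;      // log sequence number, strictly increasing in the change log
  entityId: string;
  deleted: boolean; // true for a tombstone
}

interface ExportResult {
  watermark: number;        // LSN of the last record in the last delivered batch
  batchesDelivered: number;
  completed: boolean;       // false if a delivery failed and the run stopped early
}

// Ship every change with lsn > watermark in LSN order, advancing the watermark
// only after a batch is delivered. `deliver` returns false when delivery has
// definitively failed (in a real exporter, after retries are exhausted).
function exportSince(
  log: ChangeRecord[],
  watermark: number,
  batchSize: number,
  deliver: (batch: ChangeRecord[]) => boolean,
): ExportResult {
  if (batchSize <= 0) throw new Error("batchSize must be positive");
  const pending = log
    .filter((r) => r.lsn > watermark)
    .sort((a, b) => a.lsn - b.lsn);
  let current = watermark;
  let batchesDelivered = 0;
  for (let i = 0; i < pending.length; i += batchSize) {
    const batch = pending.slice(i, i + batchSize);
    if (!deliver(batch)) {
      // Hold the watermark at the last good batch so the next run resumes here.
      return { watermark: current, batchesDelivered, completed: false };
    }
    current = batch[batch.length - 1].lsn; // checkpoint per batch
    batchesDelivered += 1;
  }
  return { watermark: current, batchesDelivered, completed: true };
}
```

Because the watermark moves only on successful delivery, calling `exportSince` again with the returned watermark re-sends the failed batch and nothing before it.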

- Critical screens (Home, SubscriptionDetail, Analytics, Payment) stay eager; all other screens load on demand via React.lazy + Suspense.
- lazyScreen helper provides a lightweight loading fallback and an error boundary that retries from the full bundle when a chunk is unavailable.
- Metro inlineRequires defers module evaluation so dynamically-imported screens become separately-loadable chunks; babel notes the boundary.
- app.config.js declares eager/lazy screen tiers and the startup performance budget; check-performance-budget.js enforces the 2s ceiling, >=30% startup improvement and >=20% peak-memory reduction, wired into the CI bundle-size job.
- Also adds the missing nav routes and types so AppNavigator type-checks.
- Docs for configuring screen compilation tiers.
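The budget gate described above can be sketched as a pure check over two measurements. This is an illustration only: it assumes the 2s ceiling applies to startup time, the type and function names are hypothetical, and the numbers in the usage note are made-up values rather than measurements from this PR.

```typescript
interface StartupMetrics {
  startupMs: number;    // time to first interactive screen, in milliseconds
  peakMemoryMb: number; // peak memory during startup, in megabytes
}

interface Budget {
  maxStartupMs: number;          // absolute ceiling (assumed to apply to startup time)
  minStartupImprovement: number; // fraction, e.g. 0.30 for 30%
  minMemoryReduction: number;    // fraction, e.g. 0.20 for 20%
}

// Thresholds as stated in the commit message: 2s ceiling, >=30% startup, >=20% memory.
const BUDGET: Budget = { maxStartupMs: 2000, minStartupImprovement: 0.3, minMemoryReduction: 0.2 };

// Returns a list of human-readable violations; an empty list means the budget is met.
function checkBudget(baseline: StartupMetrics, current: StartupMetrics, budget: Budget = BUDGET): string[] {
  const violations: string[] = [];
  const startupGain = (baseline.startupMs - current.startupMs) / baseline.startupMs;
  const memoryGain = (baseline.peakMemoryMb - current.peakMemoryMb) / baseline.peakMemoryMb;
  if (current.startupMs > budget.maxStartupMs) {
    violations.push(`startup ${current.startupMs}ms exceeds ${budget.maxStartupMs}ms ceiling`);
  }
  if (startupGain < budget.minStartupImprovement) {
    violations.push(`startup improvement ${(startupGain * 100).toFixed(1)}% is below target`);
  }
  if (memoryGain < budget.minMemoryReduction) {
    violations.push(`peak-memory reduction ${(memoryGain * 100).toFixed(1)}% is below target`);
  }
  return violations;
}
```

For example, an illustrative baseline of 2900 ms / 300 MB against a current run of 1800 ms / 231 MB yields no violations, while 2100 ms / 280 MB trips all three checks. A CI script could exit non-zero whenever the returned list is non-empty.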

Upstream main independently implemented overlapping work (monitoring, event store, CDC/accounting export, perf budget, ml-service, webhook refactor, GDPR exportService, apiResponse). Resolved every conflict to upstream's version and kept only the non-colliding, self-contained additions from this branch:
- backend tracing primitives (shared/tracing.ts + tests)
- traced mobile API client (src/services/network/)
- E2E reliability helpers (explicit waits, flaky reporter, mock/seed helpers)
Removed superseded/build-breaking leftovers (CDC accountingExport adapters, lazyScreen, perf budget fixtures, redundant docs).
@distributed-nerd Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits. You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀
Summary
Four reliability/performance/observability initiatives, each as a focused commit:
Closes #669
Closes #670
Closes #671
Closes #672
Changes by area
Deterministic E2E (`e2e/`, `.detoxrc.js`, `.github/workflows/e2e-detox.yml`, `src/utils/e2e/`)
- In-app bootstrap and `fetch` interceptor installed only under E2E; a strict no-op in production.
- Deterministic mock network layer with named scenarios (`happy-path`, `charge-failure`, `degraded-network`).
- `pixelmatch` tolerance thresholds (configurable per snapshot / via env) instead of exact byte hashing, with diff artifacts.
- `jest.retryTimes` + a reporter that records tests passing only on retry, an optional fail-on-flaky gate, and a 5-run CI stability matrix.
- `docs/e2e-deterministic-testing.md`.

Distributed tracing (`backend/services/shared/`, `src/services/network/`, `ml-service/`, `infra/`, `backend/services/webhook.ts`)
- Dependency-free, OpenTelemetry-shaped tracer with W3C `traceparent`/`tracestate`, span kinds/status/events, PII scrubbing, and OTLP/HTTP export.
- Mobile traced `apiClient`; FastAPI ML service with model-load / feature-compute / inference spans.
- `docs/distributed-tracing.md`.

Incremental export (`backend/services/exportService.ts`, `backend/services/subscription/`, `backend/services/billing/accountingExport/`, `backend/services/shared/apiResponse.ts`)
- `docs/incremental-export.md`.

Hermes bytecode / lazy screens (`metro.config.js`, `babel.config.js`, `app.config.js`, `src/navigation/`, `scripts/check-performance-budget.js`)
- Non-critical screens load on demand via `React.lazy` + Suspense, with an error boundary that retries from the full bundle if a chunk is unavailable.
- Metro `inlineRequires` defers module evaluation so dynamically-imported screens become separately-loadable chunks.
- `app.config.js` declares eager/lazy tiers and the startup performance budget; `check-performance-budget.js` enforces the 2s ceiling, ≥30% startup improvement and ≥20% peak-memory reduction, wired into the CI bundle-size job.
- Fixes `AppNavigator` type errors (missing routes/imports).
- `docs/hermes-differential-bytecode.md`.

Testing
- `npm run perf:budget` passes (sample metrics show ~38% startup and ~23% memory improvement vs baseline).
- `AppNavigator` type errors fixed.

Notes / constraints
- `jest-expo` preset is broken at baseline (RN 0.85 mismatch), so the full RN test runner could not be exercised in this environment; pure-TS logic was validated with a standalone ts-jest config.
- Dependencies: dev `pixelmatch`, `pngjs`, `@types/pngjs`; runtime `react-native-launch-arguments` (lazy-required, guarded).