fix(ci): Fix failing E2E without retries #5888

Open
antonis wants to merge 8 commits into main from fix/e2e-stable-checks

Conversation

@antonis
Contributor

@antonis antonis commented Mar 25, 2026

📢 Type of change

  • Bugfix

📜 Description

Fixes E2E test flakiness without retries. Alternative to #5830.

  • Per-flow process isolation: prevents crash cascade from shared Maestro sessions
  • Maestro driver warm-up: stabilizes first launchApp on Cirrus Labs Tart VMs
  • Simulator readiness: simctl bootstatus + Settings.app warm-up + 180s driver timeout
  • captureReplay fix: test app's beforeSend was replacing the replay integration's; now chains correctly
  • Android feedback wait: extendedWaitUntil before tapping "Report a Bug"
  • Sample app: all-envelope search, timestamp sorting, TTID/TTFD allow-list, Maestro warm-up step
  • execSync → execFileSync to avoid shell interpolation
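
The captureReplay item describes one beforeSend hook replacing another instead of running after it. A minimal sketch of the chaining idea, assuming synchronous hooks that take `(event, hint)` and return an event or `null`; `chainBeforeSend`, `replayHook`, and `testAppHook` are hypothetical names, not the SDK's or this PR's actual code:

```javascript
// Hypothetical helper: compose two beforeSend-style hooks so that neither
// silently replaces the other. Assumes both hooks are synchronous.
function chainBeforeSend(first, second) {
  return (event, hint) => {
    const afterFirst = first ? first(event, hint) : event;
    // A hook that returns null drops the event, so the next hook is skipped.
    if (afterFirst === null || afterFirst === undefined) return null;
    return second ? second(afterFirst, hint) : afterFirst;
  };
}

// Illustrative hooks with assumed shapes.
const replayHook = (event) => ({ ...event, tags: { ...event.tags, replay: 'attached' } });
const testAppHook = (event) => ({ ...event, tags: { ...event.tags, e2e: 'true' } });

// Both hooks now run, in order, on every event.
const beforeSend = chainBeforeSend(replayHook, testAppHook);
```

Assigning `testAppHook` directly as `beforeSend` would lose whatever `replayHook` adds, which is the kind of breakage the item describes.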

#skip-changelog

💡 Motivation and Context

E2E tests consistently fail on main — crash cascade, Maestro driver flakiness, broken replay beforeSend chain, and Android feedback timing.

Fixes #5926

💚 How did you test it?

CI

📝 Checklist

  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled
  • All tests passing
  • No breaking changes

🔮 Next steps

#skip-changelog

@antonis antonis added the ready-to-merge label (Triggers the full CI test suite) Mar 25, 2026
@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Semver Impact of This PR

None (no version bump detected)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


This PR will not appear in the changelog.


🤖 This preview updates automatically when you update the PR.

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Android (legacy) Performance metrics 🚀

| | Plain | With Sentry | Diff |
|---|---|---|---|
| Startup time | 402.08 ms | 423.42 ms | 21.34 ms |
| Size | 43.75 MiB | 48.08 MiB | 4.33 MiB |

Previous results on branch: fix/e2e-stable-checks

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 8295078+dirty | 440.48 ms | 470.32 ms | 29.84 ms |
| 9530cff+dirty | 399.24 ms | 427.22 ms | 27.98 ms |
| 16888c0+dirty | 414.88 ms | 470.83 ms | 55.95 ms |
| f5fb57c+dirty | 405.82 ms | 423.92 ms | 18.10 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 8295078+dirty | 43.75 MiB | 48.08 MiB | 4.33 MiB |
| 9530cff+dirty | 43.75 MiB | 48.08 MiB | 4.33 MiB |
| 16888c0+dirty | 43.75 MiB | 48.08 MiB | 4.33 MiB |
| f5fb57c+dirty | 43.75 MiB | 48.08 MiB | 4.33 MiB |

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

iOS (legacy) Performance metrics 🚀

| | Plain | With Sentry | Diff |
|---|---|---|---|
| Startup time | 1225.53 ms | 1222.66 ms | -2.88 ms |
| Size | 3.38 MiB | 4.73 MiB | 1.35 MiB |

Previous results on branch: fix/e2e-stable-checks

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 16888c0+dirty | 1192.35 ms | 1190.87 ms | -1.48 ms |
| 8295078+dirty | 1220.02 ms | 1221.72 ms | 1.70 ms |
| f5fb57c+dirty | 1195.00 ms | 1190.48 ms | -4.52 ms |
| 9530cff+dirty | 1230.51 ms | 1231.96 ms | 1.45 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 16888c0+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
| 8295078+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
| f5fb57c+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
| 9530cff+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

iOS (new) Performance metrics 🚀

| | Plain | With Sentry | Diff |
|---|---|---|---|
| Startup time | 1221.74 ms | 1219.23 ms | -2.51 ms |
| Size | 3.38 MiB | 4.73 MiB | 1.35 MiB |

Previous results on branch: fix/e2e-stable-checks

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 16888c0+dirty | 1204.51 ms | 1211.77 ms | 7.26 ms |
| 8295078+dirty | 1212.93 ms | 1210.57 ms | -2.36 ms |
| f5fb57c+dirty | 1246.91 ms | 1241.61 ms | -5.30 ms |
| 9530cff+dirty | 1220.98 ms | 1216.18 ms | -4.80 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 16888c0+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
| 8295078+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
| f5fb57c+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
| 9530cff+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Android (new) Performance metrics 🚀

| | Plain | With Sentry | Diff |
|---|---|---|---|
| Startup time | 365.62 ms | 420.06 ms | 54.44 ms |
| Size | 43.94 MiB | 48.94 MiB | 5.00 MiB |

Previous results on branch: fix/e2e-stable-checks

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 8295078+dirty | 348.48 ms | 390.08 ms | 41.60 ms |
| 9530cff+dirty | 395.80 ms | 426.40 ms | 30.60 ms |
| 16888c0+dirty | 378.71 ms | 386.84 ms | 8.13 ms |
| f5fb57c+dirty | 444.45 ms | 473.36 ms | 28.91 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 8295078+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |
| 9530cff+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |
| 16888c0+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |
| f5fb57c+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |

@antonis antonis force-pushed the fix/e2e-stable-checks branch from 05eb0bd to 605f13a on March 26, 2026 10:07
@antonis antonis changed the title from "fix(ci): Fix E2E flakiness with stable checks instead of retries" to "fix(ci): Fix E2E flakiness without retries" on Mar 26, 2026
@antonis antonis force-pushed the fix/e2e-stable-checks branch 3 times, most recently from 8f483e2 to 2ff292c on March 27, 2026 13:46
@antonis antonis mentioned this pull request Mar 27, 2026
5 tasks
@antonis antonis force-pushed the fix/e2e-stable-checks branch 4 times, most recently from cdfefc9 to af0c664 on March 30, 2026 11:09
Replace retry-based approach (PR #5830) with deterministic fixes:

### Simulator stability (Cirrus Labs Tart VMs)
- `wait_for_boot: true` / `erase_before_boot: false` on simulator-action
- `xcrun simctl bootstatus booted -b` to block until boot completes
- Settings.app warm-up for SpringBoard/system service initialization
- `MAESTRO_DRIVER_STARTUP_TIMEOUT` bumped to 180s
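
The readiness steps above could be sketched from a Node script roughly as follows. This is an illustration, not the workflow's actual steps: `waitForSimulatorBoot` and `maestroEnv` are hypothetical names, the `run` parameter exists only so the sketch can be exercised without Xcode, and the timeout is assumed to be expressed in milliseconds.

```javascript
const { execFileSync } = require('node:child_process');

// Blocks until the booted simulator reports that boot has finished.
// `run` is injectable so the sketch can be tried without Xcode installed.
function waitForSimulatorBoot(run = execFileSync) {
  // Same command as in the list above: xcrun simctl bootstatus booted -b
  return run('xcrun', ['simctl', 'bootstatus', 'booted', '-b'], { stdio: 'inherit' });
}

// Environment for Maestro with the longer driver startup timeout.
// Assumption: the variable is read as milliseconds (180s = 180000).
function maestroEnv(baseEnv = process.env) {
  return { ...baseEnv, MAESTRO_DRIVER_STARTUP_TIMEOUT: String(180000) };
}
```

Blocking on boot status before starting Maestro means the driver timeout only has to cover driver startup, not the simulator boot itself.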

### e2e-v2 test runner (cli.mjs)
- Per-flow process isolation via individual `maestro test` calls
- Maestro driver warm-up flow before real tests (non-fatal)
- crash.yml runs first so the next flow verifies post-crash recovery
- `execSync` → `execFileSync` to avoid shell interpolation
- SENTRY_AUTH_TOKEN redaction in debug logs
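
A rough sketch of how the runner changes above might fit together, assuming flows are plain file paths. `orderFlows`, `redact`, `runFlows`, and the injectable `run` parameter are hypothetical and are not taken from cli.mjs:

```javascript
const { execFileSync } = require('node:child_process');

// Put crash.yml first so the flow after it exercises post-crash recovery.
function orderFlows(flows) {
  const crash = flows.filter((f) => f.endsWith('crash.yml'));
  const rest = flows.filter((f) => !f.endsWith('crash.yml'));
  return [...crash, ...rest];
}

// Replace the token value wherever it appears in a log line.
function redact(text, token) {
  return token ? text.split(token).join('[REDACTED]') : text;
}

// Run each flow in its own `maestro test` process.
// `run` is injectable so the sketch can be exercised without Maestro.
function runFlows(flows, { warmUpFlow, run = execFileSync } = {}) {
  if (warmUpFlow) {
    try {
      run('maestro', ['test', warmUpFlow], { stdio: 'pipe' });
    } catch {
      // Warm-up is non-fatal: its only job is to get the driver started.
    }
  }
  const failed = [];
  for (const flow of orderFlows(flows)) {
    try {
      // Arguments are passed as an array, so no shell parses the path.
      run('maestro', ['test', flow], { stdio: 'pipe' });
    } catch {
      failed.push(flow);
    }
  }
  return failed;
}
```

Because every flow gets a fresh process, a crash in one flow is recorded in `failed` and does not take down the flows that follow it.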

### Sample application test fixes
- Search all envelopes for app start transaction (slow VM delivery)
- Sort envelopes by timestamp for deterministic ordering
- Allow-list for TTID/TTFD ops (`navigation`, `ui.load`)
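
The three test fixes above can be illustrated with a small sketch. The envelope shape, the `op` field on the payload, and all function names are assumptions made for illustration; how the sample app tests actually apply the allow-list is not shown in this PR text.

```javascript
// Assumed envelope shape (hypothetical, for illustration only):
//   { timestamp: number, items: [{ type: string, payload: object }] }
const ALLOWED_OPS = new Set(['navigation', 'ui.load']);

function isAllowedOp(op) {
  return ALLOWED_OPS.has(op);
}

// Sort a copy by timestamp so results do not depend on arrival order.
function sortByTimestamp(envelopes) {
  return [...envelopes].sort((a, b) => a.timestamp - b.timestamp);
}

// Look through every envelope rather than only the first one received,
// since a slow VM may deliver the transaction late.
function findTransaction(envelopes, predicate) {
  for (const envelope of sortByTimestamp(envelopes)) {
    for (const item of envelope.items) {
      if (item.type === 'transaction' && predicate(item.payload)) {
        return item.payload;
      }
    }
  }
  return undefined;
}
```

A call such as `findTransaction(envelopes, (tx) => isAllowedOp(tx.op))` would return the earliest matching transaction regardless of which envelope carried it.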

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@antonis antonis force-pushed the fix/e2e-stable-checks branch from af0c664 to f509434 on March 30, 2026 11:10
@antonis antonis marked this pull request as ready for review March 30, 2026 12:02
Contributor Author

@antonis antonis left a comment


@lucas-zimerman I think we can also close #4913 if this solution works

@antonis antonis changed the title from "fix(ci): Fix E2E flakiness without retries" to "fix(ci): Fix failing E2E without retries" on Mar 31, 2026
antonis and others added 2 commits March 31, 2026 13:01
With stdio: 'pipe', failed flow output was swallowed. Now the full
Maestro stdout/stderr is printed when a flow fails, making it easier
to diagnose flaky tests in CI logs.
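
A minimal sketch of that behaviour, assuming the runner uses `execFileSync` with `stdio: 'pipe'`. `runAndReport` and its `log` parameter are hypothetical names, not the PR's code:

```javascript
const { execFileSync } = require('node:child_process');

// Run a command with piped stdio; on failure, surface the captured output.
function runAndReport(cmd, args, log = console.error) {
  try {
    execFileSync(cmd, args, { stdio: 'pipe', encoding: 'utf8' });
    return true;
  } catch (error) {
    // With stdio: 'pipe', the child's output is only available on the
    // thrown error object, so it has to be printed explicitly.
    if (error.stdout) log(String(error.stdout));
    if (error.stderr) log(String(error.stderr));
    return false;
  }
}
```

A successful run stays quiet, so CI logs only grow when a flow actually fails.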

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


@lucas-zimerman
Collaborator

Other than the CI AI comment, the code looks good!

antonis and others added 2 commits March 31, 2026 13:39
The iOS simulator readiness and Maestro warm-up steps were
incorrectly placed in the test-android job from a merge conflict
resolution. They had conditions referencing matrix.platform which
doesn't exist in that job, so they never ran.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Labels

ready-to-merge Triggers the full CI test suite

Development

Successfully merging this pull request may close these issues.

Fix Flaky E2E test

2 participants