fix(ci): Fix E2E test flakiness by antonis · Pull Request #5830 · getsentry/sentry-react-native

antonis · 2026-03-17T15:08:26Z

📢 Type of change

Bugfix

📜 Description

Fixes iOS E2E test flakiness across e2e-v2 and sample-application workflows caused by the migration to Cirrus Labs Tart VMs (nested virtualisation).

Root cause: crash-loop cascade after `nativeCrash()` test

After crash.yml triggers Sentry.nativeCrash(), the plain launchApp (without clearState) causes the app to crash immediately on relaunch because the Sentry SDK reads the pending crash report during init and hits a failure path. This writes a second crash report, triggering iOS's crash-loop protection for the bundle ID. The cascade causes subsequent flows (feedback, close, captureException) to also fail with "App crashed or stopped while executing flow".

Fix: clearState: true on the post-crash launchApp in crash.yml, so Maestro reinstalls the app and clears the crash-loop state.

Root cause: simulator not fully ready on Tart VMs

Tart VMs use nested virtualisation, making the simulator significantly slower to stabilise after boot. Maestro's XCTest driver races to connect before SpringBoard and system services finish post-boot init.

Fixes:

- `wait_for_boot: true` on simulator-action: blocks until the simulator has fully booted
- `erase_before_boot: false`: skips the redundant erase (each flow already uses clearState)
- Simulator warm-up step: launches and terminates Settings.app so system services finish init
- `MAESTRO_DRIVER_STARTUP_TIMEOUT: 180000` (3 min): extra headroom for Tart VM startup

Safety net: per-flow retries (3 attempts)

Even with the above fixes, the Tart VM environment has residual timing flakiness (~1 in 18 flows needed a retry in CI). Both test harnesses now retry each Maestro flow individually up to 3 times:

- dev-packages/e2e-tests/cli.mjs: for the e2e-v2 workflow
- samples/react-native/e2e/utils/maestro.ts: for the sample-application workflow

Sample app test fixes

- Search all envelopes for the app start transaction: on slow emulators the app start transaction may arrive in a separate envelope from the navigation transaction
- Sort news envelopes by timestamp: ensures consistent ordering regardless of arrival time
- Exclude auto.app.start from time-to-display assertions: app start transactions have app_start_cold measurements, not TTID/TTFD
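The first two fixes can be sketched roughly as follows. This is an illustration only, not the sample app's actual test code: the `Envelope` and `EventPayload` shapes are simplified assumptions, and `findAppStartTransaction` and `sortByTimestamp` are hypothetical helper names.

```typescript
// Simplified, assumed shapes: real Sentry envelopes carry many more fields.
interface EventPayload {
  type?: string;
  timestamp?: number; // seconds since epoch
  contexts?: { trace?: { origin?: string } };
}

// An envelope modelled as a header plus a list of [itemHeader, payload] pairs.
type EnvelopeItem = [{ type: string }, EventPayload];
type Envelope = [Record<string, unknown>, EnvelopeItem[]];

// Look through every envelope, not only the first, for the app start transaction.
function findAppStartTransaction(envelopes: Envelope[]): EventPayload | undefined {
  for (const [, items] of envelopes) {
    for (const [header, payload] of items) {
      if (header.type === 'transaction' && payload.contexts?.trace?.origin === 'auto.app.start') {
        return payload;
      }
    }
  }
  return undefined;
}

// Earliest timestamp among an envelope's items; envelopes without one sort last.
function firstTimestamp([, items]: Envelope): number {
  const stamps = items
    .map(([, payload]) => payload.timestamp)
    .filter((t): t is number => typeof t === 'number');
  return stamps.length > 0 ? Math.min(...stamps) : Number.POSITIVE_INFINITY;
}

// Return a copy ordered by event timestamp rather than by arrival order.
function sortByTimestamp(envelopes: Envelope[]): Envelope[] {
  return [...envelopes].sort((a, b) => {
    const ta = firstTimestamp(a);
    const tb = firstTimestamp(b);
    return ta === tb ? 0 : ta < tb ? -1 : 1;
  });
}
```

Sorting a copy keeps the received order available for debugging, which may help when a slow runner delivers envelopes out of order.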

Supersedes #5752 and #5755 by unifying them and simplifying the changes

💡 Motivation and Context

iOS E2E tests have been failing on every main run. The consistent failure pattern on main:

- feedback, close, crash, captureException: all fail with "App crashed or stopped"
- captureMessage, captureReplay: pass intermittently
- sample-application iOS: fails on captureErrorsScreenTransaction (app start transaction not found in first envelope)

💚 How did you test it?

CI

📝 Checklist

I added tests to verify changes
No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled
I updated the docs if needed.
I updated the wizard if needed.
All tests passing
No breaking changes

🔮 Next steps

#skip-changelog

After the react-native-test job was moved from GitHub-hosted macos-26 to Cirrus Labs Tart VMs (macos-tahoe-xcode:26.2.0), iOS simulators take longer to fully boot in the new virtualised environment. With `wait_for_boot` defaulting to false, Maestro was racing to connect before the simulator was ready, causing different failures on each run.

- Add `wait_for_boot: true` to `futureware-tech/simulator-action` so the job blocks until the simulator has fully completed booting before Maestro connects.
- Bump `MAESTRO_DRIVER_STARTUP_TIMEOUT` from 120s to 180s to give additional headroom for the Cirrus Labs runner environment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

After crash.yml taps "Crash" (Sentry.nativeCrash()), the plain `launchApp` (without clearState) causes the app to crash immediately on relaunch (~82ms) because the Sentry SDK reads the pending crash report during initialisation and hits a failure path. This writes a second crash report on top of the first, triggering iOS's simulator crash-loop guard for the bundle ID.

The cascade:

1. nativeCrash → crash report #1 written
2. launchApp (no clearState) → app crashes on startup → crash report #2
3. Next test (captureMessage) gets the crash-loop ban → instant exit on launch

Fix: add `clearState: true` to the post-crash launchApp so Maestro reinstalls the app, clearing both the crash report and the crash-loop state before assertTestReady runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… VMs

The iOS E2E tests have been consistently failing since the migration to Cirrus Labs Tart VMs (c1cade4). The nested virtualisation makes the simulator slower to stabilise, causing Maestro's XCTest driver to lose communication with the app on first launches.

Two fixes:

1. Set erase_before_boot: false. Each Maestro flow already reinstalls the app via clearState, so erasing the entire simulator is redundant and adds overhead that destabilises the simulator on Tart VMs.
2. Add a warm-up step that launches and terminates Settings.app so that SpringBoard and other system services finish post-boot initialisation before Maestro connects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Cirrus Labs Tart VMs intermittently fail individual app launches — the app process exits before the JS bundle finishes loading, causing Maestro to report "App crashed or stopped". A single retry of the full suite is the most reliable way to absorb this flakiness. Also increased the warmup sleep from 3s to 5s to give SpringBoard more time to settle on the slow nested-virtualisation runners. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Instead of retrying the entire test suite, run each flow file individually with up to 3 attempts. This is more effective because different flows fail randomly on Tart VMs: retrying only the failed flow is faster and avoids re-running flows that already passed.

The CLI now:

1. Lists all .yml files in the maestro/ directory
2. Runs each flow with `maestro test <flow.yml>`
3. On failure, retries the same flow up to 2 more times
4. Prints a summary of all results at the end

Removes the suite-level retry wrapper from the workflow since per-flow retries in the CLI are more targeted.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
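The per-flow retry loop described above might look roughly like this. It is a minimal sketch rather than the code in cli.mjs: `runFlowWithRetries`, `runAllFlows`, and `FlowResult` are hypothetical names, and the function that actually invokes Maestro is injected as `runOnce` so the sketch makes no assumption about how the process is spawned.

```typescript
// Outcome of one flow after all of its attempts.
interface FlowResult {
  flow: string;
  passed: boolean;
  attempts: number;
}

// Run a single flow, retrying on failure, up to maxAttempts in total.
// `runOnce` should return true when the flow passes.
function runFlowWithRetries(
  flow: string,
  runOnce: (flow: string, attempt: number) => boolean,
  maxAttempts = 3,
): FlowResult {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (runOnce(flow, attempt)) {
      return { flow, passed: true, attempts: attempt };
    }
  }
  return { flow, passed: false, attempts: maxAttempts };
}

// Run every .yml flow on its own and collect the results for a final summary.
function runAllFlows(
  files: string[],
  runOnce: (flow: string, attempt: number) => boolean,
  maxAttempts = 3,
): FlowResult[] {
  return files
    .filter((file) => file.endsWith('.yml'))
    .sort()
    .map((flow) => runFlowWithRetries(flow, runOnce, maxAttempts));
}
```

In a real CLI, `runOnce` would presumably wrap a `maestro test <flow.yml>` invocation and report whether it exited successfully; the returned list can then be printed as the summary and used to set the exit code.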

Address CodeQL finding by using execFileSync with an argument array instead of execSync with a template string. This avoids shell interpolation of filesystem-sourced flow file names. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
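The difference between the two calling conventions can be illustrated with two pure helpers. This is a sketch under assumptions: `maestroArgs` and `shellCommand` are hypothetical names, and the actual call in cli.mjs may pass additional arguments or options.

```typescript
// Build the argument array for one flow. With execFileSync, each element is
// handed to the child process verbatim, so no shell ever parses the file name.
function maestroArgs(flowFile: string): string[] {
  return ['test', flowFile];
}

// For contrast: the single command string that execSync would hand to a shell,
// where characters such as ';' or '$(...)' in the file name gain meaning.
function shellCommand(flowFile: string): string {
  return `maestro test ${flowFile}`;
}
```

With the array form, a call along the lines of `execFileSync('maestro', maestroArgs(flow), { stdio: 'inherit' })` treats even an unusual file name as a single argument.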

…ners

- Increase MAESTRO_DRIVER_STARTUP_TIMEOUT to 180s for slow Tart VMs
- Add wait_for_boot and erase_before_boot: false to simulator-action
- Add simulator warm-up step before running iOS tests
- Sort spaceflight news envelopes by timestamp instead of arrival order
- Relax HTTP spans assertion to >= 1 (not all layers complete on slow VMs)
- Search all envelopes for app start transaction (may arrive separately)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

On slow Cirrus Labs Tart VMs, the app may crash during Maestro flow execution. Add up to 3 retries to handle transient app crashes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

App start transactions (origin: auto.app.start) have app_start_cold measurements but not time_to_initial_display/time_to_full_display. The filter already excluded ui.action.touch but not app start transactions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
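A filter of this kind might be written as below. The `Transaction` shape is a simplified assumption and `expectsTimeToDisplay` is a hypothetical name; treating `ui.action.touch` as the trace `op` (rather than the origin) is also an assumption, since the commit message does not say which field the existing filter checks.

```typescript
// Assumed, simplified transaction shape for illustration.
interface Transaction {
  contexts?: { trace?: { op?: string; origin?: string } };
}

// True when time_to_initial_display / time_to_full_display assertions should
// apply: touch interactions and app start transactions are skipped.
function expectsTimeToDisplay(tx: Transaction): boolean {
  const trace = tx.contexts?.trace;
  return trace?.op !== 'ui.action.touch' && trace?.origin !== 'auto.app.start';
}
```

A test could then apply its time-to-display expectations only to `transactions.filter(expectsTimeToDisplay)`.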

- Use nullish coalescing for httpSpans length check to avoid TypeError when spans is undefined
- Document maestro retry envelope contamination limitation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
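The nullish-coalescing guard could look something like this. It is a sketch only: the `Span` and `Transaction` shapes are simplified, `countHttpSpans` is a hypothetical helper, and matching on the `http.client` op is an assumption about how the test identifies HTTP spans.

```typescript
// Assumed, simplified shapes for illustration.
interface Span {
  op?: string;
}

interface Transaction {
  spans?: Span[];
}

// Count HTTP spans without throwing when `spans` is undefined:
// optional chaining short-circuits to undefined, and `?? 0` supplies the default.
function countHttpSpans(tx: Transaction): number {
  return tx.spans?.filter((span) => span.op === 'http.client').length ?? 0;
}
```

An assertion can then compare the count against the expected value, and a missing `spans` array surfaces as a failed expectation instead of a TypeError.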

The warm-up step is best-effort and should not fail the build if the Preferences app fails to launch or terminate. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use consistent comment and sleep 5 across both workflows, as suggested in PR review. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge both PRs that fix E2E test flakiness on Cirrus Labs Tart VMs:

- iOS E2E fixes: simulator warm-up, per-flow retries, crash-loop prevention (#5752)
- Sample app E2E fixes: increased timeouts, sorted envelopes, relaxed assertions (#5755)

Conflict resolution: kept Maestro 2.3.0 from main with 180s timeout from #5755.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-03-17T15:08:49Z

Semver Impact of This PR

⚪ None (no version bump detected)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).

This PR will not appear in the changelog.

🤖 This preview updates automatically when you update the PR.

Reverts whitespace-only changes (@{ } -> @{}) in ObjC files that cause clang-format CI failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ness-combined

github-actions · 2026-03-17T16:17:45Z

Android (legacy) Performance metrics 🚀

	Plain	With Sentry	Diff
Startup time	500.66 ms	494.86 ms	-5.80 ms
Size	43.75 MiB	48.08 MiB	4.32 MiB

Baseline results on branch: main

Startup times

Revision	Plain	With Sentry	Diff
`4a17c8f`+dirty	406.62 ms	400.58 ms	-6.04 ms
`df1f7df`+dirty	442.64 ms	427.16 ms	-15.48 ms
`a483f9f`+dirty	396.82 ms	453.28 ms	56.46 ms
`60cd796`+dirty	445.84 ms	492.45 ms	46.61 ms
`5c16cdc`+dirty	423.48 ms	452.35 ms	28.88 ms
`80e4616`+dirty	411.58 ms	462.12 ms	50.54 ms
`55b77fc`+dirty	411.87 ms	417.16 ms	5.29 ms
`bca62c0`+dirty	414.36 ms	451.06 ms	36.70 ms
`0b64753`+dirty	448.67 ms	474.61 ms	25.94 ms
`4e6d7d7`+dirty	480.73 ms	515.73 ms	35.00 ms

App size

Revision	Plain	With Sentry	Diff
`4a17c8f`+dirty	43.75 MiB	47.99 MiB	4.24 MiB
`df1f7df`+dirty	43.75 MiB	48.08 MiB	4.33 MiB
`a483f9f`+dirty	43.75 MiB	48.41 MiB	4.66 MiB
`60cd796`+dirty	43.75 MiB	48.07 MiB	4.32 MiB
`5c16cdc`+dirty	17.75 MiB	19.68 MiB	1.94 MiB
`80e4616`+dirty	43.75 MiB	48.55 MiB	4.80 MiB
`55b77fc`+dirty	43.75 MiB	47.99 MiB	4.24 MiB
`bca62c0`+dirty	43.75 MiB	48.41 MiB	4.66 MiB
`0b64753`+dirty	17.75 MiB	19.70 MiB	1.95 MiB
`4e6d7d7`+dirty	43.75 MiB	48.40 MiB	4.64 MiB

Previous results on branch: antonis/fix-e2e-flakiness-combined

Startup times

Revision	Plain	With Sentry	Diff
`7e6fe7f`+dirty	425.16 ms	471.08 ms	45.92 ms
`c96c5b7`+dirty	446.10 ms	449.47 ms	3.37 ms
`d34c279`+dirty	440.10 ms	477.96 ms	37.86 ms
`ca6a32c`+dirty	396.66 ms	442.52 ms	45.86 ms
`b6f917a`+dirty	468.08 ms	529.56 ms	61.48 ms

App size

Revision	Plain	With Sentry	Diff
`7e6fe7f`+dirty	43.75 MiB	48.08 MiB	4.32 MiB
`c96c5b7`+dirty	43.75 MiB	48.08 MiB	4.32 MiB
`d34c279`+dirty	43.75 MiB	48.32 MiB	4.57 MiB
`ca6a32c`+dirty	43.75 MiB	48.08 MiB	4.32 MiB
`b6f917a`+dirty	43.75 MiB	48.07 MiB	4.32 MiB

github-actions · 2026-03-17T16:26:47Z

iOS (legacy) Performance metrics 🚀

	Plain	With Sentry	Diff
Startup time	1214.98 ms	1218.98 ms	4.00 ms
Size	3.38 MiB	4.73 MiB	1.35 MiB

Baseline results on branch: main

Startup times

Revision	Plain	With Sentry	Diff
`ea3e26e`+dirty	1229.13 ms	1228.46 ms	-0.67 ms
`80e4616`+dirty	1221.32 ms	1225.64 ms	4.32 ms
`818a608`+dirty	1205.76 ms	1208.00 ms	2.24 ms
`77061ed`+dirty	1233.16 ms	1234.88 ms	1.71 ms
`bef3709`+dirty	1222.07 ms	1220.24 ms	-1.83 ms
`a206511`+dirty	1185.00 ms	1186.35 ms	1.35 ms
`74979ac`+dirty	1210.49 ms	1213.31 ms	2.82 ms
`a2bb688`+dirty	1223.53 ms	1232.90 ms	9.37 ms
`8a868fe`+dirty	1221.50 ms	1230.78 ms	9.28 ms
`d590428`+dirty	1211.77 ms	1220.51 ms	8.75 ms

App size

Revision	Plain	With Sentry	Diff
`ea3e26e`+dirty	3.41 MiB	4.58 MiB	1.17 MiB
`80e4616`+dirty	3.38 MiB	4.60 MiB	1.22 MiB
`818a608`+dirty	2.63 MiB	3.91 MiB	1.28 MiB
`77061ed`+dirty	2.63 MiB	3.98 MiB	1.34 MiB
`bef3709`+dirty	3.38 MiB	4.78 MiB	1.40 MiB
`a206511`+dirty	3.41 MiB	4.67 MiB	1.25 MiB
`74979ac`+dirty	3.38 MiB	4.60 MiB	1.22 MiB
`a2bb688`+dirty	2.63 MiB	3.99 MiB	1.36 MiB
`8a868fe`+dirty	3.38 MiB	4.60 MiB	1.22 MiB
`d590428`+dirty	3.38 MiB	4.78 MiB	1.39 MiB

Previous results on branch: antonis/fix-e2e-flakiness-combined

Startup times

Revision	Plain	With Sentry	Diff
`ca6a32c`+dirty	1229.87 ms	1228.40 ms	-1.47 ms
`d34c279`+dirty	1226.58 ms	1225.89 ms	-0.69 ms
`c96c5b7`+dirty	1194.79 ms	1193.75 ms	-1.04 ms
`7e6fe7f`+dirty	1194.50 ms	1192.54 ms	-1.96 ms
`b6f917a`+dirty	1230.39 ms	1223.63 ms	-6.76 ms

App size

Revision	Plain	With Sentry	Diff
`ca6a32c`+dirty	3.38 MiB	4.73 MiB	1.35 MiB
`d34c279`+dirty	3.38 MiB	4.72 MiB	1.34 MiB
`c96c5b7`+dirty	3.38 MiB	4.73 MiB	1.35 MiB
`7e6fe7f`+dirty	3.38 MiB	4.73 MiB	1.35 MiB
`b6f917a`+dirty	3.38 MiB	4.72 MiB	1.34 MiB

github-actions · 2026-03-17T16:39:26Z

iOS (new) Performance metrics 🚀

	Plain	With Sentry	Diff
Startup time	1220.13 ms	1222.59 ms	2.46 ms
Size	3.38 MiB	4.73 MiB	1.35 MiB

Baseline results on branch: main

Startup times

Revision	Plain	With Sentry	Diff
`ea3e26e`+dirty	1216.61 ms	1214.15 ms	-2.47 ms
`80e4616`+dirty	1206.90 ms	1205.94 ms	-0.96 ms
`818a608`+dirty	1218.84 ms	1223.18 ms	4.34 ms
`77061ed`+dirty	1210.77 ms	1218.45 ms	7.68 ms
`bef3709`+dirty	1217.79 ms	1225.33 ms	7.54 ms
`a206511`+dirty	1225.02 ms	1223.74 ms	-1.28 ms
`74979ac`+dirty	1212.33 ms	1212.54 ms	0.21 ms
`a2bb688`+dirty	1244.82 ms	1238.60 ms	-6.22 ms
`8a868fe`+dirty	1206.85 ms	1215.04 ms	8.19 ms
`d590428`+dirty	1221.23 ms	1225.27 ms	4.03 ms

App size

Revision	Plain	With Sentry	Diff
`ea3e26e`+dirty	3.41 MiB	4.58 MiB	1.17 MiB
`80e4616`+dirty	3.38 MiB	4.60 MiB	1.22 MiB
`818a608`+dirty	3.19 MiB	4.48 MiB	1.29 MiB
`77061ed`+dirty	3.19 MiB	4.54 MiB	1.36 MiB
`bef3709`+dirty	3.38 MiB	4.78 MiB	1.40 MiB
`a206511`+dirty	3.41 MiB	4.67 MiB	1.25 MiB
`74979ac`+dirty	3.38 MiB	4.60 MiB	1.22 MiB
`a2bb688`+dirty	3.19 MiB	4.56 MiB	1.37 MiB
`8a868fe`+dirty	3.38 MiB	4.60 MiB	1.22 MiB
`d590428`+dirty	3.38 MiB	4.78 MiB	1.39 MiB

Previous results on branch: antonis/fix-e2e-flakiness-combined

Startup times

Revision	Plain	With Sentry	Diff
`ca6a32c`+dirty	1231.83 ms	1241.28 ms	9.45 ms
`d34c279`+dirty	1210.63 ms	1224.85 ms	14.22 ms
`c96c5b7`+dirty	1223.89 ms	1228.02 ms	4.13 ms
`7e6fe7f`+dirty	1211.67 ms	1210.47 ms	-1.20 ms
`b6f917a`+dirty	1212.11 ms	1220.00 ms	7.89 ms

App size

Revision	Plain	With Sentry	Diff
`ca6a32c`+dirty	3.38 MiB	4.73 MiB	1.35 MiB
`d34c279`+dirty	3.38 MiB	4.72 MiB	1.34 MiB
`c96c5b7`+dirty	3.38 MiB	4.73 MiB	1.35 MiB
`7e6fe7f`+dirty	3.38 MiB	4.73 MiB	1.35 MiB
`b6f917a`+dirty	3.38 MiB	4.72 MiB	1.34 MiB

github-actions · 2026-03-17T16:43:16Z

Android (new) Performance metrics 🚀

	Plain	With Sentry	Diff
Startup time	376.63 ms	426.27 ms	49.64 ms
Size	43.94 MiB	48.93 MiB	5.00 MiB

Baseline results on branch: main

Startup times

Revision	Plain	With Sentry	Diff
`70250df`+dirty	418.08 ms	480.84 ms	62.76 ms
`8d89cc9`+dirty	357.69 ms	415.79 ms	58.10 ms
`1853710`+dirty	360.67 ms	396.28 ms	35.61 ms
`55b77fc`+dirty	410.46 ms	414.11 ms	3.65 ms
`69602ce`+dirty	375.37 ms	405.28 ms	29.91 ms
`c1573b3`+dirty	355.65 ms	448.82 ms	93.17 ms
`90afdd3`+dirty	367.79 ms	404.84 ms	37.05 ms
`955f2eb`+dirty	388.13 ms	433.56 ms	45.44 ms
`80e4616`+dirty	427.31 ms	461.15 ms	33.84 ms
`276d348`+dirty	356.30 ms	405.27 ms	48.97 ms

App size

Revision	Plain	With Sentry	Diff
`70250df`+dirty	43.94 MiB	48.91 MiB	4.97 MiB
`8d89cc9`+dirty	7.15 MiB	8.41 MiB	1.26 MiB
`1853710`+dirty	7.15 MiB	8.41 MiB	1.26 MiB
`55b77fc`+dirty	43.94 MiB	48.82 MiB	4.88 MiB
`69602ce`+dirty	7.15 MiB	8.41 MiB	1.26 MiB
`c1573b3`+dirty	7.15 MiB	8.42 MiB	1.27 MiB
`90afdd3`+dirty	7.15 MiB	8.43 MiB	1.28 MiB
`955f2eb`+dirty	7.15 MiB	8.42 MiB	1.27 MiB
`80e4616`+dirty	43.94 MiB	49.38 MiB	5.44 MiB
`276d348`+dirty	7.15 MiB	8.42 MiB	1.26 MiB

Previous results on branch: antonis/fix-e2e-flakiness-combined

Startup times

Revision	Plain	With Sentry	Diff
`7e6fe7f`+dirty	477.04 ms	520.53 ms	43.49 ms
`c96c5b7`+dirty	380.02 ms	436.37 ms	56.35 ms
`d34c279`+dirty	422.73 ms	453.91 ms	31.18 ms
`ca6a32c`+dirty	413.82 ms	475.83 ms	62.02 ms
`b6f917a`+dirty	367.91 ms	412.94 ms	45.03 ms

App size

Revision	Plain	With Sentry	Diff
`7e6fe7f`+dirty	43.94 MiB	48.93 MiB	5.00 MiB
`c96c5b7`+dirty	43.94 MiB	48.93 MiB	5.00 MiB
`d34c279`+dirty	43.94 MiB	49.18 MiB	5.24 MiB
`ca6a32c`+dirty	43.94 MiB	48.93 MiB	5.00 MiB
`b6f917a`+dirty	43.94 MiB	48.93 MiB	4.99 MiB

# Conflicts: # packages/core/android/libs/replay-stubs.jar

…ness-combined

Revert the relaxation of the HTTP spans check from >= 1 back to exactly 2. The other fixes (simulator warm-up, wait_for_boot, clearState, per-flow retries) address the actual flakiness; weakening this assertion would permanently hide regressions where one tracing layer stops producing spans. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

.github/workflows/e2e-v2.yml

- Add platform-check skip guard to warm-up steps in both e2e-v2 and sample-application workflows to avoid wasting CI time when skipped
- Write maestro debug logs to per-flow/per-attempt dirs so failed attempt logs are preserved for debugging
- Use path.parse() for flow name extraction
- Add empty results guard in cli.mjs
- Remove retry logic from sample app maestro.ts to avoid mock server envelope accumulation across retries (retries stay in cli.mjs only)
- Revert unrelated expo/app.json formatting change

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

antonis · 2026-03-24T11:28:54Z

@cursor review

antonis · 2026-03-24T11:29:06Z

@sentry review

cursor

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

…up flow

The first Maestro flow after simulator boot consistently fails because the app isn't fully ready ("E2E Tests Ready" not visible). The Settings.app warm-up didn't help because it only warms SpringBoard, not the XCTest driver or the test app itself.

Replace both the Settings.app warm-up step (in e2e-v2 and sample-application workflows) and the per-flow retry logic (in cli.mjs) with a dedicated Maestro warm-up flow that launches the actual test app and waits for "E2E Tests Ready" before running the real test suite.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Now that the warm-up flow eliminates the first-launch flakiness, remove the per-flow orchestration, results tracking, and summary logging that were only needed for retry support. The only change vs main is the warm-up flow call before the existing `maestro test maestro` command. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Running all flows via `maestro test maestro` shares a single runner session. When crash.yml kills the app, Maestro's XCTest driver loses the connection and subsequent flows fail with "App crashed or stopped". Run each flow in its own maestro process instead. This is the minimal change needed — no retries, no summary, just per-flow isolation with a warm-up flow before the test suite. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

After nativeCrash() the app process is dead but Maestro's XCTest driver may be in a bad state. Adding killApp before launchApp with clearState ensures a clean restart. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The post-crash launchApp + assertTestReady was unreliable on Tart VMs because Maestro's XCTest driver gets into a bad state after nativeCrash(). Since each flow now runs in its own maestro process, the next flow starts fresh regardless — the post-crash recovery is unnecessary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The warm-up flow helps but doesn't fully eliminate first-launch timing issues on Cirrus Labs Tart VMs. On especially slow boots, the warm-up itself can fail and the first few flows fail before the simulator stabilises. Retry each flow up to 3 times to handle this. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The sample-application workflow uses its own test runner (maestro.ts) and doesn't go through cli.mjs, so it doesn't benefit from the Maestro warm-up flow. Restore the Settings.app warm-up step with the proper platform-check skip guard to keep it resilient against Tart VM timing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The Maestro warm-up flow alone isn't reliable enough — it depends on Maestro's XCTest driver which itself needs the simulator to be ready. Add a Settings.app warm-up step to e2e-v2.yml (matching sample-application) to warm up OS-level services first, then the Maestro warm-up flow handles the XCTest driver and test app. Remove per-flow retries — the layered warm-up approach (Settings.app + Maestro warm-up flow) should be sufficient. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

antonis and others added 18 commits March 3, 2026 15:26

Merge branch 'main' into antonis/e2e-ios-flakiness-fix

56f999d

Merge branch 'main' into antonis/e2e-ios-flakiness-fix

4a309b3

fix(e2e): Add retry logic to sample app Maestro test runner

564e323

Merge branch 'main' into antonis/sample-e2e-flakiness-fix

19770a2

fix(e2e): Address PR review feedback

2101a2c

Merge branch 'main' into antonis/sample-e2e-flakiness-fix

73bdf0c

fix(ci): Add || true to simulator warm-up commands

4d9b775

fix(ci): Align simulator warm-up with e2e-v2 workflow

9112cb8

Merge branch 'main' into antonis/e2e-ios-flakiness-fix

f83dd9a

antonis added the ready-to-merge label (Triggers the full CI test suite) on Mar 17, 2026

This was referenced Mar 17, 2026

fix(ci): Fix Sample Application E2E test flakiness #5755

Closed

fix(ci): Fix consistent iOS E2E flakiness on Cirrus Labs runners #5752

Closed

antonis and others added 2 commits March 17, 2026 16:23

fix(ios): Revert ObjC formatting changes that fail CI lint

e248f7a

Merge remote-tracking branch 'origin/main' into antonis/fix-e2e-flaki…

6517e21

…ness-combined

antonis added 2 commits March 18, 2026 16:55

Merge branch 'main' into antonis/fix-e2e-flakiness-combined

0f7cc11

Revert unneeded stubs change

ce364f0

antonis and others added 5 commits March 19, 2026 16:54

Merge branch 'main' into antonis/fix-e2e-flakiness-combined

ba80648

Merge branch 'main' into antonis/fix-e2e-flakiness-combined

bb24132

Merge branch 'main' into antonis/fix-e2e-flakiness-combined

01c920e

Merge remote-tracking branch 'origin/main' into antonis/fix-e2e-flaki…

97d17ea

…ness-combined

antonis changed the title from "fix(ci): Fix E2E test flakiness on Cirrus Labs runners" to "fix(ci): Fix E2E test flakiness" on Mar 24, 2026

cursor bot reviewed Mar 24, 2026


cursor bot reviewed Mar 24, 2026

antonis and others added 9 commits March 24, 2026 13:04

fix(e2e): Kill app before relaunch in crash flow

36b3cb6

Merge branch 'main' into antonis/fix-e2e-flakiness-combined

41a7f1c
