fix: add IDB healing mechanism for backing store corruption and connection lost errors#780
Conversation
bd7a14f to
32a6cc8
Compare
| * https://github.com/Expensify/App/issues/87862 | ||
| */ | ||
| function isBackingStoreError(error: unknown): boolean { | ||
| return error instanceof DOMException && error.name === 'UnknownError' && error.message.includes('Internal error opening backing store'); |
There was a problem hiding this comment.
| return error instanceof DOMException && error.name === 'UnknownError' && error.message.includes('Internal error opening backing store'); | |
| return error instanceof Error && error.message.includes('Internal error opening backing store'); |
I think we can simplify to this
| * Detects the Chromium-specific IDB backing store corruption error. | ||
| * Fires when LevelDB files backing IndexedDB are corrupted and Chrome's | ||
| * internal recovery (RepairDB -> delete -> recreate) also fails. | ||
| * https://github.com/Expensify/App/issues/87862 |
There was a problem hiding this comment.
| * https://github.com/Expensify/App/issues/87862 |
I dont think its needed to link issue here
| executeTransaction(txMode, callback) | ||
| .then(resetHealBudget) | ||
| .catch((error) => { | ||
| if (error instanceof DOMException && error.name === 'InvalidStateError') { |
There was a problem hiding this comment.
| if (error instanceof DOMException && error.name === 'InvalidStateError') { | |
| if (error instanceof Error && error.name === 'InvalidStateError') { |
Same
|
Adds a Dexie-style heal pattern to createStore for Chromium's Internal error opening backing store error (884K errors/month). - isBackingStoreError() detects the Chromium-specific corruption - Shared healAttemptsRemaining counter (3, reset on success) - On backing store error: clear cached connection, retry once - Clear dbp on rejection so retries get fresh indexedDB.open() - 5 new tests: mid-session heal, init heal, budget exhaustion, budget reset, error classification No deleteDatabase(), no provider swap, no UI changes. Scoped to IDBKeyValProvider only -- SQLite provider untouched. Ref: Expensify/App#90636 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Capture dbp reference before attaching reject handler; only clear if dbp hasn't been replaced by a concurrent heal/retry (prevents stale rejection handler from clearing a newer promise) - Add comment documenting concurrent store() budget drain behavior - Fix test formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ose) The heal path clears the cached dbp and reopens via indexedDB.open(), but does not call db.close() on the old IDBDatabase. Updated comments and log messages from 'close + reopen' to 'drop cached connection and reopen' to match what the code actually does. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- isBackingStoreError: use Error instead of DOMException, drop .name check - InvalidStateError catch: same simplification - Remove issue link from JSDoc Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add isConnectionLostError() to detect 'Connection to Indexed Database server lost' and 'Connection is closing' — Safari/WebKit errors that fire when the browser terminates IDB connections for backgrounded tabs. Uses the same heal-and-retry mechanism as backing store corruption: drop cached dbp, retry once with fresh indexedDB.open(), shared budget. Addresses Expensify/App#87864. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3a900d8 to
beb75d5
Compare
|
|
@leshniak let's handle |
Extract module-level helpers: isInvalidStateError, isBudgetedHealError, getBudgetedHealErrorLabel. Extract cacheOpenPromise to deduplicate rejected-promise cleanup in getDB and verifyStoreExists. Pure refactor — no behavior change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
beb75d5 to
d1f95a0
Compare
- Heal attempts: logAlert with error type, action taken, remaining budget - Heal success: logInfo confirming recovery after each error type - Budget exhaustion: explicit logAlert when heal budget drains - Non-recoverable errors: logAlert with error details - Updated test assertions to match new log messages and levels Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Probe start: logInfo when tab becomes visible and probe begins - Probe healthy: logInfo confirming connection is healthy - Probe stale: logAlert with error details when stale connection detected - Heal attempts/success/exhaustion/non-recoverable: same as Expensify#780 - Updated test assertions to match new log messages and levels Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fabioh8010
left a comment
There was a problem hiding this comment.
- I checked the E/App PR and description looks broken – also we are still missing videos/evidence there
- Please add the same videos/evidence to this PR description in order for CI to pass.
| function isConnectionLostError(error: unknown): boolean { | ||
| if (!(error instanceof Error)) return false; | ||
| const msg = error.message.toLowerCase(); | ||
| return msg.includes('connection to indexed database server lost') || msg.includes('connection is closing'); |
There was a problem hiding this comment.
Using msg.includes('connection is closing') means we will include other connection closing errors too, e.g.:
- InvalidStateError: Failed to execute 'transaction' on 'IDBDatabase': The database connection is closing. (initially handled in fix the database connection is closing issue. #748)
- UnknownError: Connection is closing because of: IO error:
- UnknownError: Connection is closing because of: Failed to remove blob file.
- UnknownError: Connection is closing.
- UnknownError: Connection is closing because of: Force close delete origin
- UnknownError: Connection is closing because of: Corruption: block checksum mismatch
It's intended, right?
Only budget exhaustion and non-recoverable errors should trigger alerts. Heal attempts are handled gracefully and only need informational logging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Details
Adds an IDB healing mechanism in
createStore.ts— the IDB connection manager forIDBKeyValProvider— addressing two sibling storage error classes:UnknownError: Internal error opening backing store for indexedDB.open.— 884K errors/month, 26.3% of all storage errors, Chrome/Edge only (investigation, solution design)UnknownError: Connection to Indexed Database server lost.— 637K errors/month, 19% of all storage errors, 100% WebKit (investigation, solution)Both errors share the same structural problem: a stale cached
dbpis never cleared, so all retries reuse the dead/corrupt connection. This is a Dexie-style heal pattern (PR1398_maxLoop).What it does:
isBackingStoreError()— detects Chromium LevelDB corruption ('Internal error opening backing store')isConnectionLostError()— detects Safari/WebKit connection termination ('connection to indexed database server lost','connection is closing'). WebKit #197050, #201483healAttemptsRemainingcounter (3 attempts, reset on every successful IDB operation)dbp, retryexecuteTransactiononce (forces a freshindexedDB.open())dbp(capture reference, only clear if unchanged)dbpon rejection ingetDB()andverifyStoreExists(fixes pre-existing bug where a cached rejected promise caused infinite failures)Logger.logAlert/logInfoat every step: error detection, heal attempt, successful recovery, budget exhaustion, and non-recoverable errorsWhat it does NOT do (by design per #90636):
deleteDatabase()— proven to also fail when LevelDB files are corruptMemoryOnlyProviderdegradation — cache already absorbs all writes during the sessionstorage/index.tsorOnyxUtils.ts— those are separate issues (#90632, #90633)E/App Test PR
Expensify/App#91085 — pins
react-native-onyxto this PR's HEAD commit for integration testing.Related Issues
Expensify/App#90636
Expensify/App#87862
Expensify/App#87864
Automated Tests
19 tests in
tests/unit/storage/providers/createStoreTest.ts:InvalidStateError retry (5): retry + succeed, retry + propagate, non-InvalidState DOMException skipped, non-DOMException skipped, data integrity after retry
Diagnostic logging (1): alert with all fields on retry
Onclose/onversionchange handlers (4): log + recover for each
Backing store healing (5): mid-session heal, init-time heal, budget exhaustion, budget reset, error classification (wrong message / QuotaExceeded bypass)
Connection lost healing (4): heal + reopen, budget exhaustion, "connection is closing" variant, shared budget with backing store
All tests pass.
Manual Tests
Each test injects a monkey-patch via the browser console to simulate the error class, then triggers an Onyx write (e.g. navigate to a different chat). No code changes needed.
Test 1 — Backing store error (Chromium path):
[Onyx] IDB heal: backing store error detected — dropping cached connection and reopening (2 attempts left)[Onyx] IDB heal: successfully recovered after backing store errorTest 2 — Connection lost (Safari path):
[Onyx] IDB heal: connection lost error detected — dropping cached connection and reopening (2 attempts left)[Onyx] IDB heal: successfully recovered after connection lost errorTest 3 — Budget exhaustion (3 consecutive failures, then gives up):
[Onyx] IDB heal: backing store error detected — dropping cached connection and reopening (2 attempts left)[Onyx] IDB heal: backing store error detected — dropping cached connection and reopening (1 attempts left)[Onyx] IDB heal: backing store error detected — dropping cached connection and reopening (0 attempts left)[Onyx] IDB heal: backing store error — heal budget exhausted, giving upAuthor Checklist
### Related Issuessection aboveTestssectiontoggleReportand notonIconClick)myBool && <MyComponent />.STYLE.md) were followedAvatar, I verified the components usingAvatarare working as expected)/** comment above it */thisproperly so there are no scoping issues (i.e. foronClick={this.submit}the methodthis.submitshould be bound tothisin the constructor)thisare necessary to be bound (i.e. avoidthis.submit = this.submit.bind(this);ifthis.submitis never passed to a component event handler likeonClick)Avataris modified, I verified thatAvataris working as expected in all cases)mainbranch was merged into this PR after a review, I tested again and verified the outcome was still expected according to theTeststeps.Screenshots/Videos
Android: Native
N/A — IDB is web-only, no native changes.
Android: mWeb Chrome
iOS: Native
N/A — IDB is web-only, no native changes.
iOS: mWeb Safari
MacOS: Chrome / Safari
Healed (simulated error):
healed.mp4
Killed connection (simulated error):
killed.mp4
Exhausted (simulated error):
exhausted.mp4
Healed (simulated on Safari):
safari_error.mp4
Killed connection (simulated on Safari):
safari_killed.mp4
Exhausted (simulated on Safari):
safari_exhausted.mp4