Skip to content

mobile: gate all c.bl.* blockchain methods through beginOp to prevent nil-deref SIGSEGV racing Shutdown (#241 follow-up) #242

@ehsan6sha

Description

@ehsan6sha

Summary

The ~40 c.bl.* blockchain/hardware methods on the mobile Client (in mobile/blockchain.go) are not gated by beginOp(), so they can run concurrently with Shutdown() and dereference a torn-down client → Go nil-pointer SIGSEGV → SIGABRT (hard crash on Android).

PR #241 added the beginOp/inflight-drain/ErrClientClosed lifecycle gate but only applied it to ConnectToBlox and Ping. Every blockchain read (BloxFreeSpace, GetFolderSize, GetDatastoreSize, ListActivePlugins, GetClusterInfo, PoolList, AccountExists, …) still does ctx := context.TODO(); return c.bl.X(ctx, c.bloxPid, …) with no closed-check, so they race Shutdown's c.h.Close() / c.ds.Close().

Impact (downstream)

In the fx-components box app, one shared native client is reused across bloxes. Re-setting-up the currently-selected blox runs initFula = logout + Shutdown + newClient while the still-mounted Blox screen keeps polling BloxFreeSpace/GetFolderSize/ListActivePlugins. The in-flight read nil-derefs the closing client. On-device logcat:

ReactNativeJS: Error listActivePlugins: ... all dials failed
ReactNative:   Previous Fula client shutdown successfully.
ReactNative:   Creating a new Fula instance / Fula initialized
E Go:  panic: runtime error: invalid memory address or nil pointer dereference
E Go:  [signal SIGSEGV: segmentation violation code=0x1 addr=0xb8 ...]
F libc: Fatal signal 6 (SIGABRT) ... land.fx.blox

Reproduced deterministically in a unit test: calling BloxFreeSpace() on a client after Shutdown() panics with nil pointer dereference ... BloxFreeSpace(0x0?) — identical to the field crash.

Fix

Gate all 44 c.bl.* methods through the existing beginOp():

ctx, done, err := c.beginOp()
if err != nil {
    return nil, err  // ErrClientClosed once Shutdown has started
}
defer done()
return c.bl.X(ctx, c.bloxPid, ...)

This (1) rejects calls that start after Shutdown with ErrClientClosed instead of nil-dereferencing, and (2) registers each call in inflight so Shutdown waits for in-flight calls to finish (and opCancel cancels them) before freeing the host/datastore. Two methods need bespoke handling: ReplicateInPool (returns bare []byte, so returns an error-JSON on closed) and ChatWithAI (gates the setup call but keeps context.TODO() for the streaming buffer that outlives the call).

Tests

mobile/client_shutdown_test.go: TestBlockchainMethodsAfterShutdownReturnErrClosed — table test asserting BloxFreeSpace/GetFolderSize/GetDatastoreSize/ListActivePlugins/GetClusterInfo/PoolList/AccountExists/ChatWithAI all return ErrClientClosed after Shutdown. Verified to panic (reproduce the crash) when the gate is removed, and pass with it.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions