Restore refresh-token silent renew with renew-or-signout on 401#5700
Draft
samsondav wants to merge 2 commits into
Draft
Restore refresh-token silent renew with renew-or-signout on 401#5700samsondav wants to merge 2 commits into
samsondav wants to merge 2 commits into
Conversation
…on-network#5682) PR canton-network#5545 set `automaticSilentRenew={false}` on the `react-oidc-context` provider at the same time as it removed `offline_access` from the scope request. With both in place, the UIs no longer use any refresh token, so users are forced to re-authenticate every time the access token expires - every 5 minutes against a default Keycloak deployment. This change restores token renewal across both IdP shapes without introducing the iframe-based silent-auth fallback that is broken by default on Safari ITP and Firefox Total Cookie Protection. Two layers: 1. Re-enable `automaticSilentRenew` on `OidcAuthProvider`. The library uses the `refresh_token` grant when one is present in the stored `User`. The iframe fallback (oidc-client-ts.js:2540) is structurally disabled because Splice does not pass `silent_redirect_uri`, so `signinSilent()` throws "No silent_redirect_uri configured" instead of attempting the iframe path. 2. Replace `signoutOnExpiry` with `renewOrSignout` in `UserProvider`. On `splice:auth-expired` (fired from `BaseApiMiddleware` / `LedgerApiContext` on any 401), the handler attempts one `auth.signinSilent()` before giving up and signing the user out. This recovers two scenarios: - The proactive `automaticSilentRenew` timer never fired (suspended tab, clock skew, or the access token expired before the `accessTokenExpiring` notification window). - A request was sent with a valid token but reached the server after expiry, producing a 401 concurrent with an in-flight silent renew. Single-flighted via `renewInFlight` so concurrent 401s from a fan-out collapse into one `signinSilent` call. On successful refresh we `invalidateQueries` so queries that already failed (react-query's `retryQuery` short-circuits on 401) pick up the new token. `enable_offline_scope: boolean` is added as an optional rs256 `AuthConfig` field for Auth0 deployments where the IdP only returns a refresh token when `offline_access` is explicitly requested. Default false (Keycloak issues a session-bound refresh token without it; using `offline_access` requests a long-lived offline refresh token, a larger XSS surface for browser-based clients). Tests: - `AuthProvider.test.tsx`: asserts `automaticSilentRenew=true` and that `silent_redirect_uri` is not passed. - `UserContextRenewOrSignout.test.tsx`: `signinSilent` is called when a refresh token is present; `removeUser` is called when not; failure path falls through to signout; concurrent 401 events collapse to a single `signinSilent` call. - `authSchema.test.ts`: parse behavior of `enable_offline_scope`. - `oidcAuthToProviderProps.test.ts`: scope construction with and without the opt-in, the `daml_ledger_api` no-openid carve-out, and no-leak of the field into provider props. Fixes canton-network#5682 Signed-off-by: Sam Davies <sam@avrofi.com>
After running a full review fleet (correctness, security, test coverage,
style, cross-cutting, holistic, codex), several findings:
- Codex high finding: with `automaticSilentRenew={true}` AND the new
401-driven `renewOrSignout`, both paths can fire `signinSilent`
concurrently. With IdPs that rotate refresh tokens, the second call
fails (refresh token already exchanged) and the catch block signs the
user out even though the first renew succeeded. Fix: drop
`automaticSilentRenew` to `false`. The 401-driven path becomes the
only renewal mechanism, single-flighted via `renewInFlight`. Trade:
one extra 401 round-trip per token expiry per active session;
eliminates the rotation collision.
- Correctness + codex: `invalidateQueries()` was called immediately
after `signinSilent` resolved, before `userAccessToken` state had
propagated. Refetches went out through clients still holding the
stale token and produced a second 401. Fix: defer invalidation via
a `pendingInvalidation` ref set in renewOrSignout and consumed in a
new effect that watches `userAccessToken`.
- Cross-cutting + codex critical: `apps/sv/frontend/.../redirect.test.tsx`
rendered `UserProvider` without wrapping it in `QueryClientProvider`.
After this PR's `useQueryClient()` call inside UserProvider, the test
throws at render. Fix: wrap the test harness in QueryClientProvider.
- Cross-cutting: `apps/sv/frontend/package.json` did not declare
`@tanstack/react-query` directly, relying on transitive hoist.
Version drift could give two react-query instances in the bundle.
Fix: pin `@tanstack/react-query: 5.90.20` matching the other apps.
- Style: `AuthProvider.tsx` comment said "structurally disabled" — a
vague LLM-tell phrase. Rewritten to state the mechanical fact
directly. The `renewOrSignout` block comment was trimmed from 10
lines to 6.
- Test coverage: added assertions that `invalidateQueries` fires on
successful refresh (after the deferred-effect path completes) and is
NOT called on the signout path. Added a listener-cleanup test that
ensures `signinSilent` is not invoked after unmount.
Signed-off-by: Sam Davies <sam@avrofi.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Alternative to #5683 — same bug, different design. Posting both so maintainers can pick.
PR #5545 set
automaticSilentRenew={false}on thereact-oidc-contextprovider and removedoffline_accessfrom the scope request. Either change on its own would have been fine; together they leave the UIs with no refresh-token mechanism at all, so every access-token expiry forces a full re-login (5 minutes on a default Keycloak deployment).This PR restores token renewal for both Keycloak and Auth0 without re-enabling the library's timer-based silent renew (which races the new 401 handler under refresh-token rotation, see "Why a single renewal path" below).
What changes
1.
renewOrSignoutinUserProvider(UserContext.tsx). Onsplice:auth-expired(fired fromBaseApiMiddleware/LedgerApiContexton any 401), try oneauth.signinSilent()before signing the user out. Single-flighted via arenewInFlightref so concurrent 401s collapse to one grant. On success, react-query cache invalidation is deferred until the refreshed access token has propagated intouserAccessTokenstate — a separate effect watches that state and firesqueryClient.invalidateQueries()so the deferred invalidation doesn't race the state update.2. Library's
automaticSilentRenewstaysfalse(matches what #5545 set). With Splice owning the renewal path, there is no need for the library's proactive timer — and re-enabling it would race the 401 handler (next section).3. Optional
enable_offline_scope: booleanon the rs256AuthConfigfor Auth0 deployments where the IdP only returns a refresh token whenoffline_accessis in the scope. Default false: Keycloak issues a session-bound refresh token without it, and on Keycloakoffline_accessproduces a long-lived offline refresh token (larger XSS surface in a browser-based SPA).Why a single renewal path
An earlier revision of this PR re-enabled
automaticSilentRenew={true}alongsiderenewOrSignout, on the theory that the proactive timer would cover normal expiry and the 401 handler would cover edge cases (suspended tabs, clock skew). Adversarial review surfaced a race: the library timer firessigninSilentat exp - 60s using the stored refresh token; a near-simultaneous user request 401s and the handler fires its ownsigninSilent. With IdPs that rotate refresh tokens (Auth0 by default, Keycloak when configured), one of the two concurrent grants is rejected — the refresh token was already exchanged. The handler's catch block then signs the user out even though the other renewal succeeded.Dropping
automaticSilentRenewtofalseand routing all renewal through the single-flighted handler eliminates the race. Trade: one extra 401 round-trip per token expiry per active session (the user's first request after expiry takes a renewal + retry instead of using a pre-refreshed token). That cost is bounded and the net UX is far better than the forced re-login this PR replaces.How this compares to #5683
#5683 takes a different approach: keep
automaticSilentRenew={true}and makeenable_offline_scopea mandatory operator-set boolean. That forces the Auth0-vs-Keycloak decision at configuration time but leaves two cases open:offline_accessopt-in: no refresh token, silent renew has nothing to use, user is logged out at access-token TTL.This PR keeps the schema field optional (no breaking config change) and uses the
renewOrSignoutdesign to handle both cases at runtime. Pick whichever shape fits.Testing
New tests under
apps/common/frontend/src/__tests__/auth/:UserContextRenewOrSignout.test.tsx: signinSilent called when a refresh token is present;invalidateQueriesfires after token state propagates (deferred path); removeUser called when no refresh token; failure path falls through to signout;invalidateQueriesnot called on the signout path; single-flight under concurrent expired events; listener removed on unmount.AuthProvider.test.tsx: assertsautomaticSilentRenew=false(the design intent).authSchema.test.ts: parse behaviour of optionalenable_offline_scope.oidcAuthToProviderProps.test.ts: scope construction with and without the opt-in.Existing 401-fallback coverage in
BaseApiMiddleware.test.tsandLedgerApiClient.test.tsis untouched — the handler still firessplice:auth-expiredon 401, only the handler's response changes.apps/sv/frontend/src/__tests__/auth/redirect.test.tsxis updated to wrapUserProviderinQueryClientProvider(now required becauseUserProvidercallsuseQueryClient).apps/sv/frontend/package.jsonpins@tanstack/react-query: 5.90.20directly to match the other apps and prevent singleton-resolution drift.References
oidc-client-ts.js:2533(if (user?.refresh_token) return this._useRefreshToken(state)).oidc-client-ts.js:2540(throws "No silent_redirect_uri configured" when no refresh token and nosilent_redirect_uri— Splice never setssilent_redirect_uri, so the iframe path is unreachable).offline_accessto receive a refresh token: https://auth0.com/docs/secure/tokens/refresh-tokens/get-refresh-tokensFixes #5682