Skip to content

feat(jwt): public API — IUserJwtInvalidatedListener, updateUserJwt, login JWT (5/6)#2633

Open
nan-li wants to merge 5 commits intofeat/iv-http-executors-04from
feat/iv-public-api-05
Open

feat(jwt): public API — IUserJwtInvalidatedListener, updateUserJwt, login JWT (5/6)#2633
nan-li wants to merge 5 commits intofeat/iv-http-executors-04from
feat/iv-public-api-05

Conversation

@nan-li
Copy link
Copy Markdown
Contributor

@nan-li nan-li commented May 1, 2026

Description

One Line Summary

PR 5 of 6 against the Identity Verification integration branch — exposes the developer-facing IV API (IUserJwtInvalidatedListener, updateUserJwt, login JWT storage) and bridges PR 3's JwtTokenStore invalidation events to a multi-subscriber EventProducer on UserManager.

Details

Motivation

PR 3 moved the JWT-invalidation event source onto JwtTokenStore.invalidateJwt and removed the cross-cutting setJwtInvalidatedHandler from IOperationRepo. The replacement was supposed to be: UserManager subscribes to JwtTokenStore directly, and exposes the developer-facing event surface. PR 4 added executor-side JWT attachment but left the API unwired.

This PR closes the loop:

  1. Stores the JWT. OneSignal.login(externalId, jwt) finally writes the token to JwtTokenStore so PR 4's executor extensions can find it. Also handles same-user-with-new-token (refresh) and adds a dedicated updateUserJwt(externalId, token) method for the post-401 refresh case.
  2. Surfaces invalidation events. JwtTokenStore.onJwtInvalidatedUserManagerEventProducer<IUserJwtInvalidatedListener> → developer listener. Multi-subscriber (matches the codebase's existing EventProducer idiom). Async dispatch on Dispatchers.Default so the developer's callback doesn't run on the SDK's internal thread.
  3. Listener replay for late registration (fix(jwt): race conditions and IAM reliability issues with identity verification enabled #2613 commit 2 equivalent). If the SDK invalidates a JWT before the developer's listener is wired up (typical during early app startup), the cached event is replayed to listeners that subscribe later. Cleared on logout.

Scope

New public types (in com.onesignal):

  • IUserJwtInvalidatedListenerfun interface for Kotlin SAM-conversion convenience
  • UserJwtInvalidatedEvent(externalId: String) — event payload

IOneSignal gains three methods:

  • updateUserJwt(externalId, token) — store JWT + force-execute the queue so deferred ops dispatch with the fresh token.
  • addUserJwtInvalidatedListener(listener) — multi-subscriber, with replay on late registration.
  • removeUserJwtInvalidatedListener(listener).

The pre-existing login(externalId, jwtBearerToken) overload + loginSuspend were already in IOneSignal; this PR finally wires the JWT param through to JwtTokenStore.

UserManager:

  • Implements IJwtUpdateListener and subscribes to JwtTokenStore in init — replaces PR 3's setJwtInvalidatedHandler bridge with a direct event-source subscription.
  • Holds an EventProducer<IUserJwtInvalidatedListener> and a CoroutineScope(SupervisorJob() + Dispatchers.Default) for async fan-out.
  • fireJwtInvalidated(externalId) — caches the event for replay, schedules async fire with per-listener runCatching (one throwing subscriber doesn't poison others — matches the same pattern from PR 3's JwtTokenStore.invalidateJwt fix).
  • addJwtInvalidatedListener — subscribes, then if lastJwtInvalidatedExternalId != null, schedules a replay to the new listener.
  • clearLastJwtInvalidated() — clears the replay cache. Called from OneSignalImp.logout() / logoutSuspend() so a stale event from a previous user doesn't fire for a new session.
  • New constructor param: JwtTokenStore. DI registration adds provides<UserManager>() so OneSignalImp can resolve the concrete class for listener delegation.

LoginHelper:

  • New constructor param: JwtTokenStore.
  • In switchUser(externalId, jwtBearerToken):
    • New-user path: stores the JWT before userSwitcher.createAndSwitchToNewUser so when the queued LoginUserOperation dispatches, the JWT lookup in hasValidJwtIfRequired already succeeds.
    • Same-user path (no switch needed): still updates the stored token if a new JWT was supplied — supports developers calling login(sameId, refreshedJwt) to refresh.
  • Class visibility tightened to internal class (was public) since the constructor now exposes internal JwtTokenStore.

OneSignalImp:

  • Injects JwtTokenStore via services.getService<JwtTokenStore>() (it was already in DI from PR 1).
  • Implements the three new IOneSignal methods, delegating listener add/remove to services.getService<UserManager>().
  • logout() and logoutSuspend() call clearLastJwtInvalidated() before the user-switch so subsequent listener registrations don't replay the previous user's invalidation.

Layering / consistency

  • Replaces PR 3's plumbing setter (setJwtInvalidatedHandler is already gone from IOperationRepo); the bridge is now JwtTokenStore.IJwtUpdateListener.onJwtInvalidated → UserManager → EventProducer.
  • Migration scenario from feat: identity verification 5.8 #2599 (HYDRATE-IV-required + existing externalId + no JWT → fire listener) is intentionally not ported — verified iOS does not do this either (only fires on actual 401s). Customers enabling IV are expected to ship code that supplies JWTs for active users; the SDK only signals on actual 401 responses.

What is NOT in this PR

  • Logout IV-aware behavior (disable push sub on IV-required logout) + IAM IV integration + RYW plumbing — PR 6 (final).
  • The LoginHelper.switchUser JWT optimization for the migration scenario — out of scope (we don't fire the migration event, see above).

Testing

Unit testing

UserManagerTests (4 new):

  • JwtTokenStore.invalidateJwt fires registered IUserJwtInvalidatedListener — end-to-end via the real JwtTokenStore + UserManager subscription.
  • listener replay: subscribers added after fire receive the most recent event — verifies the replay cache is consulted on addJwtInvalidatedListener.
  • clearLastJwtInvalidated stops replay for new subscribers — verifies the logout-path clears the cache.
  • removeJwtInvalidatedListener stops further notifications — verifies unsubscribe.

LoginHelperTests (2 new):

  • login with JWT stores token in JwtTokenStore before enqueueing op — new-user path + JWT storage.
  • login with same externalId + new JWT updates the stored token — same-user-refresh path; asserts no user-switch happens but JWT is updated.

All affected tests pass locally. Two pre-existing SDKInitTests failures are unrelated and consistent with the integration branch baseline.

Manual testing

Not applicable for this PR — the wired listener fires only when JwtTokenStore.invalidateJwt is called, which happens from PR 4's handleFailUnauthorized in response to a 401. End-to-end manual testing requires the integration branch to be merged + a backend with jwt_required=true.

Affected code checklist

  • Notifications
  • Outcomes
  • Sessions
  • In-App Messaging
  • REST API requests
  • Public API changes (new: IUserJwtInvalidatedListener, UserJwtInvalidatedEvent, IOneSignal.updateUserJwt, IOneSignal.addUserJwtInvalidatedListener, IOneSignal.removeUserJwtInvalidatedListener. Existing login(externalId, jwt) semantics: JWT now actually stored.)

Checklist

Overview

  • I have filled out all REQUIRED sections above
  • PR does one thing — wires the developer-facing IV API on top of PR 3's invalidation event source and PR 4's executor JWT attachment.
  • Any Public API changes are explained in the PR details and conform to existing APIs (matches the addObserver/removeObserver shape used by IUserStateObserver, etc.)

Testing

  • I have included test coverage for these changes
  • All automated tests pass locally (pre-existing SDKInitTests failures are unrelated — same 2 on integration branch)
  • Manual testing N/A — wiring is consumed only when 401s drive JwtTokenStore.invalidateJwt, which requires a live IV-enabled backend.

Final pass

  • Code is as readable as possible.
  • I have reviewed this PR myself, ensuring it meets each checklist item

@nan-li nan-li force-pushed the feat/iv-public-api-05 branch from eb9c4e2 to 0fde95d Compare May 1, 2026 07:52
@nan-li
Copy link
Copy Markdown
Contributor Author

nan-li commented May 1, 2026

@claude review

Comment on lines +148 to +157
/**
* Subscribe a listener for JWT-invalidated events. Fires on a background thread when
* the SDK detects that the stored JWT for a user is no longer valid (typically after
* a 401 from the OneSignal backend). Apps should respond by fetching a fresh JWT from
* their backend and supplying it via [updateUserJwt].
*
* Listener replay: if an invalidation has already occurred before this listener is
* registered, the most recent invalidation is delivered to the new listener so apps
* that subscribe late don't miss the signal.
*/
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The KDoc on addUserJwtInvalidatedListener (IOneSignal.kt:148-157) makes two claims that contradict the implementation in UserManager. (1) "Fires on a background thread" — but the replay path invokes listener.onUserJwtInvalidated(...) synchronously on the caller's thread (UserManager.kt:101-105, no launch). A developer who registers from Activity.onCreate and follows the documented "fetch a fresh JWT from your backend" guidance can hit NetworkOnMainThreadException/ANR on the replay path when a cold-start invalidation has been buffered. (2) "if an invalidation has already occurred before this listener is registered, the most recent invalidation is delivered to the new listener" — but fireJwtInvalidated only buffers when hasSubscribers==false and addJwtInvalidatedListener clears pendingJwtInvalidatedExternalId after the first replay; the PR's own tests ("buffered event is consumed by the first subscriber; second subscriber gets nothing" and "fire when subscribers exist does NOT buffer for late subscribers") encode the actual consume-on-first-subscribe semantics. The internal IUserJwtInvalidatedListener KDoc correctly documents the threading asymmetry, but most developers will read only the entry-point KDoc. Fix: tighten the public KDoc to spell out (a) replay runs synchronously on the calling thread and (b) replay only delivers to the first listener registered after a no-listener fire — alternatively wrap the replay in jwtInvalidatedDispatchScope.launch to make the original "background thread" promise truthful and also fix the unrelated unsubscribe-survives-replay asymmetry from PR comment #2.

Extended reasoning...

What the bug is

The public KDoc on IOneSignal.addUserJwtInvalidatedListener (IOneSignal.kt:148-157) documents two guarantees that the implementation does not provide:

// IOneSignal.kt
/**
 * Subscribe a listener for JWT-invalidated events. Fires on a background thread when
 * the SDK detects that the stored JWT for a user is no longer valid ...
 *
 * Listener replay: if an invalidation has already occurred before this listener is
 * registered, the most recent invalidation is delivered to the new listener so apps
 * that subscribe late don't miss the signal.
 */

Neither claim survives contact with UserManager.

Claim 1 — "Fires on a background thread"

addJwtInvalidatedListener (UserManager.kt:96-106) delivers the buffered replay synchronously on the caller's thread:

pendingExternalId?.let {
    runCatching { listener.onUserJwtInvalidated(UserJwtInvalidatedEvent(it)) }
        .onFailure { ex -> Logging.warn(...) }
}

There is no jwtInvalidatedDispatchScope.launch here — only the regular fire path (UserManager.kt:117-122) is dispatched to Dispatchers.Default. The internal KDoc on IUserJwtInvalidatedListener correctly documents this asymmetry ("Replay delivery happens synchronously on the thread that calls IOneSignal.addUserJwtInvalidatedListener"), but a developer who reads only the entry-point KDoc on IOneSignal will not see it.

Claim 2 — "the most recent invalidation is delivered to the new listener"

fireJwtInvalidated only sets pendingJwtInvalidatedExternalId when jwtInvalidatedNotifier.hasSubscribers is false (UserManager.kt:113-126), and addJwtInvalidatedListener clears the field after the first replay (UserManager.kt:99). The internal field doc at UserManager.kt:58-62 explicitly says "Consumed-on-first-subscribe" — internally consistent, but the public KDoc says the opposite.

Step-by-step proof

Threading proof (NetworkOnMainThreadException vector):

  1. Developer reads the IOneSignal KDoc, sees "Fires on a background thread", and writes a listener that calls their backend to refresh the JWT.
  2. App enables identity verification. Cold start: SDK request returns 401 before the developer's listener is wired up. UserManager.fireJwtInvalidated("alice") runs with hasSubscribers=false, sets pendingJwtInvalidatedExternalId = "alice".
  3. Activity.onCreate runs on the main thread and calls OneSignal.addUserJwtInvalidatedListener(listener).
  4. addJwtInvalidatedListener reads pendingExternalId = "alice", then synchronously invokes listener.onUserJwtInvalidated(UserJwtInvalidatedEvent("alice"))on the main thread.
  5. Listener performs the documented "fetch a fresh JWT from your backend" call → NetworkOnMainThreadException or ANR.

Replay-scope proof (Scenario A — second subscriber gets nothing):

  1. Cold-start 401 with no listeners → fireJwtInvalidated("alice") buffers via the hasSubscribers=false branch.
  2. App's primary code subscribes ListenerA → ListenerA gets "alice", pendingJwtInvalidatedExternalId set to null.
  3. A debug/observability harness in the same app subscribes ListenerB → ListenerB gets nothing, despite "an invalidation has already occurred before this listener is registered" being literally true.

Replay-scope proof (Scenario B — fire-with-listener never buffered):

  1. App subscribes ListenerA early.
  2. SDK fires invalidation → hasSubscribers=true branch → ListenerA is dispatched to via Dispatchers.Default. pendingJwtInvalidatedExternalId is never set.
  3. ListenerA unsubscribes (Activity destroyed).
  4. ListenerB subscribes later → no replay.

Both scenarios are encoded as passing tests in this PR — UserManagerTests.kt's "buffered event is consumed by the first subscriber; second subscriber gets nothing" and "fire when subscribers exist does NOT buffer for late subscribers" — confirming the implementation deliberately diverges from the public contract.

Why this matters

The threading half is a real crash vector for FF=on apps that wire listeners in onCreate. The replay-scope half only matters in multi-listener setups (debug harnesses, Compose-style observers across screens) but is plain wrong as documented.

Fix

Either (a) tighten the public KDoc to match the implementation:

Threading: regular delivery happens on a background dispatcher; the replay
described below runs synchronously on the thread that calls this method.
Listeners that perform network I/O should re-dispatch off the main thread.

Listener replay: if an invalidation occurred while no listener was subscribed,
the buffered event is delivered to the FIRST listener registered after that fire
(consumed on first subscribe). Subsequent listeners, and any listener subscribed
after a fire-with-listener, do not receive a replay.

Or (b) wrap the replay in jwtInvalidatedDispatchScope.launch { ... }, which also resolves the unsubscribe-survives-replay asymmetry called out in inline comment #2.

@nan-li nan-li force-pushed the feat/iv-http-executors-04 branch 2 times, most recently from 50aba1d to 26abaf3 Compare May 4, 2026 04:54
@nan-li nan-li force-pushed the feat/iv-public-api-05 branch from 21e9999 to dcf7f9b Compare May 4, 2026 04:56
Comment on lines +124 to +137
fun fireJwtInvalidated(externalId: String) {
synchronized(jwtInvalidatedLock) {
if (jwtInvalidatedNotifier.hasSubscribers) {
jwtInvalidatedDispatchScope.launch {
jwtInvalidatedNotifier.fire { listener ->
runCatching { listener.onUserJwtInvalidated(UserJwtInvalidatedEvent(externalId)) }
.onFailure { ex -> Logging.warn("UserManager: jwt-invalidated listener threw: ${ex.message}") }
}
}
} else {
pendingJwtInvalidatedExternalId = externalId
}
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 fireJwtInvalidated takes the buffer-or-fire decision under jwtInvalidatedLock, but EventProducer.fire (events/EventProducer.kt:47-52) snapshots subscribers inside the launched coroutine — so if the only subscriber unsubscribes between launch and dispatch (e.g. Activity.onDestroy after a 401), the snapshot is empty, the callback is silently dropped, AND pendingJwtInvalidatedExternalId is never set (the else branch was already skipped on the lock-protected hasSubscribers==true read). A subsequently-registered listener (e.g. recreated Activity) gets no replay because the buffer was never populated, breaking the documented "apps that subscribe late don't miss the signal" contract on a real 401 invalidation. Distinct from existing PR comment #2 (which covers double-delivery via add/fire race and replay surviving unsubscribe in the replay path) — this is fire-path event loss with no recovery. Fix: snapshot subscribers synchronously inside jwtInvalidatedLock (e.g. capture jwtInvalidatedNotifier.subscribers.toList() before launching), or always-buffer-and-clear-after-delivery.

Extended reasoning...

What the bug is

UserManager.fireJwtInvalidated (UserManager.kt:124-137) takes the buffer-or-fire decision under jwtInvalidatedLock:

fun fireJwtInvalidated(externalId: String) {
    synchronized(jwtInvalidatedLock) {
        if (jwtInvalidatedNotifier.hasSubscribers) {
            jwtInvalidatedDispatchScope.launch {
                jwtInvalidatedNotifier.fire { listener -> ... }
            }
        } else {
            pendingJwtInvalidatedExternalId = externalId
        }
    }
}

The lock guards the decision — but EventProducer.fire (events/EventProducer.kt:47-52) takes its subscriber snapshot inside the function body, i.e. when the launched coroutine actually runs on Dispatchers.Default, not at launch time:

fun fire(callback: (THandler) -> Unit) {
    val localList = synchronized(subscribers) { subscribers.toList() }
    for (s in localList) { callback(s) }
}

Meanwhile removeJwtInvalidatedListener (UserManager.kt:115-117) does not acquire jwtInvalidatedLock; it only delegates to notifier.unsubscribe, which synchronizes on a different lock (subscribers inside EventProducer). So holding jwtInvalidatedLock across the hasSubscribers check provides no coordination with concurrent unsubscribe.

The race window

Between launch { ... } returning and the coroutine actually executing on Dispatchers.Default, the only subscriber can unsubscribe. When the coroutine eventually runs, fire snapshots an empty list. No callback is invoked. And critically, pendingJwtInvalidatedExternalId is not set — because the lock-protected else branch was skipped at fire time. A subsequently-registered listener observes nothing.

Why existing code does not prevent it

  • jwtInvalidatedLock is held only during the buffer-or-fire decision; it is not held during dispatch, and unsubscribe does not contend on it.
  • EventProducer.fire deliberately snapshots inside the lambda body (necessary for the regular at-time-of-execution semantics other callers rely on).
  • pendingJwtInvalidatedExternalId is only written in the else branch — never as a fallback when the launched fire happens to dispatch into an empty subscriber list.
  • The PR's own existing tests use Thread.sleep(50) after fireJwtInvalidated to allow async dispatch — confirming the launch-to-dispatch gap is non-trivial.

Step-by-step proof

  1. App registers ListenerA (an Activity-bound listener) → subscribers=[A]; pendingJwtInvalidatedExternalId=null.
  2. SDK 401 triggers JwtTokenStore.invalidateJwtUserManager.onJwtInvalidatedfireJwtInvalidated("alice").
  3. Under jwtInvalidatedLock: hasSubscribers==truejwtInvalidatedDispatchScope.launch { fire(...) } (coroutine queued on Dispatchers.Default, not yet running). Lock released.
  4. The Activity is destroyed (e.g. configuration change, user backgrounding) → onDestroy calls removeJwtInvalidatedListener(A). unsubscribe synchronizes on EventProducer.subscribers, not jwtInvalidatedLock. subscribers=[].
  5. The coroutine starts. EventProducer.fire does synchronized(subscribers) { subscribers.toList() }[]. No callback invoked. Event silently dropped. pendingJwtInvalidatedExternalId is still null.
  6. The new Activity is created and calls addJwtInvalidatedListener(B). Under jwtInvalidatedLock: subscribes B; reads pendingJwtInvalidatedExternalId == null → no replay. B never learns about the 401.

Impact

The public KDoc on IOneSignal.addUserJwtInvalidatedListener explicitly promises: "if an invalidation has already occurred before this listener is registered, the most recent invalidation is delivered to the new listener so apps that subscribe late don't miss the signal." This race breaks that contract on a real 401 invalidation — exactly the high-stakes case the IV API exists to handle. The Activity teardown / restart pattern in step 4 is normal Android lifecycle behavior, and Dispatchers.Default scheduling delay during a config change can absolutely span the window between launch and dispatch.

Distinct from the existing inline review at UserManager.kt:113, which covers (a) double-delivery via concurrent add+fire and (b) the replay coroutine in addJwtInvalidatedListener surviving unsubscribe. Neither addresses fire-path event loss when the only subscriber unsubscribes between the lock-protected hasSubscribers check and the async dispatch — and there is no replay path to recover here, because the buffer was never written.

How to fix

Snapshot the subscriber list synchronously inside jwtInvalidatedLock before launching, so unsubscribe-after-launch is harmless:

fun fireJwtInvalidated(externalId: String) {
    val snapshot: List<IUserJwtInvalidatedListener>
    synchronized(jwtInvalidatedLock) {
        snapshot = jwtInvalidatedNotifier.subscribersSnapshot() // exposes subscribers.toList()
        if (snapshot.isEmpty()) {
            pendingJwtInvalidatedExternalId = externalId
            return
        }
    }
    jwtInvalidatedDispatchScope.launch {
        snapshot.forEach { listener ->
            runCatching { listener.onUserJwtInvalidated(UserJwtInvalidatedEvent(externalId)) }
                .onFailure { ex -> Logging.warn(...) }
        }
    }
}

Alternative: always set pendingJwtInvalidatedExternalId and clear it after at least one listener has been delivered (always-buffer-and-clear-after-delivery), mirroring the consume-on-first-subscribe semantics already in place for the no-listener path.

@nan-li nan-li force-pushed the feat/iv-http-executors-04 branch from 26abaf3 to e76fb60 Compare May 4, 2026 16:15
@nan-li nan-li force-pushed the feat/iv-public-api-05 branch from dcf7f9b to 0fbd10c Compare May 4, 2026 17:12
@nan-li nan-li force-pushed the feat/iv-http-executors-04 branch from e76fb60 to d2033ea Compare May 4, 2026 17:36
Comment on lines 342 to +351
override fun onModelReplaced(
model: IdentityModel,
tag: String,
) { }
) {
// IdentityModel replacement = login or logout switch. Clear any buffered invalidation
// so the next user's listener doesn't replay the previous user's stale event.
synchronized(jwtInvalidatedLock) {
pendingJwtInvalidatedExternalId = null
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The onModelReplaced handler in UserManager.kt:342-351 unconditionally clears pendingJwtInvalidatedExternalId on every IdentityModelStore replace and the inline comment claims this only happens on "login or logout switch" — but RefreshUserOperationExecutor.kt:149 also calls _identityModelStore.replace(identityModel, ModelChangeTags.HYDRATE) after every successful refresh (UserRefreshService fires one on every session start), and SingletonModelStore.replace fires onModelReplaced regardless of tag. The handler ignores the tag, so a same-user HYDRATE wipes the buffered invalidation. Fix: gate the clear on the tag (only clear when tag != ModelChangeTags.HYDRATE) or compare model.externalId to the buffered externalId before clearing.

Extended reasoning...

What is wrong

UserManager.onModelReplaced (UserManager.kt:342-351) unconditionally clears the JWT-invalidated replay buffer:

override fun onModelReplaced(model: IdentityModel, tag: String) {
    // IdentityModel replacement = login or logout switch. Clear any buffered invalidation
    // so the next user listener doesnt replay the previous user stale event.
    synchronized(jwtInvalidatedLock) {
        pendingJwtInvalidatedExternalId = null
    }
}

The inline comment is factually incorrect. A grep for _identityModelStore.replace returns two callers:

  1. UserSwitcher.kt:81identityModelStore.replace(identityModel) (default tag = NORMAL) — login/logout switch
  2. RefreshUserOperationExecutor.kt:149_identityModelStore.replace(identityModel, ModelChangeTags.HYDRATE) — same-user backend hydration

SingletonModelStore.replace (common/modeling/SingletonModelStore.kt:36-46) fires onModelReplaced(existingModel, tag) for any replace regardless of tag. The handler ignores the tag parameter, so a HYDRATE replacement of the same user clears the buffer just like a real user-switch.

Why this matters

RefreshUserOperation runs routinely outside login/logout: UserRefreshService.onSessionStarted (UserRefreshService.kt:31-38) enqueues one on every session start, and LoginUserOperationExecutor.kt:246 schedules a follow-on refresh after every successful create-user. Each successful refresh emits a HYDRATE replace.

The IOneSignal.addUserJwtInvalidatedListener KDoc explicitly promises: "if an invalidation has already occurred before this listener is registered, the most recent invalidation is delivered to the new listener so apps that subscribe late dont miss the signal." HYDRATE-driven clearing breaks that promise on contrived but reachable orderings.

Step-by-step proof (narrow but reachable scenario)

  1. Cold start. SDK begins processing persisted ops. No IUserJwtInvalidatedListener has been wired up yet.
  2. A persisted UpdateUser op for alice returns 401. OperationRepoIvExtensions.handleFailUnauthorized calls JwtTokenStore.invalidateJwt("alice")UserManager.onJwtInvalidated("alice")fireJwtInvalidated("alice"). With no subscribers, the else branch sets pendingJwtInvalidatedExternalId = "alice".
  3. The app — having a scheduled JWT-refresh routine — pre-emptively calls OneSignal.updateUserJwt("alice", freshJwt) before subscribing its listener. OneSignalImp.updateUserJwt puts the JWT and calls operationRepo.forceExecuteOperations().
  4. The previously-parked or new RefreshUserOperation now passes hasValidJwtIfRequired, dispatches to backend, and succeeds. RefreshUserOperationExecutor.kt:149 calls _identityModelStore.replace(identityModel, ModelChangeTags.HYDRATE).
  5. UserManager.onModelReplaced(model, "HYDRATE") runs and clears pendingJwtInvalidatedExternalId (despite the same-user hydration having no design relationship to user-switching).
  6. The app registers addUserJwtInvalidatedListener(L). addJwtInvalidatedListener reads pendingJwtInvalidatedExternalId == null → no replay. L never learns about the 401, contradicting the documented late-subscriber contract.

Addressing the refutation

The refutation correctly observes that OperationRepo.processQueueForever is sequential, so an "in-flight RefreshUserOperation racing with a 401 on a sibling op" in the strict concurrent sense cannot happen — once a 401 invalidates the JWT, subsequent JWT-requiring ops are filtered by hasValidJwtIfRequired until the JWT is restored. The pre-emptive updateUserJwt scenario above sidesteps that because the JWT is re-established before the next refresh dispatches.

The refutation also argues that, in any flow where HYDRATE follows a 401, the developer has already resolved the issue (via updateUserJwt or login(sameId, fresh)) so clearing the buffer is consistent with the separate ask in PR comment #3 to clear on fresh-JWT establishment. That is a defensible product choice, but it is not what this code does or what the comment claims:

  • The clear runs on the IdentityModel replace event, not on a JWT-update event. PR comment onPause NPE #3 explicitly notes that JwtTokenStore.putJwt does not clear the buffer (UserManager does not override onJwtUpdated). The clear-on-HYDRATE happens to coincide with JWT establishment in the happy path but is the wrong mechanism.
  • The behavior also clears for HYDRATE paths that have nothing to do with a recent JWT update — e.g., a session-start refresh where the JWT was already valid and never invalidated. Clearing a same-user invalidation buffer because the same-user identity got refreshed has no design justification, as verifier onPause NPE #3 notes.
  • The in-code comment misrepresents the mechanism and will mislead future readers.

Fix

Gate the clear on the tag, so HYDRATE replacements (same user, just hydrating fields from backend) do not wipe the buffer:

override fun onModelReplaced(model: IdentityModel, tag: String) {
    if (tag == ModelChangeTags.HYDRATE) return
    synchronized(jwtInvalidatedLock) {
        pendingJwtInvalidatedExternalId = null
    }
}

Alternatively, compare model.externalId to the buffered externalId and clear only on a real user switch. Either makes the comment truthful and removes the contract-violation race.

@nan-li nan-li force-pushed the feat/iv-public-api-05 branch 2 times, most recently from 7916683 to a038970 Compare May 4, 2026 19:24
Copy link
Copy Markdown
Contributor

@abdulraqeeb33 abdulraqeeb33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one nit but otherwise looking great

* that synchronously trigger invalidation don't run app code on the SDK's internal thread.
* Replay (synchronous, on the calling thread) bypasses this scope.
*/
private val jwtInvalidatedDispatchScope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use existing Default scope from OneSignalDispatcher

nan-li and others added 5 commits May 5, 2026 08:55
…gin JWT (5/6)

Adds the developer-facing IV API:
- IUserJwtInvalidatedListener (fun interface) + UserJwtInvalidatedEvent
- IOneSignal.updateUserJwt(externalId, token)
- IOneSignal.addUserJwtInvalidatedListener / removeUserJwtInvalidatedListener
- LoginHelper stores the JWT in JwtTokenStore on switchUser (new user + same-user-new-token)

UserManager subscribes to JwtTokenStore as IJwtUpdateListener; onJwtInvalidated
forwards to a multi-subscriber EventProducer<IUserJwtInvalidatedListener>
fired async on Dispatchers.Default with per-listener runCatching.

Listener replay: caches the last invalidation event so listeners that subscribe
after the invalidation still receive the signal. Cleared on logout.
Rewrite the listener replay to match the reference design from #2613
commit 89cca43:
- Lock-protected pendingJwtInvalidatedExternalId (consume-on-first-subscribe)
  replaces the @volatile lastJwtInvalidatedExternalId (always-replay)
- fireJwtInvalidated buffers ONLY when no subscribers exist; otherwise
  schedules an async fire. Closes the double-fire race.
- onModelReplaced clears the buffer on IdentityModel replacement (login or
  logout switch via UserSwitcher.createAndSwitchToNewUser → replace).
- Drop the explicit clearLastJwtInvalidated() method and its calls in
  OneSignalImp.logout() / logoutSuspend() — clear is now automatic.

Match #2599 conventions:
- Drop the isInitialized throw in addUserJwtInvalidatedListener /
  removeUserJwtInvalidatedListener.
- Add updateUserJwtSuspend.

Update IUserJwtInvalidatedListener docstring to clarify replay is
synchronous on the caller's thread; regular fire is async.

Tests rewritten for the new semantics: first-subscriber-replay,
consume-on-first, fire-with-subscribers-doesn't-buffer, onModelReplaced
clears.
Match the init-readiness pattern used by updateUserJwt: when background
threading is enabled, waitForInit; otherwise throw if not initialized.
Without this, listeners added before initWithContext silently bind to a
not-yet-wired UserManager.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same-user-id login with a fresh JWT (the documented post-401 refresh
path) was storing the new token but never calling forceExecuteOperations,
so ops parked by hasValidJwtIfRequired stayed deferred until something
else woke the queue. updateUserJwt already does this correctly; reference
#2599 LoginHelper does too. Match that design — drop redundant
if (jwtBearerToken != null) guards (putJwt no-ops on null) and call
forceExecuteOperations unconditionally on the same-id branch so the
queue drains as soon as the developer supplies a fresh token.

Extend the existing same-id+JWT test to verify forceExecuteOperations
fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three review-driven fixes on the IV public-API surface:

1. OneSignal.kt: add @JvmStatic wrappers for updateUserJwt,
   addUserJwtInvalidatedListener, removeUserJwtInvalidatedListener, and
   updateUserJwtSuspend. Without these, the four new IOneSignal methods
   were unreachable from app code (the OneSignal object is the documented
   Java/Kotlin entry point and "implements IOneSignal in spirit"). Mirrors
   the convention from reference branch #2599.

2. OneSignalImp.kt: mask jwtBearerToken in login()/loginSuspend() DEBUG
   logs (...${jwtBearerToken?.takeLast(8)}). Pre-PR the parameter was a
   no-op (LoginHelper had a TODO) so the leak was theoretical; this PR
   wires the token through to JwtTokenStore.putJwt, so a live bearer
   credential now flows through Logging.log at DEBUG. updateUserJwt
   already masks; mirror that.

3. UserManager.kt: restore Logging.warn(msg, ex) form on the two new
   runCatching.onFailure handlers in fireJwtInvalidated and
   addJwtInvalidatedListener. Interpolating ${ex.message} drops the
   stack trace; Logging.warn already accepts a Throwable second arg
   that propagates through to log listeners and Otel — same pattern
   restored in JwtTokenStore via e76fb60.

Also refreshes detekt baseline for the new MagicNumber (takeLast(8)),
ConstructorParameterNaming (_jwtTokenStore on UserManager), TooManyFunctions
(UserManager now also implements IJwtUpdateListener), and UseCheckOrError
on the 4 new IllegalStateException throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nan-li nan-li force-pushed the feat/iv-public-api-05 branch from a038970 to c233c25 Compare May 5, 2026 15:57
Comment on lines +67 to +72
/**
* Async dispatch of [IUserJwtInvalidatedListener] callbacks so HYDRATE / op-repo paths
* that synchronously trigger invalidation don't run app code on the SDK's internal thread.
* Replay (synchronous, on the calling thread) bypasses this scope.
*/
private val jwtInvalidatedDispatchScope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 🟡 UserManager.kt:72 introduces private val jwtInvalidatedDispatchScope = CoroutineScope(SupervisorJob() + Dispatchers.Default) instead of using the SDK's existing OneSignalDispatchers.launchOnDefault infrastructure. The maintainer left an inline comment on this line on 2026-05-04 asking for exactly this change: "use existing Default scope from OneSignalDispatcher". Replace the field and jwtInvalidatedDispatchScope.launch { ... } with OneSignalDispatchers.launchOnDefault { ... } to mirror the existing pattern (e.g. StartupService.kt:29) — mechanical fix.

Extended reasoning...

What is wrong

UserManager.kt:67-72 introduces a fresh per-instance scope:

private val jwtInvalidatedDispatchScope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

…and then dispatches the developer-facing JWT-invalidated event via jwtInvalidatedDispatchScope.launch { ... } in fireJwtInvalidated (UserManager.kt:127).

The SDK already provides centralized dispatcher infrastructure for exactly this purpose at com.onesignal.common.threading.OneSignalDispatchers (OneSignalDispatchers.kt:111-134). It exposes a tuned Default CoroutineDispatcher backed by a custom ThreadPoolExecutor (DEFAULT_CORE_POOL_SIZE=2, DEFAULT_MAX_POOL_SIZE=3, KEEP_ALIVE_TIME_SECONDS=30, LinkedBlockingQueue(QUEUE_CAPACITY=200), daemon threads named OneSignal-Default-N), plus a convenience launchOnDefault helper that wraps an internal DefaultScope. getStatus() / getPerformanceMetrics() introspect this pool. The new code bypasses all of it.

Maintainer feedback

The maintainer (abdulraqeeb33) left an inline review on this exact line on 2026-05-04T19:45:13Z (inline comment id 3184062053):

use existing Default scope from OneSignalDispatcher

This is direct, unambiguous code-review feedback that has not been applied — the line is unchanged on the current tip.

Why it matters

  1. Bounded vs unbounded resource usage. The new code dispatches onto kotlinx's shared, unbounded Dispatchers.Default thread pool rather than the SDK's bounded one — defeating the controlled resource usage the SDK enforces elsewhere.
  2. Thread naming / observability. Threads dispatched via OneSignalDispatchers are named OneSignal-Default-N and are visible to getStatus/getPerformanceMetrics introspection. The new scope's tasks run on anonymous DefaultDispatcher-worker-N threads.
  3. Convention. StartupService.kt:29 is the canonical existing pattern: OneSignalDispatchers.launchOnDefault { ... }. ThreadUtils.kt (lines 125, 174, 232) also uses it. Diverging from the convention here is what the maintainer flagged.

Step-by-step proof

  1. Open OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/UserManager.kt at line 72.
  2. Observe the new field: private val jwtInvalidatedDispatchScope = CoroutineScope(SupervisorJob() + Dispatchers.Default) — this references kotlinx.coroutines.Dispatchers.Default, the unbounded shared system pool.
  3. fireJwtInvalidated (line 127) calls jwtInvalidatedDispatchScope.launch { ... } — coroutines dispatch through that scope, never touching the SDK's tuned OneSignalDispatchers.Default executor.
  4. Compare to StartupService.kt:29: OneSignalDispatchers.launchOnDefault { ... } — the established convention.
  5. Maintainer comment id 3184062053 on UserManager.kt:72 explicitly requests using the SDK's existing dispatcher.

Addressing the duplicate-refutation

A late refutation argued this is a duplicate of two lower-ID bugs covering the same line. That's correct in fact (same finding) but the lower-ID twins were dropped during synthesis — this is the surviving canonical entry, not an additional report. The maintainer-requested fix is unambiguously present and unaddressed in the current code, so the issue itself stands regardless of how it was originally bookkept.

Fix

Mechanical, two-step change in UserManager.kt:

// Remove the field at lines 67-72 (and the kotlinx imports it requires).

// In fireJwtInvalidated, replace:
jwtInvalidatedDispatchScope.launch { ... }
// with:
OneSignalDispatchers.launchOnDefault { ... }

Mirrors StartupService.kt:29 exactly. No behavioral change for callers; the listener still runs off the SDK's internal thread, just on the SDK's properly-sized pool with named threads.

🔬 also observed by verify-runtime

Comment on lines 39 to 46
if (currentExternalId == externalId) {
// Same-user refresh path (e.g. login(sameId, freshJwt) after a 401). Store the
// fresh token and wake the queue so any ops deferred by `hasValidJwtIfRequired`
// dispatch immediately — symmetric with `updateUserJwt`. putJwt no-ops on null.
jwtTokenStore.putJwt(externalId, jwtBearerToken)
operationRepo.forceExecuteOperations()
return null
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 In the same-user branch of LoginHelper.switchUser (LoginHelper.kt:39-46), operationRepo.forceExecuteOperations() runs on every login(sameId, ...) call, even when jwtBearerToken is null. JwtTokenStore.putJwt correctly no-ops on null, but the wake still fires and short-circuits any in-flight retry-backoff window — a behavioral regression vs. pre-PR (which was a clean no-op on this path). Fix: gate both calls on jwtBearerToken != null, mirroring updateUserJwt which only ever runs with a non-null token by signature.

Extended reasoning...

What is wrong

LoginHelper.switchUser (LoginHelper.kt:39-46) handles the same-externalId branch like this:

if (currentExternalId == externalId) {
    // Same-user refresh path (e.g. login(sameId, freshJwt) after a 401)...
    jwtTokenStore.putJwt(externalId, jwtBearerToken)
    operationRepo.forceExecuteOperations()
    return null
}

The inline comment frames this as "Same-user refresh path (e.g. login(sameId, freshJwt) after a 401) ... symmetric with updateUserJwt." That symmetry premise is wrong: updateUserJwt(externalId, token: String) has a non-null token parameter by signature, so it can only ever be invoked with a real token. The same-user login branch has no such guarantee — login(externalId, jwtBearerToken: String? = null) is freely callable with jwtBearerToken == null.

Why the wake matters

OperationRepo.forceExecuteOperations() (OperationRepo.kt:259-262) wakes both retryWaiter and waiter. OperationRepo.delayBeforeNextExecution (line 453-466) suspends a deferred op via withTimeoutOrNull(delayFor) { retryWaiter.waitForWake() } — so a wake with force=true short-circuits the retry-backoff window. Pre-PR this branch was a clean return null; post-PR every redundant login(sameId) (no JWT) bypasses retry backoff for any in-flight op.

Step-by-step proof

  1. App is logged in as alice. Queue has a deferred op currently in retry backoff (e.g. an UpdateUserOperation waiting on its post-failure delay via delayBeforeNextExecution).
  2. App calls OneSignal.login("alice") (no JWT) — perhaps from a redundant "ensure logged in" guard at startup or a reactive UI path.
  3. LoginHelper.switchUser("alice", null) enters the same-externalId branch.
  4. jwtTokenStore.putJwt("alice", null) no-ops per JwtTokenStore.putJwt (the early-return on null was added in this PR family).
  5. operationRepo.forceExecuteOperations() wakes both retryWaiter and waiter.
  6. The deferred ops retryWaiter.waitForWake() returns early; the op fires before its scheduled retry-backoff would have allowed.

Pre-PR this code path was return null — completely side-effect free. The behavioral regression is on a flow unrelated to the IV refresh feature this PR was meant to support.

Distinct from the earlier inline review

Inline comment #3173991030 on this same line range was about the inverse case: when a JWT was supplied, the same-user branch failed to call forceExecuteOperations(). The PR fixed that by adding an unconditional wake — overcorrecting into this new bug. The clean fix that addresses both is one guard:

if (currentExternalId == externalId) {
    if (jwtBearerToken != null) {
        jwtTokenStore.putJwt(externalId, jwtBearerToken)
        operationRepo.forceExecuteOperations()
    }
    return null
}

Impact

Low — wake-the-queue is benign in steady-state apps and only mildly compresses retry timings. But the behavioral regression vs. pre-PR is real on a code path (no-JWT same-user login) that has nothing to do with the IV feature this PR is shipping. Filing as a nit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants