fix: offload every main-thread onFocus / onUnfocused handler behind sdk_background_threading FF #2644
Merged
📊 Diff Coverage Report (Changed Lines Only)
Gate: aggregate coverage on changed executable lines must be ≥ 80% (JaCoCo line data for lines touched in the diff).
Changed Files Coverage
Overall (aggregate gate): 22/30 touched executable lines covered (73.3%; requires ≥ 80%)
Per-file detail (informational; gate is aggregate above):
❌ Coverage Check Failed
Aggregate coverage on touched lines is 73.3% (minimum 80%).
fadi-george approved these changes May 11, 2026
3c92229 to 8d3904d
abdulraqeeb33 pushed a commit that referenced this pull request May 12, 2026
…o avoid cold-start ANRs
ANR-dump analysis (logs/2026-05-12, 500 entries, all on
sdk_background_threading) attributes 47 / 500 ANRs (9.4%) to the SDK's
own background-threading helpers stalling the main thread on cold start.
The three call sites in the bucket are now all routed through
runOnSerialIOIfBackgroundThreading by the SDK-4505 / SDK-4506 work, but
the deeper root cause is that the very first IO consumer (whichever
caller wins the race) pays the cost of constructing the entire
OneSignalDispatchers lazy chain on its thread:
ThreadPoolExecutor.execute -> LinkedBlockingQueue.offer
CoroutineDispatcher.dispatch -> kotlinx.coroutines first-launch
OneSignalDispatchers.IOScope.<init> (by lazy)
OneSignalDispatchers.IO (by lazy)
OneSignalDispatchers.ioExecutor (by lazy)
That includes the kotlinx.coroutines MainDispatcherFactory ServiceLoader
scan, executor + thread-factory construction, dispatcher wrapping, and
SupervisorJob / CoroutineScope wiring. Under sdk_background_threading
the first caller is typically an Activity-lifecycle handler or
JobService.onStartJob - both on the main thread - and OTel records
5-20s blocks before the watchdog fires.
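To make the "first caller pays" point concrete, here is a minimal sketch using only the JDK and Kotlin's `by lazy`. `LazyChainDemo` and `firstToucherPaysInit` are hypothetical names for illustration, not SDK code; the sketch only shows that a lazy initializer runs on whichever thread touches the property first.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

// Hypothetical stand-in for the SDK's dispatcher holder; only the laziness matters here.
object LazyChainDemo {
    // Name of the thread that ended up running the initializer.
    @Volatile
    var initializedOn: String? = null

    // `by lazy` defers construction until first access and runs the
    // initializer on whichever thread performs that access.
    val ioExecutor: ExecutorService by lazy {
        initializedOn = Thread.currentThread().name
        Executors.newFixedThreadPool(2) { r -> Thread(r, "demo-io").apply { isDaemon = true } }
    }
}

// Touches the lazy property from a named thread and reports which thread paid for the init.
fun firstToucherPaysInit(threadName: String): String? {
    val toucher = Thread({ LazyChainDemo.ioExecutor }, threadName)
    toucher.start()
    toucher.join()
    return LazyChainDemo.initializedOn
}
```

If the first toucher is the main thread, the whole construction cost lands there; every later caller sees the already-built value.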
OneSignalDispatchers.prewarm() spawns a dedicated short-lived
"OneSignal-prewarm" daemon thread that submits one empty launch on each
of IO / Default / SerialIO. That single thread pays the lazy-init cost
end-to-end so the next production caller - even on the main thread -
only sees the cheap "submit to already-constructed executor" path.
* Idempotent: double-checked-locked prewarmStarted flag, so repeat
calls from init / suspend init / JobService.onStartJob no-op cheaply.
* Fire-and-forget: failures log and swallow; the existing
Dispatchers.IO / SerialIO fallback paths in [IO] / [SerialIO] still
apply if anything goes wrong, so a failed prewarm just means the
first real caller pays the original cost.
* Daemon thread at NORM_PRIORITY - 2 so prewarm never blocks process
exit or starves UI work.
Called from:
* OneSignalImp.initWithContext(context, appId) (sync variant)
* OneSignalImp.initWithContextSuspend(context, appId) (suspend variant,
used by re-entrant suspend callers)
* SyncJobService.onStartJob BEFORE suspendifyOnIO, because the
JobService can fire before the host app's initWithContext runs.
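The prewarm shape described above could look roughly like the following sketch. It is an assumption-laden illustration, not the SDK implementation: plain JDK executors stand in for the coroutine dispatchers, and every name (`PrewarmSketch`, `threadsSpawned`, the thread names) is hypothetical.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.ThreadFactory
import java.util.concurrent.atomic.AtomicInteger

object PrewarmSketch {
    // Exposed only so the idempotency claim can be checked.
    val threadsSpawned = AtomicInteger(0)
    private val lock = Any()

    @Volatile
    private var prewarmStarted = false

    private fun daemonFactory(name: String) =
        ThreadFactory { r -> Thread(r, name).apply { isDaemon = true } }

    // Lazily built pools standing in for the IO / SerialIO lazy chains.
    private val io: ExecutorService by lazy {
        Executors.newFixedThreadPool(2, daemonFactory("sketch-io"))
    }
    private val serialIo: ExecutorService by lazy {
        Executors.newSingleThreadExecutor(daemonFactory("sketch-serial-io"))
    }

    /** Returns the started thread on the first call, null on every later call. */
    fun prewarm(): Thread? {
        if (prewarmStarted) return null        // fast path, no lock taken
        synchronized(lock) {
            if (prewarmStarted) return null    // re-check under the lock
            prewarmStarted = true
        }
        val worker = Thread({
            try {
                // One empty task per pool forces each lazy initializer to run on this thread.
                val done = CountDownLatch(2)
                io.execute { done.countDown() }
                serialIo.execute { done.countDown() }
                done.await()
            } catch (e: Exception) {
                // Fire-and-forget: report and swallow, the first real caller then pays instead.
                System.err.println("prewarm failed: $e")
            }
        }, "sketch-prewarm")
        worker.isDaemon = true
        worker.priority = Thread.NORM_PRIORITY - 2
        threadsSpawned.incrementAndGet()
        worker.start()
        return worker
    }
}
```

Returning the thread is a convenience for the sketch only; a fire-and-forget API would more likely return nothing.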
Tests (:core OneSignalDispatchersTests):
* prewarm returns immediately on the caller and the daemon thread
brings IO / Default / SerialIO + their scopes to Active.
* prewarm is idempotent - second call does not spawn another
OneSignal-prewarm thread (verified via thread-name scan).
Scope reduction: the remaining-onFocus-handlers part of the original
SDK-4507 change (NotificationPermissionController + FeatureFlagsRefreshService
runOnSerialIOIfBackgroundThreading wrappers) was moved up the stack to
SDK-4506 (#2644), where it sits next to NotificationsManager.onFocus
since all three share the same FF-gated rollout shape. This PR is now
focused purely on the prewarm fix.
:OneSignal:core detekt + full unit suite green.
Co-authored-by: Cursor <cursoragent@cursor.com>
abdulraqeeb33 pushed a commit that referenced this pull request May 12, 2026
…ing helper
Introduces the threading infrastructure that the follow-up PRs depend on. This
PR adds the helpers and tests; it does not change any production call sites.
What it adds
* OneSignalDispatchers.SerialIO
A single-thread, named ("OneSignal-SerialIO") CoroutineDispatcher backed
by Executors.newSingleThreadExecutor with a SupervisorJob + CoroutineScope.
Falls back to Dispatchers.IO.limitedParallelism(1) if executor construction
fails. Submission order on the dispatcher == execution order on its single
worker, which is exactly the semantics the focus / unfocus lifecycle
handlers need (see the next PR).
Companion: launchOnSerialIO { ... } and a SerialIO entry in
OneSignalDispatchers.getPerformanceMetrics() / getStatus().
* ThreadUtils.suspendifyOnSerialIO { ... }
Always-on serial dispatch. Wraps OneSignalDispatchers.launchOnSerialIO and
is intentionally NOT gated on ThreadingMode.useBackgroundThreading - some
code paths need ordered off-main execution unconditionally.
* ThreadUtils.runOnSerialIOIfBackgroundThreading { ... }
FF-gated wrapper for non-suspending blocks. When
ThreadingMode.useBackgroundThreading is true the block is dispatched to
SerialIO; when false the block runs inline on the calling thread. This is
the call shape every subsequent focus / unfocus handler in this series
uses, so the rollout matrix stays one-knob simple.
Block is non-suspending on purpose: the FF-off branch executes on the
caller's thread, and a suspending block there would force a runBlocking,
which defeats the purpose of an A/B comparison.
* IOMockHelper stubs the new helpers
suspendifyOnSerialIO + launchOnSerialIO are tracked by awaitIO() so
existing specs stay deterministic. runOnSerialIOIfBackgroundThreading is
stubbed inline-on-test-thread by default so existing call-site specs keep
their observable behavior; specs that want to exercise the FF-on (offload)
branch can override the stub.
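As a rough illustration of the FF-gated call shape listed above, the sketch below uses a plain single-thread executor instead of the coroutine-based SerialIO dispatcher. `SketchThreadingMode`, `sketchSerialIO`, `runOnSerialIOIfBackgroundThreadingSketch` and `threadUsedFor` are hypothetical stand-ins; the real helper's signature is not shown in this conversation.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical stand-in for ThreadingMode.useBackgroundThreading.
object SketchThreadingMode {
    @Volatile
    var useBackgroundThreading = false
}

// Hypothetical stand-in for the serial dispatcher: one named daemon worker.
val sketchSerialIO: ExecutorService = Executors.newSingleThreadExecutor { r ->
    Thread(r, "Sketch-SerialIO").apply { isDaemon = true }
}

// FF on: hand the block to the single worker. FF off: run it inline on the caller.
// The block is non-suspending, so the inline branch needs no runBlocking.
fun runOnSerialIOIfBackgroundThreadingSketch(block: () -> Unit) {
    if (SketchThreadingMode.useBackgroundThreading) {
        sketchSerialIO.execute { block() }
    } else {
        block()
    }
}

// Runs one block through the helper and returns the name of the thread that executed it.
fun threadUsedFor(flagOn: Boolean): String {
    SketchThreadingMode.useBackgroundThreading = flagOn
    val finished = CountDownLatch(1)
    var name = ""
    runOnSerialIOIfBackgroundThreadingSketch {
        name = Thread.currentThread().name
        finished.countDown()
    }
    finished.await(5, TimeUnit.SECONDS)
    return name
}
```

With the flag on, the block lands on the worker thread; with it off, it runs on the thread that called the helper, which is what keeps the control cohort on the legacy path.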
Tests
* OneSignalDispatchersTests: new SerialIO cases - construction, lazy chain
activates on first launch, getStatus reports Active + queue size, falls
back to the limitedParallelism(1) path if executor construction fails.
getStatus + getPerformanceMetrics are refactored to extract executorStatus
+ scopeStatus inline helpers to keep them under Detekt's LongMethod /
ComplexMethod thresholds.
* ThreadUtilsFeatureFlagTests: new cases that suspendifyOnSerialIO always
routes through the serial dispatcher (FF-agnostic), and that
runOnSerialIOIfBackgroundThreading routes through the serial dispatcher
when the FF is on and runs inline when the FF is off.
Why a dedicated serial dispatcher (not just suspendifyOnIO)
Multi-thread IO pools don't guarantee submission order = execution order. A
rapid focus burst (activity restart, share flow popping the activity back/
forth) could otherwise interleave cancel/schedule pairs or session-state
mutations. Pinning order-sensitive lifecycle work to a single executor keeps
it globally ordered, and future per-event work (focus counters, session
timing, analytics) inherits the guarantee for free.
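The ordering argument can be checked with a small JDK-only sketch: tasks submitted to a single-thread executor run one at a time in submission order, even when individual tasks take different amounts of time. `runBurstOnSingleWorker` and the event labels are hypothetical.

```kotlin
import java.util.Collections
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Submits a focus / unfocus burst to one worker and returns the order in which events ran.
fun runBurstOnSingleWorker(events: List<String>): List<String> {
    val executed = Collections.synchronizedList(mutableListOf<String>())
    val serial = Executors.newSingleThreadExecutor()
    for (event in events) {
        serial.execute {
            // Uneven work per event; on a multi-thread pool this could reorder completions.
            Thread.sleep(if (event.startsWith("unfocus")) 5L else 1L)
            executed.add(event)
        }
    }
    serial.shutdown()
    serial.awaitTermination(5, TimeUnit.SECONDS)
    return executed.toList()
}
```

Swapping in a fixed pool with several threads removes the guarantee, which is the case the dedicated serial dispatcher is meant to rule out.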
:OneSignal:core detekt + full unit suite green. No production behavior change
in this PR; the follow-up PRs land the call-site offloads (#2644) and the
dispatcher prewarm (#2645).
Co-authored-by: Cursor <cursoragent@cursor.com>
47522c0 to 7d4a763
520ae2c to 0621414
abdulraqeeb33 pushed a commit that referenced this pull request May 12, 2026
…old-start ANRs
ANR-dump analysis (logs/2026-05-12, 500 entries on sdk_background_threading)
shows 23 / 500 (4.6%) of ANRs ending in SyncJobService.onStartJob ->
suspendifyOnIO, all bottoming out in the same OneSignalDispatchers lazy chain:
ThreadPoolExecutor.execute -> LinkedBlockingQueue.offer
CoroutineDispatcher.dispatch -> kotlinx.coroutines first-launch
OneSignalDispatchers.IOScope.<init> (by lazy)
OneSignalDispatchers.IO (by lazy)
OneSignalDispatchers.ioExecutor (by lazy)
The first IO consumer in the process pays the executor + dispatcher + scope
construction + the kotlinx.coroutines MainDispatcherFactory ServiceLoader scan
on its thread. Under sdk_background_threading whichever main-thread caller wins
the race eats 5-20s before the watchdog fires. #2644 routes the five known
onFocus / onUnfocused handlers through runOnSerialIOIfBackgroundThreading so
they no longer fire on main, but the deeper structural problem is the lazy
chain itself - a future call site that slips past the FF gate (or a JobService
delivered to main before init has run) hits the same stall.
OneSignalDispatchers.prewarm() spawns a dedicated short-lived
"OneSignal-prewarm" daemon thread that submits one empty launch on each of
IO / Default / SerialIO. That single thread pays the lazy-init cost end-to-end
so the next production caller - even on the main thread - only sees the cheap
"submit work to an already-constructed executor" cost.
* Idempotent: double-checked-locked prewarmStarted flag, so repeat calls from
init / suspend init / SyncJobService.onStartJob no-op cheaply. An internal
resetPrewarmForTest() lets specs exercise the "first call wins" branch
independently.
* Fire-and-forget: failures log and swallow. The existing Dispatchers.IO /
SerialIO fallback paths in [IO] / [SerialIO] still apply if anything goes
wrong, so a failed prewarm just means the first real caller pays the original
cost.
* Daemon thread at NORM_PRIORITY - 2 so prewarm never blocks process exit or
starves UI work.
Called from:
* OneSignalImp.initWithContext(context, appId) (sync variant)
* OneSignalImp.initWithContextSuspend(context, appId) (suspend variant, used
by re-entrant suspend callers)
* SyncJobService.onStartJob BEFORE suspendifyOnIO (JobService can fire before
the host app init runs)
Tests (:core OneSignalDispatchersTests)
* prewarm returns immediately on the caller and the daemon thread brings
IO / Default / SerialIO + their scopes to Active.
* prewarm is idempotent - second call does not spawn another OneSignal-prewarm
thread (verified via thread-name scan).
Stacked on #2644. Together with #2643 and #2644 this covers the full 95 / 500
main-thread-ANR bucket from logs/2026-05-12 attributable to SDK threading
helpers (47 onFocus + 23 JobService + 25 SessionService).
:OneSignal:core detekt + full unit suite green.
Co-authored-by: Cursor <cursoragent@cursor.com>
0621414 to 6dc8889
7d4a763 to 6e22c2e
6dc8889 to be6f168
…dk_background_threading FF
Wraps every IApplicationLifecycleHandler that does slow / blocking work on the
main thread with runOnSerialIOIfBackgroundThreading (introduced in #2643). All
five handlers share one rollout knob, one ordering guarantee (the SerialIO
single-thread executor), and one observable contract in tests.
The handlers + why they were ANR-ing
* BackgroundManager.onFocus / onUnfocused
Synchronous JobScheduler.cancel / .schedule on the main thread. Binder
transactions to system_server that can block for many seconds on Xiaomi / MIUI
under power-save. OTel insertId ycae33cjpu6gcyut shows a 20,796 ms main-thread
block on a 25078RA3EL / Android 15 device.
* NotificationsManager.onFocus
refreshNotificationState() drives
NotificationRestoreWorkManager.beginEnqueueingWork, which lazily constructs
WorkManager (opens / migrates the SQLite store at
app_data/databases/androidx.work.workdb on first call) and then writes a
WorkSpec row. OTel insertId 9qy5s0ta0cwqwmb0 shows a 30,516 ms main-thread
block on a vivo I2306 / Android 15 device. Short-circuits on `restored = true`
after the first call, so only the first focus event per process eats the
SQLite stall.
* NotificationPermissionController polling lifecycle listener
onFocus reads ConfigModel.foregroundFetchNotificationPermissionInterval and
calls pollingWaiter.wake(), which dispatches a coroutine resume onto the IO
pool via channel.trySend -> ThreadPoolExecutor.execute. On cold start that hits
the OneSignalDispatchers lazy chain (executor + dispatcher + scope
construction) on the calling thread - 26 / 500 main-thread ANRs in
logs/2026-05-12 sit on this stack. onUnfocused does the symmetric job of
pushing the polling interval to 1 day to effectively pause polling.
* FeatureFlagsRefreshService.onFocus / onUnfocused
onFocus -> restartForegroundPolling -> OneSignalDispatchers.launchOnIO, same
lazy chain stall - 18 / 500 ANRs in the same bucket. onUnfocused cancels the
poll job; we route the cancellation through the same serial dispatcher so
back-to-back focus -> unfocus stays globally ordered with onFocus's polling-job
swap, and `synchronized(this)` is qualified as
`synchronized(this@FeatureFlagsRefreshService)` so the lambda locks on the
service instance (the same monitor restartForegroundPolling takes) rather than
the no-receiver lambda object.
* SessionService.onFocus / onUnfocused
sessionLifeCycleNotifier.fire { onSessionStarted / Active } invokes the
registered session-lifecycle handlers (operation repo, IAM trigger eval, etc.)
synchronously, and the first one to touch OneSignalDispatchers pays the
cold-init cost on the main thread - 25 / 500 ANRs in logs/2026-05-12 sit on
this stack. session.startTime / session.focusTime / activeDuration accounting
is preserved by capturing _time.currentTimeMillis on the caller's thread BEFORE
the wrapper and passing it into the deferred handleOnFocus / handleOnUnfocused,
so the timestamps reflect when Android delivered the event, not when the serial
dispatcher ran the block.
Rollout matrix (uniform across all five handlers)
* FF on -> runOnSerialIOIfBackgroundThreading { ... } dispatches to
OneSignalDispatchers.SerialIO (single-thread executor). Main thread returns
from handleFocus immediately.
* FF off -> the block runs inline on the lifecycle main thread. Legacy
behavior; retains the ANR for the control cohort so the A/B comparison stays
clean.
Activation is APP_STARTUP per FeatureFlag.kt, so a given session is latched on
one path and won't bounce mid-run. Worth flagging that the production ANR
samples for every handler in this PR were on FF=ON - because all five
previously bypassed every threading helper, the FF did not gate any of these
codepaths. This PR is what introduces the gate.
Why the serial dispatcher specifically
All five handlers are invoked from the same main-thread fanout
(ApplicationService.handleFocus -> applicationLifecycleNotifier.fire). A rapid
focus burst on a multi-thread IO pool could interleave them with each other and
with the BackgroundManager cancel/schedule pair. Pinning all five to the same
single-thread executor keeps lifecycle work globally ordered on the main-thread
submission order, and future per-event work added to any of these handlers
(focus counters, notification analytics, session timing) inherits the ordering
guarantee for free.
Tests (all new specs pass; existing specs unchanged)
* BackgroundManagerTests: existing tests + FF-on (dispatches through
launchOnSerialIO in order) + FF-off (runs inline, does not dispatch) for both
cancel and schedule. Includes a rapid unfocus -> focus burst test that pins
both events through the serial dispatcher in submission order.
* NotificationsManagerTests: dispatch contract on onFocus + rapid focus burst
preserves submission order. Lambda body is observable (the test stub invokes
the captured block) so JaCoCo sees the refreshNotificationState() call covered.
* NotificationPermissionControllerTests: dispatch contract for the polling
lifecycle listener on both onFocus and onUnfocused. Existing polling
integration tests still pass under the FF-off default.
* FeatureFlagsRefreshServiceTests: onFocus + onUnfocused route through
runOnSerialIOIfBackgroundThreading.
* SessionServiceTests: existing state-mutation assertions still pass under the
FF-off default (the wrapper runs inline). New assertions for the dispatch
contract on onFocus + onUnfocused + the rapid burst.
:OneSignal:core + :OneSignal:notifications detekt + full unit suites green.
Co-authored-by: Cursor <cursoragent@cursor.com>
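The SessionService timestamp handling described in this commit message (read the clock on the caller, then defer the mutation) can be sketched as follows. This is an illustration under assumptions: `SketchSession`, `onFocusSketch`, the injected `clock` and `dispatch` parameters, and `focusTimeAfterDelayedDispatch` are all hypothetical and do not mirror the SDK's actual signatures.

```kotlin
// Hypothetical session holder; the field name only loosely mirrors the commit text.
class SketchSession {
    @Volatile
    var focusTime: Long = 0L
}

// Captures the event time on the caller, then defers the state mutation.
// `clock` and `dispatch` are injected so the behaviour can be checked deterministically.
fun onFocusSketch(session: SketchSession, clock: () -> Long, dispatch: (() -> Unit) -> Unit) {
    val now = clock()                      // read on the caller's thread, BEFORE dispatching
    dispatch { session.focusTime = now }   // the deferred block uses the captured value
}

// Simulates a serial worker that only gets to the block after the clock has moved on.
fun focusTimeAfterDelayedDispatch(): Long {
    var fakeNow = 1_000L
    val pending = mutableListOf<() -> Unit>()
    val session = SketchSession()
    onFocusSketch(session, clock = { fakeNow }, dispatch = { block -> pending.add(block) })
    fakeNow = 9_000L                       // time passes before the queued block runs
    pending.forEach { it() }
    return session.focusTime
}
```

Had the clock been read inside the deferred block, the recorded time would drift by however long the block waited in the queue.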
be6f168 to abe5633
One Line Summary
Wrap every main-thread IApplicationLifecycleHandler.onFocus / onUnfocused that does slow / blocking work with runOnSerialIOIfBackgroundThreading (introduced in #2643). Five handlers, one rollout knob, one ordering guarantee.
Linear: SDK-4506
Project: Android: refactor loading during initialization
Base branch: ar/sdk-4505 (#2643, adds the helper + serial dispatcher)
Motivation
Activity.onStart → ApplicationService.onActivityStarted → handleFocus → applicationLifecycleNotifier.fire { onFocus(...) } synchronously invokes every IApplicationLifecycleHandler on the main thread.
Five lifecycle handlers do slow / blocking work on that main-thread fanout. ANR-dump analysis of logs/2026-05-12 (500 ANR entries, all on sdk_background_threading) plus two prior OTel samples attribute ~94 / 500 (~18.8 %) of main-thread ANRs to them.
* BackgroundManager.onFocus / onUnfocused: JobScheduler.cancel / .schedule (synchronous Binder to system_server); JobSchedulerImpl.cancel; ycae33cjpu6gcyut, 20,796 ms on Xiaomi 25078RA3EL / Android 15
* NotificationsManager.onFocus: NotificationRestoreWorkManager.beginEnqueueingWork → WorkManager SQLite init + enqueueUniqueWork; SQLiteConnection.nativeExecuteForLong; 9qy5s0ta0cwqwmb0, 30,516 ms on vivo I2306 / Android 15
* NotificationPermissionController polling listener (onFocus / onUnfocused): reads ConfigModel, calls pollingWaiter.wake() → IO pool dispatch; logs/2026-05-12
* FeatureFlagsRefreshService.onFocus / onUnfocused: restartForegroundPolling → OneSignalDispatchers.launchOnIO; logs/2026-05-12
* SessionService.onFocus / onUnfocused: sessionLifeCycleNotifier.fire { onSessionStarted / onSessionActive } runs every subscribed handler synchronously (operation repo, IAM trigger eval, etc.); logs/2026-05-12
The three dispatcher-cold-start handlers (NPC + FFRS + SessionService) all bottom out in the OneSignalDispatchers lazy chain; the deeper structural fix for that is #2645 (prewarm). This PR is the safer first cut: keep all five handlers on one FF-gated knob so the rollout matrix stays simple.
Fix
Single shared helper from #2643:
Applied uniformly at every call site:
Notable per-handler details:
* FeatureFlagsRefreshService.onUnfocused qualifies this as this@FeatureFlagsRefreshService so the lambda locks on the service instance (the same monitor restartForegroundPolling takes) rather than on the no-receiver lambda object.
* SessionService captures _time.currentTimeMillis on the caller's thread BEFORE the wrapper, so session.startTime / session.focusTime / activeDuration reflect when Android delivered the event, not whenever the serial dispatcher ran the block.
* NotificationPermissionController.onUnfocused is intentionally NOT a no-op: it pushes the polling interval to 1 day to effectively pause polling. With the wrapper, FF=ON moves that single-field assignment onto SerialIO and FF=OFF keeps it inline.
Gated rollout
runOnSerialIOIfBackgroundThreading { ... } → OneSignalDispatchers.SerialIO (single-thread executor)
Activation is APP_STARTUP per FeatureFlag.kt, so a given session is latched on one path and won't bounce mid-run. Worth flagging that the production ANR samples for every handler in this PR were on FF=ON: because all five previously bypassed every threading helper, the FF did not gate any of these codepaths. This PR is what introduces the gate.
Why the serial dispatcher (not suspendifyOnIO)
All five handlers are invoked from the same main-thread fanout (ApplicationService.handleFocus → applicationLifecycleNotifier.fire). A rapid focus burst on a multi-thread IO pool could interleave them with each other and with the BackgroundManager.cancel / schedule pair. Pinning all five to the same single-thread executor keeps lifecycle work globally ordered on the main-thread submission order, and future per-event work added to any of these handlers inherits the ordering guarantee for free.
Scope
* BackgroundManager: synchronous JobScheduler.cancel / schedule no longer on main when sdk_background_threading is on.
* NotificationsManager.onFocus: WorkManager SQLite init + enqueue no longer on main.
* NotificationPermissionController polling listener: onFocus + onUnfocused move to SerialIO.
* FeatureFlagsRefreshService: onFocus + onUnfocused move to SerialIO (lock qualified).
* SessionService: onFocus + onUnfocused move to SerialIO with timing capture preserved.
* refreshNotificationState, NotificationRestoreWorkManager.beginEnqueueingWork (still synchronized + idempotent on restored), NotificationHelper.areNotificationsEnabled, setPermissionStatusAndFire, the body of restartForegroundPolling, and handleOnFocus / handleOnUnfocused's session-state mutation semantics.
* NotificationsManager.onUnfocused is empty in production; not touched.
Affected code checklist
Testing
Static
* :OneSignal:core:detekt, :OneSignal:notifications:detekt - clean.
* :OneSignal:core:compileReleaseKotlin, :OneSignal:notifications:compileReleaseKotlin, :OneSignal:testhelpers:compileReleaseKotlin - clean.
Automated
* :OneSignal:core:testReleaseUnitTest - full suite green, including:
  * BackgroundManagerTests - FF=on dispatch via launchOnSerialIO in submission order on both cancel and schedule, FF=off inline, rapid unfocus -> focus burst routes through the serial dispatcher in submission order.
  * FeatureFlagsRefreshServiceTests - onFocus / onUnfocused route through runOnSerialIOIfBackgroundThreading.
  * SessionServiceTests - existing state-mutation assertions still pass under the FF-off default; new assertions for the dispatch contract on onFocus + onUnfocused + the rapid unfocus -> focus burst.
* :OneSignal:notifications:testReleaseUnitTest - full suite green, including:
  * NotificationsManagerTests - dispatch contract + rapid-burst ordering, lambda body observable so JaCoCo sees the refreshNotificationState call covered.
  * NotificationPermissionControllerTests - onFocus + onUnfocused polling lifecycle listener dispatch contract. Existing polling integration tests still pass under the FF-off default since the wrapper inlines.
Manual
Will follow up with manual repro on a vivo device under simulated SQLite contention conditions.
Checklist
Overview
Testing
Final pass