Handle failed identity resolution gracefully in PolarisEventMetadataFactory#4459
Handle failed identity resolution gracefully in PolarisEventMetadataFactory#4459HonahX wants to merge 2 commits into
Conversation
…actory When PolarisEventMetadataFactory.create() is called before authentication has completed (e.g., in pre-authentication filters), triggering deferred identity resolution can cause the auth pipeline to fail with a CompletionException. Previously this exception would propagate to the caller. For example, when the rate limiter filter emits an event before rejecting a request to an unauthenticated endpoint like /oauth/tokens, the leaked auth exception gets mapped to 401 instead of the intended 429. Fix: extract getSecurityIdentity() helper that catches CompletionException from deferred identity resolution, returning Optional.empty() instead of propagating the auth failure. A debug log is emitted for observability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
adnanhemani
left a comment
There was a problem hiding this comment.
Overall, LGTM - agreed with us needing this change. Thanks @HonahX!
|
|
||
| @Test | ||
| void createReturnsEmptyUserWhenIdentityNotYetResolved() { | ||
| // Simulates getNow(null) returning null because the future hasn't completed |
There was a problem hiding this comment.
Also, I think this is true, but would be nice if someone can confirm that this behavior hasn't changed as part of this code change.
snazy
left a comment
There was a problem hiding this comment.
I re-read the updated factory and the rate-limiter call path. Catching CompletionException looks a bit odd at first, but given this code intentionally peeks via CompletionStage#getNow(null), that is the wrapper we get when deferred identity resolution has already completed exceptionally. Treating that as an unavailable optional user is reasonable for event metadata, especially from pre-auth filters where metadata creation must not change the response status.
The focused test passes locally, and the PR CI is green.
snazy
left a comment
There was a problem hiding this comment.
One follow-up after thinking more about the CompletionException boundary: I think this catch should probably unwrap and only suppress the expected auth-unavailable/auth-failed case, not every CompletionException.
CompletionException is just the CompletionStage wrapper. It could also wrap a technical failure from identity resolution. For example, AuthenticatingAugmentor deliberately lets ServiceFailureException bubble so it is not converted into a 401. With the current code, that kind of failure would be logged at debug and treated as an absent optional user.
Could we narrow this to something like AuthenticationFailedException (and any other explicitly expected unauthenticated case), and rethrow the CompletionException otherwise? A test that a non-auth RuntimeException still propagates would make the boundary clearer.
| void createReturnsEmptyUserWhenIdentityNotYetResolved() { | ||
| // Simulates getNow(null) returning null because the future hasn't completed | ||
| when(currentIdentityAssociation.getDeferredIdentity()) | ||
| .thenReturn(Uni.createFrom().item(() -> null)); |
There was a problem hiding this comment.
This is not accurately testing the scenario of a future that isn't completed. You should return instead a Uni that never completes.
There was a problem hiding this comment.
Good catch!
Uni.createFrom().nothing()
should do the trick. Updated the test to use that!
- Only suppress AuthenticationFailedException in getSecurityIdentity(), rethrow CompletionException for non-auth failures (e.g. ServiceFailureException) - Fix "not-yet-resolved" test to use Uni.createFrom().nothing() (truly incomplete) - Add realmId/timestamp assertions to anonymous and not-yet-resolved tests - Add test verifying non-auth exceptions propagate Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@snazy Thanks for the review! I've updated to suppress |
| .subscribeAsCompletionStage() | ||
| .getNow(null)); | ||
| } catch (CompletionException e) { | ||
| if (e.getCause() instanceof AuthenticationFailedException) { |
There was a problem hiding this comment.
How about a holder approach?
We could start with PolarisPrincipalHolder but alter it to always receive the PolarisPrincipal after authentication completes... that is in AuthenticatingAugmentor.
The normal producer method would assert that the principal is set.
The events factory could deal with PolarisPrincipalHolder and get an optional PolarisPrincipal value via another getter.
This way most of the processing logic does not have to deal with async aspects.
WDYT?
PolarisEventMetadataFactory.create()callsgetDeferredIdentity().subscribeAsCompletionStage().getNow(null)to resolve the current user for event metadata. When called before authentication has completed(e.g., in pre-authentication filters), this can trigger the auth pipeline, which may fail with
CompletionExceptionwrapping anAuthenticationFailedException.Previously, this exception would propagate to the caller. For example, when the rate limiter filter emits an event before rejecting a request, the leaked
Completion exception gets mapped to 500 instead of the intended 429.
In general, since
useritself is an optional field, proceed the event dispatching with a null user instead of failing directly could increase the stability and while we can still see/notice the actual error through debug log.PolarisEventMetadataFactory.java: RefactorgetUser()to use newgetSecurityIdentity()helper withCompletionExceptionhandling and debug loggingPolarisEventMetadataFactoryTest.java: New unit test covering resolved identity, anonymous identity, not-yet-resolved identity, and failed deferred resolutionTest plan
PolarisEventMetadataFactoryTestpasses (4 cases)Checklist
CHANGELOG.md(if needed)site/content/in-dev/unreleased(if needed)