Skip to content

Fix deadlocks in request cache (unstable hashCode, re-entrant self-deadlock, mutable properties)#12166

Open
gnodet wants to merge 3 commits into
maven-4.0.xfrom
fix/identity-hashmap-request-cache
Open

Fix deadlocks in request cache (unstable hashCode, re-entrant self-deadlock, mutable properties)#12166
gnodet wants to merge 3 commits into
maven-4.0.xfrom
fix/identity-hashmap-request-cache

Conversation

@gnodet
Copy link
Copy Markdown
Contributor

@gnodet gnodet commented May 27, 2026

Summary

Fixes two deadlock vectors and one root-cause mutability bug in the Maven 4 request cache layer, responsible for ~80 timeout failures in Maven 4 compatibility testing (projects using maven-remote-resources-plugin hang indefinitely).

Root Cause Analysis

The hang occurs during parent POM resolution triggered by the remote-resources-plugin's lazy getProjects() call, which enters the compat resolver → projectBuilder.build() → Maven 4 ArtifactResolver → request cache. Two independent deadlock vectors exist:

1. Unstable hashCode() in nonCachedResults HashMap

AbstractRequestCache.requests() uses a local HashMap to coordinate batch results. The batch supplier puts results via put(), and the individualSupplier lambda retrieves them via containsKey().

ResolverRequest is a Java record whose hashCode() transitively includes RequestTrace.dataModelBuilderRequestsystemProperties. When systemProperties is null in the builder, DefaultModelBuilderRequest stores a live view of session.getSystemProperties() (backed by java.util.Properties) instead of a snapshot. If System.setProperty() is called between put() and containsKey(), the hash changes, the entry is stored in one bucket but looked up in another, and containsKey() returns false — the thread waits forever.

2. Re-entrant self-deadlock via CachingSupplier

When requests() stores CachingSupplier instances wrapping individualSupplier in the shared session cache, a re-entrant call from the same thread during batch computation can self-deadlock:

  1. requests([req])doCache(req, individualSupplier) creates CachingSupplier CS in cache
  2. Batch executes under synchronized(nonCachedResults)
  3. Batch computation triggers model building → parent POM resolution → request(req) → finds CS in cache
  4. cs.apply(req)synchronized(cs) is re-entrant (same thread) → value is null → calls individualSupplier
  5. individualSuppliersynchronized(nonCachedResults) (re-entrant) → containsKey(req) → false (batch not done) → wait() → releases nonCachedResults monitor
  6. DEADLOCK: the thread is blocked in wait(), but it is also the thread running the batch — notifyAll() only fires after the batch returns, which can never happen

Fix (3 commits)

Commit 1: IdentityHashMap for nonCachedResults

Switch from HashMap to IdentityHashMap in AbstractRequestCache.requests(). Since the code always uses the same object references for put() and containsKey() within a single requests() call, identity-based lookup is correct and immune to any hashCode() instability. This is a defense-in-depth measure: RequestTrace.data is typed as Object with no immutability guarantee, so any mutable data in the trace chain could trigger the same issue.

Commit 2: Re-entrant self-deadlock detection in CachingSupplier

Track the computing thread in CachingSupplier. When apply() detects re-entrant access from the same thread (the thread that is already computing the value), it throws CyclicCacheAccessException. AbstractRequestCache.request() catches this and falls back to computing directly with the caller's original supplier, bypassing the cache for that one call. The outer batch computation continues normally and sets the cached value for future callers.

Commit 3: Snapshot system/user properties in ModelBuilderRequest

DefaultModelBuilderRequest constructor used session.getSystemProperties() directly when systemProperties was null, storing a live view backed by java.util.Properties. Now always wraps in Map.copyOf() regardless of the source, ensuring the stored map is a true immutable snapshot. Same fix applied to userProperties.

Test plan

  • Reproduced locally: httpcomponents-stylecheck hangs indefinitely with Maven 4
  • Verified fix: same project builds in 1.6s after the changes
  • Maven's own build (mvn verify -DskipTests) passes
  • Existing AbstractRequestCacheTest passes

Claude Code on behalf of Guillaume Nodet

The `requests()` method uses a HashMap to coordinate batch results between
the batch supplier and the individual CachingSupplier instances. When a
request object's hashCode() changes between HashMap.put() and
HashMap.containsKey() (e.g., because ResolverRequest includes a
RequestTrace with mutable ModelBuilderRequest data), the key is stored
in one bucket but looked up in another, causing containsKey() to return
false. The individualSupplier then waits forever for a result that is
already in the map but unreachable.

Switch to IdentityHashMap which uses System.identityHashCode() and
reference equality (==), both of which are stable regardless of
mutable object state. This is safe because the code always uses the
same object reference for put and get.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gnodet gnodet added this to the 4.0.0-rc-6 milestone May 28, 2026
…mputation

When requests() stores CachingSuppliers wrapping individualSupplier in the
session cache, a re-entrant call from the same thread can self-deadlock:
the batch computation triggers request() which finds the cached entry,
calls individualSupplier.apply() which wait()s for the batch to complete,
but the batch can't complete because this thread is now blocked.

Track the computing thread in CachingSupplier and throw
CyclicCacheAccessException on re-entrant access. The request() method
catches this and falls back to computing directly with the caller's
supplier, bypassing the cache for that one call.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gnodet added a commit that referenced this pull request May 28, 2026
Cherry-pick from fix/identity-hashmap-request-cache branch.
When requests() stores CachingSuppliers wrapping individualSupplier in
the session cache, a re-entrant call from the same thread self-deadlocks.
Track the computing thread and fall back to direct computation on
re-entrant access.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When systemProperties or userProperties are null, the constructor fell
through to session.getSystemProperties() without Map.copyOf(). That
returns a live view over java.util.Properties, whose hashCode() mutates
if System.setProperty() is called. Since ModelBuilderRequest ends up as
RequestTrace.data in parent traces, and ResolverRequest.hashCode()
recurses through the entire trace chain, this made cache keys unstable.

Always wrap in Map.copyOf() regardless of the source.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gnodet gnodet changed the title Fix deadlock in AbstractRequestCache.requests() due to unstable hashCode Fix deadlocks in request cache (unstable hashCode, re-entrant self-deadlock, mutable properties) May 29, 2026
gnodet added a commit that referenced this pull request May 29, 2026
)

Cherry-pick from fix/identity-hashmap-request-cache branch.
DefaultModelBuilderRequest stored a live view of session properties
backed by java.util.Properties when systemProperties was null. Now
always wraps in Map.copyOf() to ensure immutable snapshot.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant