Skip to content

Conversation

@ejona86
Copy link
Member

@ejona86 ejona86 commented Dec 30, 2025

These other implementations pre-date FixedResultPicker, and are no longer needed.

@ejona86 ejona86 requested a review from shivaspeaks December 30, 2025 17:09
These other implementations pre-date FixedResultPicker, and are no
longer needed.
gRPC-LB's tests failed when using FixedResultPicker because it didn't
transition to READY. This was because GrpclbState.maybeUpdatePicker()
didn't consider the ConnectivityState when checking if anything had
changed. The old PickFirstLoadBalancer.Picker didn't implement equals()
so every update was considered different, ignoring the connectivity
state. PickFirstLeafLoadBalancer wasn't impacted because it didn't pick
a subchannel when in CONNECTING, and neither did PickFirstLoadBalancer
after the first update.

This is fixed on both sides: PickFirstLoadBalancer no londer returns the
useless subchannel when picking during CONNECTING, and gRPC-LB now
checks the connectivity state.

This now causes what appears to be a bug in RLS. That is still being
investigated.
@ejona86 ejona86 force-pushed the use-fixedresultpicker branch from 13240f3 to 9287f85 Compare January 3, 2026 00:11
lb_serverStatusCodeConversion() has been misleading since 42e1829
("xds: Do RLS fallback policy eagar start"). At that point, the
subchannel it marked as READY was for the default target's policy, not
the policy for wilderness. However, since old PF policy provided a
subchannel when CONNECTING, everything was "fine", but RLS would
mistakenly count toward target_picks.

This demonstrates that target_picks has been broken since it was
introduced for PF, as PF relied on the caller to avoid the picker when
it was CONNECTING. This may have been hard to notice in production, as
the metrics become correct as soon as the connection is established, so
as long as you use the channel for a while, the duplicate counting would
become a small percentage of the overall amount.
@ejona86
Copy link
Member Author

ejona86 commented Jan 3, 2026

@kannanjgithub, FYI, the second commit here ("Fix grpclb incompatibility") ends up fixing an RLS metric bug. See the "Fix RLS test" for a description of the bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant