-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[federation] Introduced MultiClusterHelixBrokerStarter to start broker in federated mode #17421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[federation] Introduced MultiClusterHelixBrokerStarter to start broker in federated mode #17421
Conversation
…cross-cluster federated broker
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #17421 +/- ##
============================================
+ Coverage 55.68% 63.19% +7.50%
- Complexity 703 1474 +771
============================================
Files 2463 3156 +693
Lines 139013 188179 +49166
Branches 22166 28796 +6630
============================================
+ Hits 77407 118912 +41505
- Misses 55096 60056 +4960
- Partials 6510 9211 +2701
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| * This class is responsible for managing routing managers and providing the appropriate | ||
| * routing manager based on query options (e.g., whether federation is enabled). | ||
| */ | ||
| public class MultiClusterRoutingContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only a sink right now. would you be adding logic to this in a future PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
| _threadAccountant.startWatcherTask(); | ||
|
|
||
| // TODO: Hook multiClusterRoutingContext into request handlers subsequently. | ||
| MultiClusterRoutingContext multiClusterRoutingContext = getMultiClusterRoutingContext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this initialized in base broker starter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to initialize this in BaseBrokerStarter since it would be passed to all the request handlers subsequently. Alternatively, we could separate out initializing the request handlers in a separate method which can be overriden by MultiClusterHelixBrokerStarter. (The only thing there is we would be passing nulls for multiClusterRoutingContext in BaseBrokerStarter regardless).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I thought about this. I think we can leave this be. In the next PR, can you change the local var to an Optional.ofNullable? Just so it is clear for other folks that this will be null in almost all cases (except federated)
Summary
This change introduces foundational multi-cluster support to the Pinot broker, enabling a single broker to connect to and route queries across multiple Pinot clusters.
Key highlights
BaseBrokerStarterto improve extensibility by extracting initialization steps (spectator Helix manager, routing manager, cluster change mediator) into overridable hooks.MultiClusterHelixBrokerStarterto:MultiClusterRoutingManager.MultiClusterRoutingContext, a shared context object that encapsulates table caches and routing managers for local vs. federated routing.pinot.remote.cluster.namespinot.remote.zk.server.<clusterName>This PR lays the groundwork for cross-cluster query federation while preserving existing single-cluster behavior. Subsequent PRs will add support for SSE and MSE cross-cluster queries.
Testing
Added an end-to-end integration test (
MultiClusterIntegrationTest) validating that two isolated clusters can start with multi-cluster brokers, ingest data independently, and remain queryable.