Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
This Jira issue is to discuss two changes that are, unfortunately, difficult to address separately:
- Separate all ZooKeeper coordination logic into its own module that can be tested in isolation.
- Evaluate using Apache Curator for coordination instead of our own logic.
I drafted a SIP, but it is still very much a work in progress. I’d like to hear opinions before I spend too much time on something people dislike.
From the initial draft of the SIP:
The main goal of this change is to allow better testing of the different ZooKeeper interactions related to coordination (leader election, queues, etc.). There are already some abstractions in place for lower-level operations (set-data, get-data, etc.; see DistribStateManager), so the idea is to have a new, related abstraction named CoordinationManager, where we could have some higher-level coordination-related classes, like LeaderRunner (Overseer), LeaderLatch (for shard leaders), etc. Curator comes into play because refactoring the existing code into these new abstractions would require reworking much of it, so we could instead consider using Curator, a library that has been mentioned many times in the past. While I don’t think this is required, it would make this transition and our code simpler (from what I could see; input from people with more Curator experience would be greatly appreciated).
While it is out of scope for this change, if the abstractions/interfaces are designed correctly, this could make it possible in the future to use something other than ZooKeeper for coordination, such as etcd or even an in-memory replacement for tests.
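To make the idea more concrete, here is a minimal sketch of what such an abstraction and an in-memory test replacement might look like. Only the names CoordinationManager and LeaderLatch come from the draft; the method names (`joinElection`, `hasLeadership`), the `InMemoryCoordinationManager` class, and the "oldest participant leads" rule are assumptions made for illustration, not a proposed final design.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical handle for one participant in an election (name taken from the draft). */
interface LeaderLatch extends AutoCloseable {
  /** True while this participant currently holds leadership. */
  boolean hasLeadership();

  /** Leave the election; narrowed to throw no checked exception in this sketch. */
  @Override
  void close();
}

/** Hypothetical entry point for higher-level coordination primitives. */
interface CoordinationManager {
  /** Join the election at the given path (assumed signature). */
  LeaderLatch joinElection(String electionPath, String participantId);
}

/** In-memory stand-in for tests: the oldest live participant on a path is the leader. */
final class InMemoryCoordinationManager implements CoordinationManager {
  // Participants per election path, in join order.
  private final Map<String, Deque<String>> participants = new HashMap<>();

  @Override
  public synchronized LeaderLatch joinElection(String electionPath, String participantId) {
    participants.computeIfAbsent(electionPath, p -> new ArrayDeque<>()).addLast(participantId);
    return new LeaderLatch() {
      @Override
      public boolean hasLeadership() {
        return isLeader(electionPath, participantId);
      }

      @Override
      public void close() {
        leave(electionPath, participantId);
      }
    };
  }

  private synchronized boolean isLeader(String path, String id) {
    Deque<String> queue = participants.get(path);
    return queue != null && id.equals(queue.peekFirst());
  }

  private synchronized void leave(String path, String id) {
    Deque<String> queue = participants.get(path);
    if (queue != null) {
      queue.remove(id); // the next participant in join order becomes leader
    }
  }
}
```

With interfaces along these lines, a test could join two participants on the same path, check that only the first reports leadership, close it, and then check that the second takes over, all without a running ZooKeeper. A ZooKeeper-backed implementation could presumably sit behind the same interface, for example by delegating to Curator's leader election recipes, though that mapping is untested here.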
There are still many open questions, and probably more that I haven’t thought of yet, but please let me know if you have any early feedback, especially if you’ve worked with Curator in the past.
Attachments
Refactor the Solr Zookeeper logic to use Apache Curator | Open | Unassigned