Description
This Jira tracks the sub-tasks on the path to removing the Overseer component from SolrCloud, so that all nodes are able to do the work currently done by the Overseer.
See detailed description in this doc.
Copying (edited) from the above doc:
The motivations for removing the Overseer include:
- Single-threaded state change processing is slow and doesn’t scale,
- Communication between cluster nodes and the Overseer uses ZooKeeper as a queueing mechanism, which is not a good idea,
- Nodes talking to the Overseer (and the Overseer then talking to itself) via ZooKeeper is inefficient and adds latency,
- Collection API scalability is poor: a single node processes the commands for all Collections, and that processing also depends on the single-threaded consumption of the state change queue,
- The code supporting the Overseer in SolrCloud is complex (election, queue management, recovery, etc.).
The general idea is that the SolrCloud cluster already has a central point: ZooKeeper. A second central point (the Overseer) might not be necessary, because nodes can interact directly with ZooKeeper and synchronize more efficiently through optimistic locking using “conditional updates” (a.k.a. compare and swap, or CAS).
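To illustrate the optimistic locking idea, here is a minimal sketch of a read, modify, conditionally write loop. It is not Solr code: `CasSketch`, `VersionedNode`, `Versioned`, `update` and `runConcurrentIncrements` are hypothetical names, and the in-memory `VersionedNode` only stands in for a ZooKeeper znode. With a real ZooKeeper client the conditional write would be a `setData(path, data, expectedVersion)` call, which fails with a bad-version error when another writer got there first.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.UnaryOperator;

public class CasSketch {
    /** Immutable snapshot of a node: its content plus the version it was read at. */
    static final class Versioned {
        final String data;
        final int version;
        Versioned(String data, int version) { this.data = data; this.version = version; }
    }

    /** Hypothetical in-memory stand-in for a single ZooKeeper znode. */
    static final class VersionedNode {
        private String data;
        private int version = 0;
        VersionedNode(String initial) { this.data = initial; }

        synchronized Versioned read() { return new Versioned(data, version); }

        /** Conditional update: succeeds only if nobody wrote since expectedVersion. */
        synchronized boolean compareAndSet(int expectedVersion, String newData) {
            if (version != expectedVersion) return false; // someone else won the race
            data = newData;
            version++;
            return true;
        }
    }

    /** Optimistic loop: read, compute, conditionally write, retry on conflict. Returns the attempt count. */
    static int update(VersionedNode node, UnaryOperator<String> change) {
        int attempts = 0;
        while (true) {
            attempts++;
            Versioned current = node.read();
            String updated = change.apply(current.data);
            if (node.compareAndSet(current.version, updated)) return attempts;
            // Conflict: loop re-reads the fresh state and recomputes the change.
        }
    }

    /** Runs `writers` threads that each increment a shared counter `perWriter` times. */
    static int runConcurrentIncrements(int writers, int perWriter) {
        VersionedNode node = new VersionedNode("0");
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        for (int w = 0; w < writers; w++) {
            pool.submit(() -> {
                for (int i = 0; i < perWriter; i++) {
                    update(node, s -> Integer.toString(Integer.parseInt(s) + 1));
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return Integer.parseInt(node.read().data);
    }

    public static void main(String[] args) {
        // No increment is lost even though no writer ever holds a long-lived lock.
        System.out.println(runConcurrentIncrements(4, 1000));
    }
}
```

In this sketch no writer waits on a central coordinator: a writer that loses a race simply re-reads and retries, which is the property the proposal relies on when replacing the Overseer queue with direct conditional updates.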
Issue Links
- is related to
  - SOLR-11465 Overseer should process independent messages in parallel (Open)
1. Remove Overseer ClusterStateUpdater | Closed | Ilan Ginzburg
2. Distribute Collection API command execution | Closed | Ilan Ginzburg
3. Refactor: separate Collection API commands from Overseer and message handling logic | Closed | Ilan Ginzburg
4. Remove OverseerConfigSetMessageHandler; always do distributed mode | Open | Unassigned