Description
While doing a rolling upgrade of a SolrCloud cluster from 5.3 to 5.4, I observed that upgraded nodes would not register their shards as active unless they were elected leader for the shard.
There were no errors, and the shards were fully up and responsive, but they never published any change from the "down" state.
This appears to be because the recovery process never starts: the ZK node containing the current leader can't be found, because its ZK path has changed.
Specifically, the leader data node changed from:
<collection>/leaders/<shard>
to
<collection>/leaders/<shard>/leader
It looks to me like this happened during SOLR-7844, perhaps accidentally.
At the very least, the "Migrating to Solr 5.4" section of the README should be updated with this info. A rolling upgrade of a collection with multiple replicas will see the number of active replicas degrade seriously as nodes are upgraded, and it's entirely possible that some shards will be reduced to a single active replica.
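To make the path change concrete, here is a minimal sketch of a lookup that tries the new location first and falls back to the old one, which is one way a mixed-version cluster could still find the leader. It is not Solr's code: the class `LeaderPathFallback`, its methods, and the `hasLeaderData` predicate (standing in for a real ZooKeeper read) are all hypothetical, and `collectionPath` is whatever prefix `<collection>` resolves to in your deployment.

```java
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper for illustration only; not part of Solr.
final class LeaderPathFallback {
    private LeaderPathFallback() {}

    // Location used after the change: <collection>/leaders/<shard>/leader
    static String newLeaderPath(String collectionPath, String shard) {
        return collectionPath + "/leaders/" + shard + "/leader";
    }

    // Location used before the change: <collection>/leaders/<shard>
    static String oldLeaderPath(String collectionPath, String shard) {
        return collectionPath + "/leaders/" + shard;
    }

    // hasLeaderData is an assumed stand-in for "this ZK node holds leader data".
    // The new path is checked first because, in the new layout, the old path
    // is merely the parent of the leader node.
    static Optional<String> findLeaderPath(String collectionPath, String shard,
                                           Predicate<String> hasLeaderData) {
        String newPath = newLeaderPath(collectionPath, shard);
        if (hasLeaderData.test(newPath)) {
            return Optional.of(newPath);
        }
        String oldPath = oldLeaderPath(collectionPath, shard);
        if (hasLeaderData.test(oldPath)) {
            return Optional.of(oldPath);
        }
        return Optional.empty();
    }
}
```

With a leader registered only at the old location, `findLeaderPath` returns the old path; with one registered at the new location, it returns the new path; with neither, it returns an empty `Optional`, which corresponds to the "leader can't be found" situation described above.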
Issue Links
- is duplicated by SOLR-8561 Add fallback to ZkController.getLeaderProps for a mixed 5.4-pre-5.4 deployments (Closed)