Solr / SOLR-5593

shard leader loss due to ZK session expiry

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      The problem we saw was that the shard leader ceased to be shard leader (in our case because its ZooKeeper session expired). The followers then rejected update requests (the ZkStateReader getLeaderRetry call in DistributedUpdateProcessor setupRequest), and the leader asked them to recover (DistributedUpdateProcessor doFinish). The followers published themselves as recovering (CoreAdminHandler handleRequestRecoveryAction), and the loss of the shard leader triggered an election in which none of the followers became leader because of their recovering state (ShardLeaderElectionContext shouldIBeLeader). The former shard leader did not become shard leader again either, because its new seq number placed it after the existing replicas (LeaderElector checkIfIamLeader, seq <= intSeqs.get(0)).
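
The stuck election described above can be illustrated with a small, self-contained sketch. This is a simplified single-round model, not Solr's actual code: the class `LeaderElectionSketch`, its `Candidate` type, the replica names, and the sequence numbers are all hypothetical, and rejoining or retrying the election is deliberately ignored. It only assumes the two conditions named in the description: the candidate must hold the lowest seq number, and it must not be in the recovering state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model of one election round; not Solr's implementation.
class LeaderElectionSketch {

    enum State { ACTIVE, RECOVERING }

    // One election candidate: replica name, election sequence number, published state.
    static final class Candidate {
        final String name;
        final int seq;
        final State state;

        Candidate(String name, int seq, State state) {
            this.name = name;
            this.seq = seq;
            this.state = state;
        }
    }

    // Modeled on the "seq <= intSeqs.get(0)" check: only the lowest seq may try to lead.
    static boolean isFirstInLine(Candidate c, List<Candidate> all) {
        int lowest = all.stream().mapToInt(x -> x.seq).min().getAsInt();
        return c.seq <= lowest;
    }

    // Modeled on the shouldIBeLeader idea: a recovering replica declines leadership.
    static boolean shouldBeLeader(Candidate c) {
        return c.state == State.ACTIVE;
    }

    // A candidate wins this round only if it passes both checks.
    static Optional<Candidate> electLeader(List<Candidate> candidates) {
        return candidates.stream()
                .filter(c -> isFirstInLine(c, candidates))
                .filter(LeaderElectionSketch::shouldBeLeader)
                .findFirst();
    }

    // The situation from the description: followers are recovering, and the
    // former leader rejoined with a new, higher seq after its session expired.
    static List<Candidate> stuckScenario() {
        List<Candidate> candidates = new ArrayList<>();
        candidates.add(new Candidate("follower1", 1, State.RECOVERING));
        candidates.add(new Candidate("follower2", 2, State.RECOVERING));
        candidates.add(new Candidate("formerLeader", 3, State.ACTIVE));
        return candidates;
    }

    public static void main(String[] args) {
        Optional<Candidate> leader = electLeader(stuckScenario());
        System.out.println("leader elected: " + leader.isPresent());
    }
}
```

In this model the first follower is first in line but declines because it is recovering, while the former leader is active but not first in line, so the round ends without a leader.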

            People

            • Assignee:
              markrmiller@gmail.com Mark Miller
              Reporter:
              cpoerschke Christine Poerschke
