Solr / SOLR-5325

zk connection loss causes overseer leader loss

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.3, 4.4, 4.5
    • Fix Version/s: 4.5.1, 4.6, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      When the Solr overseer leader experienced temporary ZooKeeper connectivity problems, it stopped processing overseer queue events.

      This first happened when the external ZooKeeper ensemble lost quorum because too many ZooKeeper nodes had been stopped (similar to SOLR-5199). The second time, enough ZooKeeper nodes were running, but they were holding ZooKeeper leader elections and therefore refused connections. The elections took several seconds; we were using the default zookeeper.cnxTimeout=5s value, and that timeout was hit for one ensemble member.
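
      To illustrate the failure mode, here is a minimal sketch. It is not Solr's Overseer code and not the attached patch; every name in it (ConnectionLoss, fragile_consumer, resilient_consumer, make_flaky_fetch) is hypothetical. It contrasts a queue consumer that gives up on the first connection loss with one that retries and keeps draining the queue.

```python
class ConnectionLoss(Exception):
    """Hypothetical stand-in for a transient coordination-service error."""


def fragile_consumer(queue, fetch):
    """Stops for good on the first connection loss (the symptom described)."""
    processed = []
    try:
        while queue:
            processed.append(fetch(queue))
    except ConnectionLoss:
        pass  # the loop exits here; remaining events are never processed
    return processed


def resilient_consumer(queue, fetch, max_retries=5):
    """Retries after a connection loss instead of abandoning the queue."""
    processed = []
    retries = 0
    while queue:
        try:
            processed.append(fetch(queue))
            retries = 0  # a successful fetch resets the retry budget
        except ConnectionLoss:
            retries += 1
            if retries > max_retries:
                raise  # give up only after repeated consecutive failures
            # A real consumer would likely wait or back off before retrying.
    return processed


def make_flaky_fetch(fail_on_calls):
    """Build a fetch function that raises ConnectionLoss on given call numbers."""
    calls = {"n": 0}

    def fetch(queue):
        calls["n"] += 1
        if calls["n"] in fail_on_calls:
            raise ConnectionLoss("simulated connection loss")
        return queue.pop(0)

    return fetch
```

      With a simulated connection loss on the second fetch, the fragile consumer handles only the first event, while the resilient one processes all of them.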

        Attachments

        1. SOLR-5325.patch
          11 kB
          Mark Miller
        2. SOLR-5325.patch
          7 kB
          Mark Miller
        3. SOLR-5325.patch
          14 kB
          Christine Poerschke

              People

              • Assignee:
                markrmiller@gmail.com Mark Miller
                Reporter:
                cpoerschke Christine Poerschke
              • Votes:
                0
                Watchers:
                7
