Solr / SOLR-7819

ZkController.ensureReplicaInLeaderInitiatedRecovery does not respect retryOnConnLoss


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.2, 5.2.1
    • Fix Version/s: 5.4, 6.0
    • Component/s: SolrCloud

    Description

      SOLR-7245 added a retryOnConnLoss parameter to ZkController.ensureReplicaInLeaderInitiatedRecovery so that indexing threads do not hang on ZooKeeper operations during a partition. However, some of those changes were unintentionally reverted by SOLR-7336 in 5.2.

      I found this while running Jepsen tests on 5.2.1, where a hung update managed to put a leader into a 'down' state (I'm still investigating and will open a separate issue for that problem).
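
      To illustrate what honoring a retryOnConnLoss flag means in general, here is a minimal, self-contained sketch. It is not Solr's code: the class RetryOnConnLossSketch, the ZkOperation interface, the ConnectionLossException type, and the maxAttempts and backoffMs parameters are all hypothetical names invented for this example. The idea is only that a caller passing false should fail fast on a connection loss instead of blocking the calling (indexing) thread in a retry loop.

```java
class RetryOnConnLossSketch {

    /** Hypothetical stand-in for a ZooKeeper connection-loss error. */
    static class ConnectionLossException extends Exception {
        ConnectionLossException(String message) {
            super(message);
        }
    }

    /** Hypothetical ZooKeeper operation that may fail with a connection loss. */
    interface ZkOperation<T> {
        T call() throws ConnectionLossException;
    }

    /**
     * Runs the operation. When retryOnConnLoss is false the operation is
     * attempted exactly once, so the caller never waits out a partition.
     * When it is true, up to maxAttempts attempts are made with a fixed pause.
     */
    static <T> T execute(ZkOperation<T> op, boolean retryOnConnLoss,
                         int maxAttempts, long backoffMs)
            throws ConnectionLossException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        int attempts = retryOnConnLoss ? maxAttempts : 1;
        ConnectionLossException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return op.call();
            } catch (ConnectionLossException e) {
                last = e;
                // Pause only if another attempt will follow.
                if (i + 1 < attempts) {
                    Thread.sleep(backoffMs);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated partition: every attempt loses the connection.
        ZkOperation<String> alwaysPartitioned = () -> {
            calls[0]++;
            throw new ConnectionLossException("simulated partition");
        };
        try {
            execute(alwaysPartitioned, false, 5, 10L);
        } catch (ConnectionLossException e) {
            System.out.println("failed fast after " + calls[0] + " attempt(s)");
        }
    }
}
```

      With retryOnConnLoss set to false the simulated operation is attempted once and the exception propagates immediately. The kind of regression described above corresponds to a code path that ignores the flag and behaves as if it were always true.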

      Attachments

        1. SOLR-7819.patch
          9 kB
          Shalin Shekhar Mangar
        2. SOLR-7819.patch
          30 kB
          Shalin Shekhar Mangar
        3. SOLR-7819.patch
          30 kB
          Shalin Shekhar Mangar
        4. SOLR-7819.patch
          30 kB
          Shalin Shekhar Mangar
        5. SOLR-7819.patch
          44 kB
          Shalin Shekhar Mangar
        6. SOLR-7819.patch
          43 kB
          Shalin Shekhar Mangar


          People

            Assignee: Shalin Shekhar Mangar
            Reporter: Shalin Shekhar Mangar
            Votes: 0
            Watchers: 5

