Solr / SOLR-7245

Temporary ZK election or connection loss should not stall indexing due to LIR

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      If there's a ZK election or connection loss and the leader is unable to reach a replica, indexing currently stalls until the ZK connection is re-established, because of the leader-initiated recovery (LIR) process. This shouldn't happen, and it partly regresses the work done in SOLR-5577.

      I will try to get to this, but if someone beats me to it, feel free.

      1. SOLR-7245.patch
        16 kB
        Ramkumar Aiyengar
      2. SOLR-7245.patch
        15 kB
        Ramkumar Aiyengar

          Activity

          Ramkumar Aiyengar added a comment -

          I need to think this through a bit more, but here's a starting patch which has the update path make a best effort at updating ZK without stalling if disconnected. Comments welcome.
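
For illustration only, and not taken from the attached patch: a minimal Python sketch of the "best effort, but don't stall" idea described above. Everything here is hypothetical (`FakeZk`, the `/lir/...` path layout, and the deferred-retry list are assumptions made for the example, not Solr or ZooKeeper APIs). The point is simply that the update path attempts the ZK write once and returns right away if the connection is down, instead of blocking until ZK comes back.

```python
class FakeZk:
    """Hypothetical stand-in for a ZooKeeper client (not a real client API)."""

    def __init__(self):
        self.connected = True
        self.state = {}

    def set_data(self, path, value):
        # A real client would block or retry here; this stub fails fast.
        if not self.connected:
            raise ConnectionError("ZK connection lost")
        self.state[path] = value


def mark_replica_down_best_effort(zk, replica, pending):
    """Try once to record the replica as down; never block the caller.

    Returns True if ZK was updated, False if the write was deferred.
    """
    path = f"/lir/{replica}"  # assumed path layout, for illustration only
    try:
        zk.set_data(path, "down")
        return True
    except ConnectionError:
        # Assumption: remember the write so it can be retried later.
        pending.append(path)
        return False


def flush_pending(zk, pending):
    """Retry deferred writes in order; stop at the first failure."""
    while pending:
        try:
            zk.set_data(pending[0], "down")
        except ConnectionError:
            return False
        pending.pop(0)
    return True
```

With `zk.connected = False`, `mark_replica_down_best_effort` returns `False` immediately and queues the path; once the connection is back, `flush_pending` applies the queued write. Whether the real patch retries later or just skips the update is not stated in this thread.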

          Ramkumar Aiyengar added a comment -

          Updated the patch to account for changes since. Tests seem to pass, but would appreciate a second look, I haven't dealt much with LIR till now.

          Is there an existing test anyone knows of which covers this area? BasicZkTest doesn't handle cloud setups. Maybe one of the chaos monkey tests?

          Mark Miller added a comment -

          The ChaosMonkey tests could hit such a case, but there is no real guarantee - those tests need love to make sure they hit everything we hope they hit.

          Timothy Potter, any chance you can take a gander at this?

          Timothy Potter added a comment -

          Taking a look, thanks for the heads up.

          Timothy Potter added a comment -

          Patch looks good at first look, but wanted to make sure we coordinate with Shalin Shekhar Mangar on SOLR-7109

          Ramkumar Aiyengar added a comment -

          "Patch looks good at first look, but wanted to make sure we coordinate with Shalin Shekhar Mangar on SOLR-7109"

          Hopefully it should be. I actually noticed this when I was reviewing the patch for that issue. But Shalin, let me know if otherwise.

          ASF subversion and git services added a comment -

          Commit 1668274 from Ramkumar Aiyengar in branch 'dev/trunk'
          [ https://svn.apache.org/r1668274 ]

          SOLR-7245: Temporary ZK election or connection loss should not stall indexing due to LIR

          Shalin Shekhar Mangar added a comment -

          +1 LGTM.

          ASF subversion and git services added a comment -

          Commit 1668479 from Ramkumar Aiyengar in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1668479 ]

          SOLR-7245: Temporary ZK election or connection loss should not stall indexing due to LIR

          Ramkumar Aiyengar added a comment -

          Thanks everyone. I am looking at adding ZK restarts to the chaos monkey tests, but that can be a separate issue, since it has wider coverage than just this issue.

          Timothy Potter added a comment -

          Bulk close after 5.1 release


            People

            • Assignee: Ramkumar Aiyengar
            • Reporter: Ramkumar Aiyengar
            • Votes: 0
            • Watchers: 5
