Solr / SOLR-5644

SplitShard does not handle not finding a shard leader well.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      In OverseerCollectionProcessor:

      // find the leader for the shard
      Replica parentShardLeader = clusterState.getLeader(collectionName, slice);

      This returns null if there is no current leader. The code that follows does not handle that case and throws a NullPointerException instead.
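      The failure mode can be sketched without Solr itself. In the minimal example below, `LeaderLookup`, `findLeader`, and `requireLeader` are hypothetical names, not Solr APIs, and a plain map stands in for the cluster state. It assumes only what the description says: the lookup returns null when no leader is known. The guarded variant turns that null into a descriptive exception instead of letting a later dereference fail.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the cluster state; not a Solr class.
class LeaderLookup {
    // Shard name -> leader node name. A missing entry means "no current leader".
    private final Map<String, String> leaders = new HashMap<>();

    void setLeader(String shard, String node) {
        leaders.put(shard, node);
    }

    // Mirrors the behavior described above: returns null when no leader is known.
    String findLeader(String shard) {
        return leaders.get(shard);
    }

    // Guarded variant: fail with a clear message instead of a later NPE.
    String requireLeader(String shard) {
        String leader = findLeader(shard);
        if (leader == null) {
            throw new IllegalStateException("No leader found for shard: " + shard);
        }
        return leader;
    }
}
```

      A caller that uses `requireLeader` gets an error naming the shard, which is easier to diagnose than a bare NullPointerException. It still fails immediately, though, so it does not cover a leader election that is merely in progress.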

      1. SOLR-5644.patch
        2 kB
        Anshum Gupta
      2. SOLR-5644.patch
        2 kB
        Anshum Gupta

        Issue Links

          Activity

          Anshum Gupta added a comment -

          A basic fix that retries for 10 seconds and throws an exception if it still doesn't have a leader.

          Mark Miller added a comment -

          I don't think that clusterstate object will ever be updated?

          What about changing the splitshard method to take zkstatereader and use zkstatereader#getleaderretry?

          Anshum Gupta added a comment -

          My bad! Had that, removed that.

          Will just put up another patch.

          Anshum Gupta added a comment -

          Using zkStateReader.getLeaderRetry(). This should get and use the updated clusterstate.

          Any suggestions on whether to keep the 10s timeout for getting a leader or reduce it to the default for getLeaderRetry()?
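          The retry-with-timeout idea discussed in this thread can be sketched roughly as follows. This is not the implementation of `ZkStateReader.getLeaderRetry()`; `LeaderRetry`, `waitForLeader`, and the `freshLookup` supplier are hypothetical names used only for illustration. The sketch assumes the caller provides a lookup that re-reads the current state on every call, since polling a cluster state object that is never updated would return the same null each time.

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of Solr.
class LeaderRetry {
    // Polls freshLookup until it yields a non-null leader or timeoutMs elapses.
    // freshLookup must fetch the latest state on each call, not a cached snapshot.
    static String waitForLeader(Supplier<String> freshLookup, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            String leader = freshLookup.get();
            if (leader != null) {
                return leader;
            }
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException(
                        "No leader found within " + timeoutMs + " ms");
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for a leader", e);
            }
        }
    }
}
```

          With the 10-second timeout discussed here, a call might look like `LeaderRetry.waitForLeader(() -> lookup.findLeader("shard1"), 10_000, 50)`, where `lookup` is any object whose `findLeader` reads live state. The 50 ms poll interval is an arbitrary illustrative value.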

          Mark Miller added a comment -

          10 seems fine to me.

          ASF subversion and git services added a comment -

          Commit 1565555 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1565555 ]

          SOLR-5644: SplitShard does not handle not finding a shard leader well

          ASF subversion and git services added a comment -

          Commit 1565556 from shalin@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1565556 ]

          SOLR-5644: SplitShard does not handle not finding a shard leader well

          Shalin Shekhar Mangar added a comment -

          Thanks Mark and Anshum.

          The same getLeaderRetry should be used in migrate action as well. I'll open an issue and fix it.


            People

            • Assignee: Shalin Shekhar Mangar
            • Reporter: Mark Miller
            • Votes: 0
            • Watchers: 5
