SOLR-9504: A replica with an empty index becomes the leader even when other more qualified replicas are in line

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 7.0
    • Fix Version/s: 6.3, 7.0
    • Component/s: SolrCloud
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:

      Description

      I haven't tried branch_6x or any release yet, but this is trivially reproducible on master with the following steps; a rough SolrJ sketch of the scriptable parts follows the list:

      1. Start two solr nodes
      2. Create a collection with 1 shard, 1 replica so that one node is empty.
      3. Index some documents
      4. Shutdown the leader node
      5. Use the addreplica API to create a replica of the collection on the still-running node. For some reason this API hangs until you restart the other node (possibly a bug in itself), but do not wait for the API to complete.
      6. Restart the former leader node
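
      For reference, here is a rough SolrJ sketch of steps 2, 3 and 5 (the node stop/start in steps 4 and 6 still has to be done from the command line). The collection name, configset, ZooKeeper port and async id are assumptions based on the default two-node getting-started setup, and the exact SolrJ calls may differ slightly between branches:

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.common.SolrInputDocument;

public class EmptyReplicaRepro {
  public static void main(String[] args) throws Exception {
    // Assumes the embedded ZooKeeper of the getting-started example (port 9983).
    try (CloudSolrClient client =
             new CloudSolrClient.Builder().withZkHost("localhost:9983").build()) {

      // Step 2: one shard, one replica, so only one of the two nodes hosts an index.
      CollectionAdminRequest
          .createCollection("gettingstarted", "data_driven_schema_configs", 1, 1)
          .process(client);

      // Step 3: index a few documents and commit.
      for (int i = 0; i < 10; i++) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-" + i);
        client.add("gettingstarted", doc);
      }
      client.commit("gettingstarted");

      // Step 4: stop the leader node from the command line, e.g. bin/solr stop -p 8983.

      // Step 5: add a replica on the surviving node. The call can hang, so submit it
      // asynchronously instead of waiting for it to complete.
      CollectionAdminRequest.addReplicaToShard("gettingstarted", "shard1")
          .processAsync("add-replica-1", client);

      // Step 6: restart the former leader node and check which replica won the election.
    }
  }
}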

      You'll find that the replica with 0 docs has become the leader. The former leader recovers from the new leader without replicating any index files; it still has the old index, which contains some docs.

      This is from the logs of the 0 doc replica:

      713102 INFO  (zkCallback-4-thread-5-processing-n:127.0.1.1:7574_solr) [   ] o.a.s.c.c.ZkStateReader Updating data for [gettingstarted] from [9] to [10]
      714377 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.c.ShardLeaderElectionContext Enough replicas found to continue.
      714377 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.c.ShardLeaderElectionContext I may be the new leader - try and sync
      714377 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.c.SyncStrategy Sync replicas to http://127.0.1.1:7574/solr/gettingstarted_shard1_replica2/
      714380 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.u.PeerSync PeerSync: core=gettingstarted_shard1_replica2 url=http://127.0.1.1:7574/solr START replicas=[http://127.0.1.1:8983/solr/gettingstarted_shard1_replica1/] nUpdates=100
      714381 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.u.PeerSync PeerSync: core=gettingstarted_shard1_replica2 url=http://127.0.1.1:7574/solr DONE.  We have no versions.  sync failed.
      714382 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.c.SyncStrategy Leader's attempt to sync with shard failed, moving to the next candidate
      714382 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.c.ShardLeaderElectionContext We failed sync, but we have no versions - we can't sync in that case - we were active before, so become leader anyway
      714387 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.c.ShardLeaderElectionContextBase Creating leader registration node /collections/gettingstarted/leaders/shard1/leader after winning as /collections/gettingstarted/leader_elect/shard1/election/96579592334475268-core_node2-n_0000000001
      714398 INFO  (qtp110456297-15) [c:gettingstarted s:shard1 r:core_node2 x:gettingstarted_shard1_replica2] o.a.s.c.ShardLeaderElectionContext I am the new leader: http://127.0.1.1:7574/solr/gettingstarted_shard1_replica2/ shard1
      

      It basically tries to sync, but it has no versions, and because it thinks it was active before (even though it is a new core starting up for the first time), it becomes the leader and publishes itself as active.
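
      To make the decision path in the log easier to follow, here is a simplified, hypothetical paraphrase of the choice the candidate makes. The class, method and parameter names below are illustrative only and are not Solr's actual ShardLeaderElectionContext/SyncStrategy API:

final class LeaderElectionSketch {
  /**
   * Hypothetical paraphrase of the pre-SOLR-9504 behaviour shown in the log:
   * a candidate that failed PeerSync, has no versions of its own, and believes
   * it was active before takes leadership anyway.
   */
  static boolean shouldBecomeLeader(boolean syncSucceeded,
                                    boolean hasAnyVersions,
                                    boolean wasActiveBefore) {
    if (syncSucceeded) {
      return true;                  // normal case: sync with peers worked
    }
    if (!hasAnyVersions && wasActiveBefore) {
      // "We failed sync, but we have no versions - we can't sync in that case -
      //  we were active before, so become leader anyway"
      return true;                  // the problematic branch
    }
    return false;                   // otherwise defer to the next candidate
  }
}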

      Attachments

      1. SOLR-9504.patch (19 kB) by Shalin Shekhar Mangar

        Activity

        shalinmangar Shalin Shekhar Mangar added a comment -

        FYI Mark Miller, Yonik Seeley
        markrmiller@gmail.com Mark Miller added a comment -

        I think this one is known. There is another JIRA issue around this somewhere, and it was understood to be an ugly limitation when another bug in this area was fixed. I had meant to add something to PeerSync that would let you check whether another replica looked better because it wasn't empty, but I never got to it.

        shalinmangar Shalin Shekhar Mangar added a comment -

        Mark Miller - The behavior when the leader vote wait expires is well known, but for it to happen before that period expires is a surprise (at least to me). Perhaps instead of just giving up when the leader candidate has no versions, it could request versions from its peers anyway and rejoin the election if others have some?

        shalinmangar Shalin Shekhar Mangar added a comment -

        Whoops! You wrote the same thing that I did. I'll work on adding such a check to peer sync.

        shalinmangar Shalin Shekhar Mangar added a comment -

        Longer term, we need to work on a bi-directional sync during recovery to really solve these kinds of issues.

        shalinmangar Shalin Shekhar Mangar added a comment -

        Patch with a test that fails without the fix.

        Basically, when we would otherwise bail out because we have no versions, we peek at the other replicas. If even one of them has versions, we return this bit of information to ShardLeaderElectionContext.runLeaderProcess and rejoin the election; otherwise we proceed as before. The hacky bit is that there is now a PeerSyncResult class which has a success flag as well as an optional otherHasVersions flag.

        I'm going to run some tests in a loop to ensure I haven't broken anything.
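
        For readers following along, a rough sketch of the shape described above is below. Field and method names follow the comment rather than the committed patch, and the leader-election usage (rejoinElection/becomeLeader) uses hypothetical helpers, not Solr methods:

import java.util.Optional;

// Rough sketch only; names follow the comment above, not necessarily the patch.
final class PeerSyncResult {
  private final boolean success;
  private final Optional<Boolean> otherHasVersions; // empty when peers were never asked

  PeerSyncResult(boolean success, Boolean otherHasVersions) {
    this.success = success;
    this.otherHasVersions = Optional.ofNullable(otherHasVersions);
  }

  boolean isSuccess() { return success; }
  Optional<Boolean> getOtherHasVersions() { return otherHasVersions; }
}

// Illustrative use in the leader-election path:
//
//   PeerSyncResult result = syncStrategy.sync(...);
//   if (result.isSuccess()) {
//     becomeLeader();
//   } else if (result.getOtherHasVersions().orElse(false)) {
//     rejoinElection();   // some other replica has versions; let it win instead
//   } else {
//     becomeLeader();     // nobody has versions; proceed as before
//   }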

        jira-bot ASF subversion and git services added a comment -

        Commit ce24de5cd65726dd9593512ec4082ba81b9d7801 in lucene-solr's branch refs/heads/master from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ce24de5 ]

        SOLR-9504: A replica with an empty index becomes the leader even when other more qualified replicas are in line

        jira-bot ASF subversion and git services added a comment -

        Commit effd22457691420982534f47ee71cd52ef64b8b9 in lucene-solr's branch refs/heads/branch_6x from Shalin Shekhar Mangar
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=effd224 ]

        SOLR-9504: A replica with an empty index becomes the leader even when other more qualified replicas are in line

        (cherry picked from commit ce24de5)

        shalinmangar Shalin Shekhar Mangar added a comment -

        Closing after 6.3.0 release.


          People

          • Assignee: shalinmangar Shalin Shekhar Mangar
          • Reporter: shalinmangar Shalin Shekhar Mangar
          • Votes: 0
          • Watchers: 6
