SOLR-6235

SyncSliceTest fails on jenkins with no live servers available error

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10, 6.0
    • Component/s: SolrCloud, Tests
    • Labels: None

      Description

      1 tests failed.
      FAILED:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch
      
      Error Message:
      No live SolrServers available to handle this request
      
      Stack Trace:
      org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request
              at __randomizedtesting.SeedInfo.seed([685C57B3F25C854B:E9BAD9AB8503E577]:0)
              at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:317)
              at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:659)
              at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
              at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
              at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1149)
              at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1118)
              at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:236)
              at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:865)
      
      Attachments

      1. SOLR-6235.patch (20 kB) - Shalin Shekhar Mangar
      2. SOLR-6235.patch (15 kB) - Shalin Shekhar Mangar

        Activity

        Shalin Shekhar Mangar added a comment -

        Wow, crazy crazy bug! I finally found the root cause.

        The problem is with the leader-initiated recovery code, which uses the core name to set/get status. This works fine as long as the core names of all nodes are different, but if they all happen to be "collection1" then we have this problem.

        In this particular failure that I investigated:
        http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1667/consoleText

        Here's the sequence of events:

        1. port:51916 - core_node1 was initially the leader, docs were indexed and then it was killed
        2. port:51919 - core_node2 became the leader, peer sync happened, shards were checked for consistency
        3. port:51916 - core_node1 was brought back online, it recovered from the leader, consistency check passed
        4. port:51923 core_node3 and port:51932 core_node4 were added to the skipped servers
        5. 300 docs were indexed (to go beyond the peer sync limit)
        6. port:51919 - core_node2 (the leader) was killed

        Here is where things get interesting:

        1. port:51923 core_node3 tries to become the leader and initiates sync with other replicas
        2. In the meanwhile, a commit request from checkShardConsistency makes its way to port:51923 core_node3 (even though it's not clear whether it has indeed become the leader)
        3. port:51923 core_node3 calls commit on all shards including port:51919 core_node2 which should've been down but perhaps the local state at 51923 is not updated yet?
        4. port:51923 core_node3 puts replica collection1 on 127.0.0.1:51919_ into leader-initiated recovery
        5. port:51923 - core_node3 fails to peersync (because number of changes were too large) and rejoins election
        6. After this point each shard that tries to become the leader fails because it thinks that it has been put under leader initiated recovery and goes into actual "recovery"
        7. Of course, since there is no leader, recovery cannot happen and each shard eventually goes to "recovery_failed" state
        8. Eventually the test gives up and throws an error saying that there are no live servers available to handle the request.

        Shalin Shekhar Mangar added a comment -

        We should use coreNodeName instead of the core name for setting leader-initiated recovery. I'll put up a patch.
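
        As a sketch of why the key matters (the znode path layout and helper names below are illustrative assumptions, not the actual Solr LIR code): keying LIR state by core name means all four replicas named "collection1" collapse onto one entry, while keying by coreNodeName keeps one entry per replica.

        // Illustrative sketch only: the path layout and helpers are assumptions,
        // not the real Solr LIR implementation.
        class LirKeySketch {

          // Problematic scheme: keyed by core name. Every replica in this test is
          // named "collection1", so one znode ends up describing all of them.
          static String lirPathByCoreName(String collection, String shardId, String coreName) {
            return "/collections/" + collection + "/leader_initiated_recovery/"
                + shardId + "/" + coreName;        // .../shard1/collection1 for every replica
          }

          // Fixed scheme: keyed by coreNodeName, which is unique per replica
          // (core_node1, core_node2, ...), so marking one replica down cannot
          // clobber the state of the others.
          static String lirPathByCoreNodeName(String collection, String shardId, String coreNodeName) {
            return "/collections/" + collection + "/leader_initiated_recovery/"
                + shardId + "/" + coreNodeName;    // .../shard1/core_node2
          }
        }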

        Timothy Potter added a comment -

        Hi Shalin,

        Great find! Using coreNode is a good idea, but why would all the cores have the name "collection1"? Is that valid or an indication of a problem upstream from this code?

        Also, you raise a good point about all replicas thinking they are in leader-initiated recovery (LIR). In ElectionContext, when running shouldIBeLeader, the node will choose to not be the leader if it is in LIR. However, this could lead to no leader. My thinking there is the state is bad enough that we would need manual intervention to clear one of the LIR znodes to allow a replica to get past this point. But maybe we can do better here?

        Mark Miller added a comment -

        but why would all the cores have the name "collection1"?

        It's probably historical. When we were first trying to make it easier to use SolrCloud and no Collections API existed, you could start up cores and have them be part of the same collection by giving them the same core name. This helped in trying to make a demo startup that required minimal extra work. So, most of the original tests probably just followed suit.

        As we get rid of predefined cores in SolrCloud and move to the collections API, that stuff will go away.

        Shalin Shekhar Mangar added a comment -

        but why would all the cores have the name "collection1"? Is that valid or an indication of a problem upstream from this code?

        The reasons are what Mark said, but it is a supported use-case and pretty common. Imagine stock Solr running on 4 nodes; each node would have the same collection1 core name.

        Also, you raise a good point about all replicas thinking they are in leader-initiated recovery (LIR). In ElectionContext, when running shouldIBeLeader, the node will choose to not be the leader if it is in LIR. However, this could lead to no leader. My thinking there is the state is bad enough that we would need manual intervention to clear one of the LIR znodes to allow a replica to get past this point. But maybe we can do better here?

        Good question. With careful use of minRf, the user can retry operations and maintain consistency even if we arbitrarily elect a leader in this case. But most people won't use minRf and don't care about consistency as much as availability. For them there should be a way to get out of this mess easily. We can have a collection property (boolean + timeout value) to force elect a leader even if all shards were in LIR. What do you think?

        Mark Miller added a comment -

        you could start up cores and have them be part of the same collection by giving them the same core name.

        If you don't specify a collection name, it also defaults to the core name - hence collection1 for the core name.

        Shalin Shekhar Mangar added a comment -

        We can have a collection property (boolean + timeout value) to force elect a leader even if all shards were in LIR

        In case it wasn't clear, I think it should be true by default.

        Mark Miller added a comment -

        Great work tracking this down!

        Indeed, it's a current limitation that you can have all nodes in a shard thinking they cannot be leader, even when all of them are available. This is not required by the distributed model we have at all; it's just a consequence of being overly restrictive in the initial implementation - if all known replicas are participating, you should be able to get a leader. So I'm not sure if this case should be optional. But iff not all known replicas are participating and you still want to force a leader, that should be optional - I think it should default to false though. I think the system should default to reasonable data safety in these cases.

        How best to solve this, I'm not quite sure, but happy to look at a patch. How do you plan on monitoring and taking action? Via the Overseer? It seems tricky to do it from the replicas.

        Mark Miller added a comment -

        On another note, it almost seems we can do better than ask for a recovery on a failed commit.

        Timothy Potter added a comment -

        We have a similar issue where a replica attempting to be the leader needs to wait a while to see other replicas before declaring itself the leader, see ElectionContext around line 200:

        int leaderVoteWait = cc.getZkController().getLeaderVoteWait();
        if (!weAreReplacement) {
          waitForReplicasToComeUp(weAreReplacement, leaderVoteWait);
        }

        So one quick idea might be to have the code that checks if it's in LIR see if all replicas are in LIR and if so, wait out the leaderVoteWait period and check again. If all are still in LIR, then move on with becoming the leader (in the spirit of availability).
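
        A rough sketch of that idea (the helper names and polling loop here are hypothetical stand-ins, not existing Solr methods): if every replica of the shard is in LIR, wait out the leaderVoteWait period, re-check, and only then proceed in the spirit of availability.

        import java.util.concurrent.TimeUnit;
        import java.util.function.BooleanSupplier;

        // Hypothetical sketch of the proposed check, not existing Solr code.
        class LirElectionSketch {

          static boolean shouldBecomeLeaderDespiteLir(int leaderVoteWaitMs,
                                                      BooleanSupplier allReplicasInLir)
              throws InterruptedException {
            if (!allReplicasInLir.getAsBoolean()) {
              return false;          // normal rule: a replica in LIR stays out of the election
            }
            // Every replica thinks it is in LIR, so no one will ever step up. Wait out
            // the leaderVoteWait period in case some replica clears its LIR state.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(leaderVoteWaitMs);
            while (System.nanoTime() < deadline) {
              if (!allReplicasInLir.getAsBoolean()) {
                return false;        // someone recovered; let the normal election proceed
              }
              Thread.sleep(500);
            }
            // Still all in LIR after the wait: favor availability and become the leader anyway.
            return true;
          }
        }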

        Shalin Shekhar Mangar added a comment -

        But iff not all known replicas are participating and you still want to force a leader, that should be optional - I think it should default to false though. I think the system should default to reasonable data safety in these cases.

        That's the same case as the leaderVoteWait situation and we do go ahead after that amount of time even if all replicas aren't participating. Therefore, I think that we should handle it the same way. But to help people who care about consistency over availability, there should be a configurable property which bans this auto-promotion completely.

        In any case, we should switch to coreNodeName instead of coreName and open an issue to improve the leader election part.

        Mark Miller added a comment -

        That's the same case as the leaderVoteWait situation and we do go ahead after that amount of time even if all replicas aren't participating.

        No, we don't - only if a new leader is elected does he try and do the wait. There are situations where that doesn't happen. This is like the issue where the leader loses the connection to zk after sending docs to replicas and then they fail and the leader asks them to recover and then you have no leader for the shard. We did a kind of workaround for that specific issue, but I've seen it happen with other errors as well. You can certainly lose a whole shard when everyone is participating in the election - no one thinks they can be the leader because they all published recovery last.

        There are lots and lots of improvements to be made to recovery still - it's a baby.

        Mark Miller added a comment -

        only if a new leader is elected does he try and do the wait.

        Sorry - that line is confusing - the issue is that waiting for everyone doesn't matter. They might all be participating anyway, the wait is irrelevant. The issue comes after that code, when no one will become the leader.

        Timothy Potter added a comment -

        I opened SOLR-6236 for the leader election issue we're discussing, but the title might not be quite accurate.

        Mark Miller added a comment -

        there should be a configurable property which bans this auto-promotion completely.

        That's why I'm drawing the distinction between everyone participating and not everyone participating.

        Sometimes you can lose a shard and it's because the leader->zk connection blinks. In this case, if you have all the replicas in a shard, it's safe to force an election anyway.

        Sometimes you lose a shard and you don't have all the replicas - in that case, it should be optional to force an election and default to false.

        Jessica Cheng Mallet added a comment -

        Why is core_node3 able to put core_node2 (the old leader) into LIR when core_node3 has not been elected a leader yet? (Actually, why is core_node3 processing any "update" at all when it's not a leader?)

        That's really more of a problem than the fact that the one LIR entry core_node3 wrote for "collection1" set everyone else in LIR, because what if only core_node2 is really up-to-date and it just went through a blip and came back? In that case the only right choice for leader is core_node2.

        Jessica Cheng Mallet added a comment -

        Obviously, this is not to say that "the one LIR entry core_node3 wrote for 'collection1' set everyone else in LIR" is not a problem.

        Shalin Shekhar Mangar added a comment -

        Why is core_node3 able to put core_node2 (the old leader) into LIR when core_node3 has not been elected a leader yet? (Actually, why is core_node3 processing any "update" at all when it's not a leader?)

        Yeah, the discussion went in another direction but this is something I found odd and I'm gonna find out why that happened.

        Mark Miller added a comment -

        Why is core_node3 able to put core_node2 (the old leader) into LIR when core_node3 has not been elected a leader yet? (Actually, why is core_node3 processing any "update" at all when it's not a leader?)

        I have not followed the sequences that closely, but I would guess that it's because of how we implemented distrib commit.

        Mark Miller added a comment -

        That is part of my motivation for saying:

        On another note, it almost seems we can do better than ask for a recovery on a failed commit.

        The current method was kind of just a least-effort impl, so there might be some other things we can do as well. If I remember right, whoever gets the commit just broadcasts it out to everyone over HTTP, including itself.
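
        As a toy illustration of that shape (not the actual distributed-update code; sendCommitOverHttp is an assumed placeholder, not a real Solr/SolrJ call): whichever node receives the explicit commit forwards it to every replica URL it knows about, itself included, and a failure on one of those forwards is what currently leads to asking that replica to recover.

        import java.util.List;

        // Toy sketch of the broadcast-commit shape described above; not the actual
        // Solr implementation, and sendCommitOverHttp is a placeholder.
        class CommitBroadcastSketch {

          static void broadcastCommit(List<String> allReplicaUrls) {
            for (String replicaUrl : allReplicaUrls) {
              try {
                sendCommitOverHttp(replicaUrl);      // e.g. an update request with commit=true
              } catch (Exception e) {
                // The step under discussion: today a failure here can end with the
                // failing replica being put into leader-initiated recovery, even though
                // the sender may not be (or may never become) the leader.
                System.err.println("commit to " + replicaUrl + " failed: " + e);
              }
            }
          }

          static void sendCommitOverHttp(String replicaUrl) throws Exception {
            // placeholder for the HTTP request in this sketch
          }
        }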

        Jessica Cheng Mallet added a comment -

        I would guess that it's because of how we implemented distrib commit.

        As in, anyone (non-leader) can distribute commits to everyone else? Is that why you commented earlier:

        On another note, it almost seems we can do better than ask for a recovery on a failed commit.

        If so, that totally makes sense.

        Mark Miller added a comment -

        Right. I think at a minimum, doing nothing is probably preferable in most cases. Perhaps a retry or two?

        Or perhaps we should look at sending commits to the leaders to originate them. We would still want to commit everywhere in parallel though, and I'm not sure we can do anything that is that much better.

        The current situation doesn't seem good though.

        Shalin Shekhar Mangar added a comment -

        Patch which uses coreNodeName instead of coreName for leader initiated recoveries.

        Shalin Shekhar Mangar added a comment -

        I fixed another mistake that I found while fixing this problem. The call to ensureReplicaInLeaderInitiatedRecovery in ElectionContext.startLeaderInitiatedRecoveryOnReplicas had the core name instead of replicaUrl. On a related note, the HttpPartitionTest can be improved to not rely on Thread.sleep so much; I'll open a separate issue on that.

        Shalin Shekhar Mangar added a comment -

        I improved logging by adding coreName as well as coreNodeName everywhere in LIR code. This is ready.

        ASF subversion and git services added a comment -

        Commit 1610028 from shalin@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1610028 ]

        SOLR-6235: Leader initiated recovery should use coreNodeName instead of coreName to avoid marking all replicas having common core name as down

        ASF subversion and git services added a comment -

        Commit 1610029 from shalin@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1610029 ]

        SOLR-6235: Leader initiated recovery should use coreNodeName instead of coreName to avoid marking all replicas having common core name as down

        ASF subversion and git services added a comment -

        Commit 1610351 from shalin@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1610351 ]

        SOLR-6235: Improved logging in RecoveryStrategy and fixed a mistake in ElectionContext logging that I had made earlier.

        ASF subversion and git services added a comment -

        Commit 1610352 from shalin@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1610352 ]

        SOLR-6235: Improved logging in RecoveryStrategy and fixed a mistake in ElectionContext logging that I had made earlier.

        ASF subversion and git services added a comment -

        Commit 1610361 from shalin@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1610361 ]

        SOLR-6235: Fix comparison to use coreNodeName on both sides in ElectionContext.startLeaderInitiatedRecoveryOnReplicas

        ASF subversion and git services added a comment -

        Commit 1610362 from shalin@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1610362 ]

        SOLR-6235: Fix comparison to use coreNodeName on both sides in ElectionContext.startLeaderInitiatedRecoveryOnReplicas


          People

          • Assignee: Shalin Shekhar Mangar
          • Reporter: Shalin Shekhar Mangar
          • Votes: 1
          • Watchers: 5
