Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-3807

Currently during recovery we pause for a number of seconds after waiting for the leader to see a recovering state so that any previous updates will have finished before our commit on the leader - we don't need this wait for peersync.

Details

    • Improvement
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • None
    • 4.0, 6.0
    • SolrCloud
    • None

    Attachments

      Activity

        assinging to mark to assess if this is a blocker for 4.0 or should be punted

        hossman Chris M. Hostetter added a comment - assinging to mark to assess if this is a blocker for 4.0 or should be punted
        markrmiller@gmail.com Mark Miller added a comment -

        Yeah, this is part of a commit of a bunch of issues I'm going to do shortly.

        markrmiller@gmail.com Mark Miller added a comment - Yeah, this is part of a commit of a bunch of issues I'm going to do shortly.
        commit-tag-bot Commit Tag Bot added a comment -

        [branch_4x commit] Mark Robert Miller
        http://svn.apache.org/viewvc?view=revision&revision=1384937

        SOLR-3833: When a election is started because a leader went down, the new leader candidate should decline if the last state they published was not active.

        SOLR-3836: When doing peer sync, we should only count sync attempts that cannot reach the given host as success when the candidate leader is syncing with the replicas - not when replicas are syncing to the leader.

        SOLR-3835: In our leader election algorithm, if on connection loss we found we did not create our election node, we should retry, not throw an exception.

        SOLR-3834: A new leader on cluster startup should also run the leader sync process in case there was a bad cluster shutdown.

        SOLR-3772: On cluster startup, we should wait until we see all registered replicas before running the leader process - or if they all do not come up, N amount of time.

        SOLR-3756: If we are elected the leader of a shard, but we fail to publish this for any reason, we should clean up and re trigger a leader election.

        SOLR-3812: ConnectionLoss during recovery can cause lost updates, leading to shard inconsistency.

        SOLR-3813: When a new leader syncs, we need to ask all shards to sync back, not just those that are active.

        SOLR-3807: Currently during recovery we pause for a number of seconds after waiting for the leader to see a recovering state so that any previous updates will have finished before our commit on the leader - we don't need this wait for peersync.

        SOLR-3837: When a leader is elected and asks replicas to sync back to him and that fails, we should ask those nodes to recovery asynchronously rather than synchronously.

        commit-tag-bot Commit Tag Bot added a comment - [branch_4x commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revision&revision=1384937 SOLR-3833 : When a election is started because a leader went down, the new leader candidate should decline if the last state they published was not active. SOLR-3836 : When doing peer sync, we should only count sync attempts that cannot reach the given host as success when the candidate leader is syncing with the replicas - not when replicas are syncing to the leader. SOLR-3835 : In our leader election algorithm, if on connection loss we found we did not create our election node, we should retry, not throw an exception. SOLR-3834 : A new leader on cluster startup should also run the leader sync process in case there was a bad cluster shutdown. SOLR-3772 : On cluster startup, we should wait until we see all registered replicas before running the leader process - or if they all do not come up, N amount of time. SOLR-3756 : If we are elected the leader of a shard, but we fail to publish this for any reason, we should clean up and re trigger a leader election. SOLR-3812 : ConnectionLoss during recovery can cause lost updates, leading to shard inconsistency. SOLR-3813 : When a new leader syncs, we need to ask all shards to sync back, not just those that are active. SOLR-3807 : Currently during recovery we pause for a number of seconds after waiting for the leader to see a recovering state so that any previous updates will have finished before our commit on the leader - we don't need this wait for peersync. SOLR-3837 : When a leader is elected and asks replicas to sync back to him and that fails, we should ask those nodes to recovery asynchronously rather than synchronously.
        uschindler Uwe Schindler added a comment -

        Closed after release.

        uschindler Uwe Schindler added a comment - Closed after release.

        People

          markrmiller@gmail.com Mark Miller
          markrmiller@gmail.com Mark Miller
          Votes:
          0 Vote for this issue
          Watchers:
          1 Start watching this issue

          Dates

            Created:
            Updated:
            Resolved: