SOLR-3807

Currently during recovery we pause for a number of seconds after waiting for the leader to see a recovering state so that any previous updates will have finished before our commit on the leader - we don't need this wait for peersync.
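
The change described above can be sketched roughly as follows. This is a minimal illustration only, not Solr's actual recovery code: the `RecoveryOps` interface, its method names, the `recover` helper, and the 2000 ms pause value are all hypothetical stand-ins. It assumes recovery first attempts a peer sync and only falls back to the commit-and-replicate path when that fails, so the pause is taken only on the fallback path.

```java
import java.util.ArrayList;
import java.util.List;

public class RecoverySketch {

    /** Hypothetical hooks standing in for the real recovery steps. */
    interface RecoveryOps {
        void waitForLeaderToSeeRecovering();
        boolean tryPeerSync();             // true if peer sync brought us up to date
        void pause(long millis);           // gives in-flight updates on the leader time to finish
        void commitOnLeaderAndReplicate(); // fallback path that needs the pause
    }

    /** Runs one recovery attempt and returns which path completed it. */
    static String recover(RecoveryOps ops, long pauseMillis) {
        ops.waitForLeaderToSeeRecovering();
        if (ops.tryPeerSync()) {
            return "peersync";             // no pause needed on this path
        }
        ops.pause(pauseMillis);            // pause only before the commit on the leader
        ops.commitOnLeaderAndReplicate();
        return "replication";
    }

    /** Test double that records the order of calls instead of doing real work. */
    static class RecordingOps implements RecoveryOps {
        final List<String> calls = new ArrayList<>();
        private final boolean peerSyncSucceeds;

        RecordingOps(boolean peerSyncSucceeds) {
            this.peerSyncSucceeds = peerSyncSucceeds;
        }

        @Override public void waitForLeaderToSeeRecovering() { calls.add("waitForLeader"); }
        @Override public boolean tryPeerSync() { calls.add("peerSync"); return peerSyncSucceeds; }
        @Override public void pause(long millis) { calls.add("pause"); }
        @Override public void commitOnLeaderAndReplicate() { calls.add("replicate"); }
    }

    public static void main(String[] args) {
        // 2000 ms is an arbitrary illustrative value, not taken from the issue.
        RecordingOps synced = new RecordingOps(true);
        System.out.println(recover(synced, 2000) + " " + synced.calls);

        RecordingOps fallback = new RecordingOps(false);
        System.out.println(recover(fallback, 2000) + " " + fallback.calls);
    }
}
```

With the recording stub, the successful peer sync case never calls `pause`, while the fallback case pauses before replicating.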

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Activity

      Hoss Man added a comment -

      Assigning to Mark to assess whether this is a blocker for 4.0 or should be punted.

      Mark Miller added a comment -

      Yeah, this is part of a commit of a bunch of issues I'm going to do shortly.

      Commit Tag Bot added a comment -

      [branch_4x commit] Mark Robert Miller
      http://svn.apache.org/viewvc?view=revision&revision=1384937

      SOLR-3833: When an election is started because a leader went down, the new leader candidate should decline if the last state they published was not active.

      SOLR-3836: When doing peer sync, we should only count sync attempts that cannot reach the given host as success when the candidate leader is syncing with the replicas - not when replicas are syncing to the leader.

      SOLR-3835: In our leader election algorithm, if on connection loss we found we did not create our election node, we should retry, not throw an exception.

      SOLR-3834: A new leader on cluster startup should also run the leader sync process in case there was a bad cluster shutdown.

      SOLR-3772: On cluster startup, we should wait until we see all registered replicas before running the leader process - or if they all do not come up, N amount of time.

      SOLR-3756: If we are elected the leader of a shard, but we fail to publish this for any reason, we should clean up and re-trigger a leader election.

      SOLR-3812: ConnectionLoss during recovery can cause lost updates, leading to shard inconsistency.

      SOLR-3813: When a new leader syncs, we need to ask all shards to sync back, not just those that are active.

      SOLR-3807: Currently during recovery we pause for a number of seconds after waiting for the leader to see a recovering state so that any previous updates will have finished before our commit on the leader - we don't need this wait for peersync.

      SOLR-3837: When a leader is elected and asks replicas to sync back to him and that fails, we should ask those nodes to recover asynchronously rather than synchronously.

      Uwe Schindler added a comment -

      Closed after release.


        People

        • Assignee:
          Mark Miller
          Reporter:
          Mark Miller
        • Votes:
          0
          Watchers:
          1
