Ignite / IGNITE-23572

Change rebalance scheduling when data nodes are changed


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • 3.0
    • None

    Description

      Motivation

      IEP-131
       
      In HA mode, consider the following scale-up scenario:

      1. Suppose the partition assignments are [A, B, C], and nodes B and C leave.
      2. The Raft group is forcibly narrowed to [A]. After that, node B returns, so the stable assignments must be extended to [A, B].
      3. From the point of view of DZ.scale up, nothing changed within the DZ.scale up time window, so the data nodes stay the same. This could suggest that no new rebalance needs to be scheduled to extend the stable assignments to [A, B], but one is in fact needed. (Note that the DZ.scale down timer is quite large and has not even elapsed.)

      Proposed enhancements

      1. Data nodes are rewritten on scale up even if they have not changed.
      2. When deciding whether to trigger a rebalance after a data nodes change, we calculate the assignments and apply a node aliveness filter to them. If the actual value under stablePartAssignmentsKey differs from the filtered assignments, we schedule a rebalance.
      3. In the example above, the calculated assignments are [A, B, C]; we filter them to [A, B] and schedule a new rebalance to extend stablePartAssignmentsKey.
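
      The decision described in the steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (RebalanceDecisionSketch, filterAlive, shouldScheduleRebalance) are hypothetical and are not the Ignite API, and nodes are modelled as plain strings rather than real assignment objects.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the proposed check; all names are illustrative, not Ignite code. */
final class RebalanceDecisionSketch {
    /** Keeps only the calculated assignments whose nodes are currently alive. */
    public static Set<String> filterAlive(List<String> calculated, Set<String> aliveNodes) {
        Set<String> filtered = new LinkedHashSet<>();
        for (String node : calculated) {
            if (aliveNodes.contains(node)) {
                filtered.add(node);
            }
        }
        return filtered;
    }

    /** A rebalance is needed when the current stable assignments differ from the filtered ones. */
    public static boolean shouldScheduleRebalance(
            Set<String> stable, List<String> calculated, Set<String> aliveNodes) {
        return !stable.equals(filterAlive(calculated, aliveNodes));
    }

    public static void main(String[] args) {
        // Scenario from the description: B and C left, the group was narrowed to [A], then B returned.
        List<String> calculated = List.of("A", "B", "C");
        Set<String> alive = Set.of("A", "B");
        Set<String> stable = Set.of("A");

        System.out.println(filterAlive(calculated, alive));                     // prints [A, B]
        System.out.println(shouldScheduleRebalance(stable, calculated, alive)); // prints true
    }
}
```

      In this sketch, once the stable assignments already equal the filtered set [A, B], the same check returns false, so rewriting unchanged data nodes does not by itself cause repeated rebalances.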

      Definition of done

      • The proposed approach must be implemented so that nodes that return after a majority loss are added back to the stable assignments.

            People

              maliev Mirza Aliev
              maliev Mirza Aliev
              Kirill Gusakov Kirill Gusakov

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m