Stabilize tablet assignment during transient failure

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.8.0
    • Component/s: None
    • Labels: None

    Description

      When a tablet server dies, Accumulo attempts to reassign the tablets it was hosting as quickly as possible to maintain availability. If multiple tablet servers die in quick succession, such as from a rolling restart of the Accumulo cluster or a network partition, this behavior can cause a storm of reassignment and rebalancing, placing significant load on the master.

      To avert such load, Accumulo should be capable of maintaining a steady tablet assignment state in the face of transient tablet server loss. Instead of reassigning tablets as quickly as possible, Accumulo should await the return of a temporarily downed tablet server (for some configurable duration) before assigning its tablets to other tablet servers.
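
      The proposed behavior can be illustrated with a minimal sketch. It is not Accumulo's master code: the class SuspendedTabletTracker, its methods, and the gracePeriod parameter are hypothetical names chosen for illustration. The assumption is that the master records when a tablet lost its server and refuses to hand the tablet to a different server until the configurable duration has elapsed, while still allowing the original server to reclaim it immediately.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of delayed reassignment; not taken from Accumulo.
final class SuspendedTabletTracker {
    // Assumed configurable duration to wait for a downed tablet server.
    private final Duration gracePeriod;
    // Tablet id -> time at which its hosting server was lost.
    private final Map<String, Instant> suspendedSince = new HashMap<>();
    // Tablet id -> server that hosted it before the loss.
    private final Map<String, String> lastServer = new HashMap<>();

    SuspendedTabletTracker(Duration gracePeriod) {
        this.gracePeriod = gracePeriod;
    }

    // Record that the server hosting this tablet has disappeared.
    void serverLost(String tablet, String server, Instant now) {
        suspendedSince.put(tablet, now);
        lastServer.put(tablet, server);
    }

    // Decide whether the tablet may be assigned to the candidate server.
    boolean mayAssign(String tablet, String candidate, Instant now) {
        Instant since = suspendedSince.get(tablet);
        if (since == null) {
            return true; // tablet is not suspended
        }
        if (candidate.equals(lastServer.get(tablet))) {
            return true; // original server came back; no need to wait
        }
        // Any other server must wait until the grace period has elapsed.
        return !now.isBefore(since.plus(gracePeriod));
    }

    // Clear the suspension once the tablet has been assigned.
    void assigned(String tablet) {
        suspendedSince.remove(tablet);
        lastServer.remove(tablet);
    }
}
```

      With a grace period of five minutes, a tablet whose server vanished one minute ago would stay unassigned unless that same server returns. During a rolling restart this avoids moving tablets away and then rebalancing them back.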

            People

              ShawnWalker Shawn Walker
              ShawnWalker Shawn Walker
              Votes: 0
              Watchers: 7

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 7h 50m