HBase / HBASE-7006

[MTTR] Improve Region Server Recovery Time - Distributed Log Replay

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.95.1
    • Component/s: MTTR
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Distributed Log Replay Description:

      After a region server fails, we first assign each failed region to another region server, with its recovering state marked in ZooKeeper. A SplitLogWorker then replays edits from the WALs (write-ahead logs) of the failed region server directly to the region after it is re-opened in the new location. While a region is in the recovering state, it can accept writes but not reads (including Append and Increment), region splits, or merges.

      The feature piggybacks on the existing distributed log splitting framework and replays WAL edits directly to another region server instead of creating recovered.edits files.

      Advantages over the existing log splitting implementation based on recovered.edits files:
      1) Eliminates the steps of writing and reading recovered.edits files. Thousands of recovered.edits files could be created and written concurrently during a region server recovery, and many small random writes could degrade overall system performance.
      2) Allows writes even while a region is in the recovering state. A failed-over region takes only seconds to accept writes again.

      The feature can be enabled by setting hbase.master.distributed.log.replay to true (the default is false).
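
      As a minimal sketch of the setting above, assuming the cluster is configured through the usual hbase-site.xml file and runs a release that includes this feature, the property could be set like this. Only the property name and its values come from the release note; the file placement is an assumption.

```xml
<!-- Sketch only: fragment for hbase-site.xml (assumed location). -->
<configuration>
  <property>
    <!-- Property name taken from the release note; the default is false. -->
    <name>hbase.master.distributed.log.replay</name>
    <value>true</value>
  </property>
</configuration>
```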

      Description

      Just saw an interesting issue where a cluster went down hard and 30 nodes had 1700 WALs to replay. Replay took almost an hour. It looks like it could run faster; much of the time is spent zk'ing and nn'ing (talking to ZooKeeper and the NameNode).

      Putting in 0.96 so it gets a look at least. Can always punt.

        Attachments

        1. 7006-addendum-3.txt
          2 kB
          Ted Yu
        2. hbase-7006-addendum.patch
          1.0 kB
          Jeffrey Zhong
        3. hbase-7006-combined.patch
          234 kB
          Jeffrey Zhong
        4. hbase-7006-combined-v1.patch
          246 kB
          Jeffrey Zhong
        5. hbase-7006-combined-v4.patch
          298 kB
          Jeffrey Zhong
        6. hbase-7006-combined-v5.patch
          307 kB
          Jeffrey Zhong
        7. hbase-7006-combined-v6.patch
          315 kB
          Jeffrey Zhong
        8. hbase-7006-combined-v7.patch
          315 kB
          Jeffrey Zhong
        9. hbase-7006-combined-v8.patch
          311 kB
          Jeffrey Zhong
        10. hbase-7006-combined-v9.patch
          312 kB
          Jeffrey Zhong
        11. LogSplitting Comparison.pdf
          50 kB
          Jeffrey Zhong
        12. ProposaltoimprovelogsplittingprocessregardingtoHBASE-7006-v2.pdf
          130 kB
          Jeffrey Zhong

              People

              • Assignee: Jeffrey Zhong (jeffreyz)
              • Reporter: stack
              • Votes: 0
              • Watchers: 31
