HBase / HBASE-1364

[performance] Distributed splitting of regionserver commit logs

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.92.0
    • Component/s: Coprocessors
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Adds distributed WAL log splitting in place of single-process master orchestrated splitting. Feature is ON by default (To disable, set hbase.master.distributed.log.splitting=false).
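
      The default-on behavior described in the release note can be illustrated with a small sketch. It is not HBase's code: `java.util.Properties` stands in for the cluster configuration, and the class and method names are hypothetical. Only the property name comes from the release note.

```java
import java.util.Properties;

// Hypothetical stand-in for a configuration lookup; not HBase's actual code.
final class SplitFlagSketch {
    // Property name taken from the release note.
    static final String KEY = "hbase.master.distributed.log.splitting";

    // Returns true when the property is absent or equals "true" (ignoring case).
    // Any other value, including "false", disables distributed splitting.
    static boolean distributedSplittingEnabled(Properties conf) {
        // The default is "true" because the release note says the feature is ON by default.
        return Boolean.parseBoolean(conf.getProperty(KEY, "true").trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Nothing set: the default applies.
        System.out.println(distributedSplittingEnabled(conf));
        // Explicitly disabled, as the release note describes.
        conf.setProperty(KEY, "false");
        System.out.println(distributedSplittingEnabled(conf));
    }
}
```

      In a real deployment the property would be set in the cluster configuration rather than in code; the sketch only shows the "on unless set to false" semantics.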

      Description

      HBASE-1008 made some improvements to our log splitting on regionserver crash, but splitting needs to run even faster.

      (Below is from HBASE-1008)

      In the Bigtable paper, the split is distributed. If we're going to have 1000 logs, we need to distribute or at least multithread the splitting.

      1. As is, regions starting up expect to find only one reconstruction log. They need to be able to pick up a set of edit logs, and it should be fine for those logs to live elsewhere in HDFS, in an output directory written by all split participants, whether the split is multithreaded or a MapReduce-like distributed process. (Let's write our distributed sort first as an MR job so we learn what's involved; the distributed sort should use MR framework pieces as much as possible.) On startup, regions go to this directory and pick up the files written by the split participants, deleting the files and clearing the directory once all have been read in. Accepting multiple logs as input can also make the split process more robust than the current tenuous process, which loses all edits if it doesn't make it to the end without error.
      2. Each column family rereads the reconstruction log to find its edits. That needs fixing: the split can sort the edits by column family so that each store reads only its own edits.
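
      The two points above can be sketched in a few lines. This is a minimal in-memory illustration, not the patch's implementation: the `Edit` type, the method names, and ordering by a sequence id are all assumptions made for the example. Each split participant buckets its edits by region and column family, and a region opening a store merges only that store's buckets from every participant's output.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: these types are hypothetical and are not HBase classes.
final class EditGroupingSketch {

    // Minimal stand-in for one commit-log entry.
    static final class Edit {
        final String region;
        final String family;
        final long seqId; // assumed ordering key, not taken from the issue text
        final String payload;

        Edit(String region, String family, long seqId, String payload) {
            this.region = region;
            this.family = family;
            this.seqId = seqId;
            this.payload = payload;
        }
    }

    // One split participant: bucket edits by region, then by column family,
    // keeping log order within each bucket.
    static Map<String, Map<String, List<Edit>>> split(List<Edit> log) {
        Map<String, Map<String, List<Edit>>> out = new TreeMap<>();
        for (Edit e : log) {
            out.computeIfAbsent(e.region, r -> new TreeMap<>())
               .computeIfAbsent(e.family, f -> new ArrayList<>())
               .add(e);
        }
        return out;
    }

    // Region startup: collect the edits for one store from the outputs of all
    // participants, then order them by sequence id.
    static List<Edit> editsFor(List<Map<String, Map<String, List<Edit>>>> outputs,
                               String region, String family) {
        List<Edit> merged = new ArrayList<>();
        for (Map<String, Map<String, List<Edit>>> output : outputs) {
            Map<String, List<Edit>> byFamily = output.get(region);
            if (byFamily == null) {
                continue; // this participant saw no edits for the region
            }
            List<Edit> edits = byFamily.get(family);
            if (edits != null) {
                merged.addAll(edits);
            }
        }
        merged.sort(Comparator.comparingLong((Edit e) -> e.seqId));
        return merged;
    }

    public static void main(String[] args) {
        List<Edit> logA = new ArrayList<>();
        logA.add(new Edit("r1", "cf1", 3, "c"));
        logA.add(new Edit("r1", "cf2", 1, "a"));
        List<Edit> logB = new ArrayList<>();
        logB.add(new Edit("r1", "cf1", 2, "b"));

        List<Map<String, Map<String, List<Edit>>>> outputs = new ArrayList<>();
        outputs.add(split(logA));
        outputs.add(split(logB));

        // Store cf1 of region r1 reads only its own edits.
        for (Edit e : editsFor(outputs, "r1", "cf1")) {
            System.out.println(e.seqId + " " + e.payload);
        }
    }
}
```

      A real implementation would write each bucket to files in the shared output directory rather than hold it in memory, and would have to tolerate a participant failing partway through.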

      1. HBASE-1364.patch
        88 kB
        Alex Newman
      2. 1364-v5.txt
        162 kB
        stack
      3. org.apache.hadoop.hbase.master.TestDistributedLogSplitting-output.txt
        5.72 MB
        stack

          Activity

          stack created issue -
          Jonathan Gray made changes -
          Field Original Value New Value
          Fix Version/s 0.21.0 [ 12313607 ]
          stack made changes -
          Link This issue is part of HBASE-1816 [ HBASE-1816 ]
          Cosmin Lehene made changes -
          Link This issue relates to HBASE-1994 [ HBASE-1994 ]
          Alex Newman made changes -
          Assignee Alex Newman [ posix4e ]
          Alex Newman made changes -
          Time Spent 4h [ 14400 ]
          Remaining Estimate 0h [ 0 ]
          Alex Newman made changes -
          Remaining Estimate 0h [ 0 ] 12h [ 43200 ]
          Alex Newman made changes -
          Remaining Estimate 12h [ 43200 ]
          stack made changes -
          Fix Version/s 0.22.0 [ 12314223 ]
          Fix Version/s 0.21.0 [ 12313607 ]
          Alex Newman made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Alex Newman made changes -
          Attachment 1364.patch [ 12446299 ]
          Alex Newman made changes -
          Attachment 1364-v2.patch [ 12446385 ]
          Alex Newman made changes -
          Remaining Estimate 0h [ 0 ]
          Time Spent 4h [ 14400 ] 8h [ 28800 ]
          Alex Newman made changes -
          Attachment 1 (3) [ 12450248 ]
          Cosmin Lehene made changes -
          Link This issue is related to HBASE-3323 [ HBASE-3323 ]
          Todd Lipcon made changes -
          Component/s coprocessors [ 12314191 ]
          Alex Newman made changes -
          Attachment 1 (3) [ 12450248 ]
          Alex Newman made changes -
          Attachment 1364-v2.patch [ 12446385 ]
          Alex Newman made changes -
          Attachment 1364.patch [ 12446299 ]
          Alex Newman made changes -
          Attachment HBASE-1364.patch [ 12468771 ]
          Andrew Purtell made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Alex Newman made changes -
          Assignee Alex Newman [ posix4e ]
          Prakash Khemani made changes -
          Assignee Prakash Khemani [ khemani ]
          stack made changes -
          Attachment 1364-v5.txt [ 12476390 ]
          stack made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Release Note Adds distributed WAL log splitting in place of single-process master orchestrated splitting. Feature is ON by default (To disable, set hbase.master.distributed.log.splitting=false).
          Resolution Fixed [ 1 ]
          Lars George made changes -
          Link This issue relates to HBASE-3889 [ HBASE-3889 ]

            People

            • Assignee:
              Prakash Khemani
              Reporter:
              stack
            • Votes:
              1
              Watchers:
              16

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 2h
                Remaining:
                Remaining Estimate - 0h
                Logged:
                Time Spent - 10h
