  1. HBase
  2. HBASE-18971

Limit the concurrent opened wal writers when splitting

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Recovery, wal
    • Labels:
      None

      Description

      A whole-cluster restart can easily fail under the current architecture if there are many regions on a single region server.

      On a small cluster, although each recovered edits file is very small, the NameNode (NN) reserves a full block for it when the file is opened, so the cluster can easily run out of space.

      On a large cluster, although the max xceiver count is already 4096, it is still easy to exhaust that quota and have DataNodes (DN) reject our requests when there are 1k+ regions on a single RS, because each block is written as 3 copies.
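To make the xceiver pressure concrete, here is a rough back-of-the-envelope sketch. It is not taken from HBase or HDFS code: the class and method names are hypothetical, and it assumes a DataNode co-located with every region server, write pipelines spread evenly across DataNodes, and every splitter holding one open writer per region at the same time. The splitter count of 2 is only an illustrative value.

```java
/** Hypothetical helper, not part of HBase: estimates open write streams per DataNode. */
public class XceiverEstimate {

    /**
     * Assumes each splitter keeps one recovered-edits writer open per region,
     * each writer's pipeline touches `replication` DataNodes, and the load is
     * spread evenly over as many DataNodes as there are region servers.
     */
    static long streamsPerDataNode(int splittersPerServer, int regionsPerServer, int replication) {
        return (long) splittersPerServer * regionsPerServer * replication;
    }

    public static void main(String[] args) {
        int xceiverLimit = 4096;                     // limit quoted in the description
        long needed = streamsPerDataNode(2, 1000, 3); // illustrative inputs, not measured values
        System.out.println("estimated streams per DN: " + needed
            + ", limit: " + xceiverLimit
            + ", exceeds limit: " + (needed > xceiverLimit));
    }
}
```

Under these assumptions the estimate grows linearly with both the number of concurrent splitters and the number of regions per server, which is why lowering splitter concurrency relieves the pressure but also slows recovery.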

      Under the current architecture we need to carefully choose 'hbase.regionserver.wal.max.splitters' and 'hbase.master.executor.serverops.threads' to limit the concurrency of WAL splitting. But this is only a compromise, as it also slows down failure recovery.

      So here we want to limit the number of concurrently open WAL writers when splitting. It could work like a memstore: buffer the WAL entries in memory and, when the buffer is full, flush some entries out.
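The memstore-like idea could be sketched roughly as follows. This is only an illustration of the proposal, not an existing HBase class: `BoundedSplitBuffer` and `RegionFlusher` are hypothetical names, the flusher stands in for "open a recovered-edits writer, append, close", and picking the largest region buffer to flush is just one possible policy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: buffers edits per region and flushes one region at a time. */
public class BoundedSplitBuffer {

    /** Hypothetical stand-in for opening a writer, appending the edits, and closing it. */
    public interface RegionFlusher {
        void flush(String region, List<byte[]> edits);
    }

    private final Map<String, List<byte[]>> buffers = new HashMap<>();
    private final Map<String, Long> sizes = new HashMap<>();
    private final long maxBufferedBytes;
    private final RegionFlusher flusher;
    private long bufferedBytes = 0;

    public BoundedSplitBuffer(long maxBufferedBytes, RegionFlusher flusher) {
        this.maxBufferedBytes = maxBufferedBytes;
        this.flusher = flusher;
    }

    /** Buffers one edit; flushes the largest region buffers while over the limit. */
    public void append(String region, byte[] edit) {
        buffers.computeIfAbsent(region, r -> new ArrayList<>()).add(edit);
        sizes.merge(region, (long) edit.length, Long::sum);
        bufferedBytes += edit.length;
        while (bufferedBytes > maxBufferedBytes) {
            flushRegion(largestRegion());
        }
    }

    /** Flushes everything that is still buffered, one region at a time. */
    public void close() {
        for (String region : new ArrayList<>(buffers.keySet())) {
            flushRegion(region);
        }
    }

    private String largestRegion() {
        String largest = null;
        long largestSize = -1;
        for (Map.Entry<String, Long> e : sizes.entrySet()) {
            if (e.getValue() > largestSize) {
                largest = e.getKey();
                largestSize = e.getValue();
            }
        }
        return largest;
    }

    private void flushRegion(String region) {
        List<byte[]> edits = buffers.remove(region);
        bufferedBytes -= sizes.remove(region);
        // Only one writer is "open" at any moment, for the duration of this call.
        flusher.flush(region, edits);
    }
}
```

In this sketch the number of simultaneously open writers is bounded by how many flushes run at once (here, one), independent of the number of regions. The trade-off is that a region may end up with several small recovered-edits files instead of one.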

      Suggestions are welcome.

              People

              • Assignee: Unassigned
              • Reporter: Duo Zhang (zhangduo)
              • Votes: 0
              • Watchers: 7