  1. HBase
  2. HBASE-23598

Too many small WAL files


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.3.6, 2.2.2
    • Fix Version/s: None
    • Component/s: wal
    • Labels:
      None
    • Environment:

      hbase version: cdh5-1.2.0_5.14.4

      hbase.wal.provider: multiwal

      hbase.wal.regiongrouping.numgroups: 4

      The attached wals file shows 100+ WAL files in wal-3, and some of them are very small

      Description

      I found 100,000+ WAL files in my 400-node HBase cluster. Too many WAL files make recovery very slow after the whole cluster crashes, because the log-splitting step then issues too many ZK requests. By default, a WAL file starts to roll when it reaches HDFS block size (256 MB in my case) * 0.95. However, I found many small files (0-100 MB) in the WAL directory. Looking at the code, I found that with multiwal configured (4 WALs per RS in my case), as soon as a single WAL file reaches HDFS block size * 0.95, all WAL files roll, which produces a lot of small WAL files.
      I tried modifying the code to solve the problem by making each WAL roll independently. Although the change is very small, I am not sure whether it will cause other problems; it is currently being tested ...
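
      To illustrate the effect described above, here is a minimal toy simulation. It is not HBase code and not the attached patch. The class WalRollSketch, its methods, and the per-group write rates are all hypothetical, chosen only to show how a shared roll trigger can close under-filled files while independent rolling does not.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: names and write rates are made up for illustration.
class WalRollSketch {

    // Roll threshold as described in the report: block size * multiplier.
    static long rollThresholdBytes(long blockSizeBytes, double multiplier) {
        return (long) (blockSizeBytes * multiplier);
    }

    // Appends appendPerStep[g] bytes to WAL group g on every step and
    // returns the sizes of all files that were closed (rolled).
    // rollAllTogether = true  : one full WAL makes every group roll.
    // rollAllTogether = false : each group rolls only when it is full itself.
    static List<Long> simulate(long[] appendPerStep, int steps,
                               long threshold, boolean rollAllTogether) {
        long[] current = new long[appendPerStep.length];
        List<Long> closed = new ArrayList<>();
        for (int s = 0; s < steps; s++) {
            boolean anyFull = false;
            for (int g = 0; g < current.length; g++) {
                current[g] += appendPerStep[g];
                if (current[g] >= threshold) {
                    anyFull = true;
                }
            }
            for (int g = 0; g < current.length; g++) {
                boolean roll = rollAllTogether ? anyFull : current[g] >= threshold;
                if (roll && current[g] > 0) {
                    closed.add(current[g]);
                    current[g] = 0;
                }
            }
        }
        return closed;
    }

    // Counts closed files strictly smaller than limit bytes.
    static long countSmallerThan(List<Long> sizes, long limit) {
        return sizes.stream().filter(size -> size < limit).count();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long threshold = rollThresholdBytes(256 * mb, 0.95);
        // Assumed uneven load across 4 WAL groups (bytes appended per step).
        long[] load = {40 * mb, 10 * mb, 5 * mb, 1 * mb};

        List<Long> together = simulate(load, 70, threshold, true);
        List<Long> independent = simulate(load, 70, threshold, false);

        System.out.println("roll together:      " + together.size()
                + " files, " + countSmallerThan(together, 100 * mb) + " under 100 MB");
        System.out.println("roll independently: " + independent.size()
                + " files, " + countSmallerThan(independent, 100 * mb) + " under 100 MB");
    }
}
```

      With these assumed rates, 70 steps close 40 files (30 of them under 100 MB) when all groups roll together, versus 13 files (none under 100 MB) when each group rolls on its own. The real ratio depends on how writes are spread across the WAL groups.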

        Attachments

        1. wals
          15 kB
          zhuobin zheng
        2. HBASE-23598.patch
          2 kB
          zhuobin zheng

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              zhengzhuobinzzb zhuobin zheng
            • Votes:
              0
              Watchers:
              1

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated: 168h
                Remaining: 168h
                Logged: Not Specified