HBASE-14951: Make hbase.regionserver.maxlogs obsolete


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0, 1.3.0, 2.0.0
    • Component/s: Performance, wal
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      WAL roll events across a cluster can be highly correlated, and so can the memstore flushes they cause and the minor compactions those flushes trigger, which can be promoted to major ones. These events are highly correlated in time when the write load is balanced across the regions of a table. The default maximum number of WAL files (hbase.regionserver.maxlogs), which controls WAL roll events, is 32, which is too small for many modern deployments.
      This value is now calculated dynamically (unless it is defined by the user), using the following formula:

      maxLogs = Math.max(32, HBASE_HEAP_SIZE * memstoreRatio * 2 / LogRollSize), where

      memstoreRatio is hbase.regionserver.global.memstore.size
      LogRollSize is the maximum WAL file size (default 0.95 * HDFS block size)

      We want to avoid, or at least minimize, cases where a RegionServer has to flush memstores prematurely only because it has reached the artificial hbase.regionserver.maxlogs limit. This is why the formula includes the 2x multiplier, which gives a maximum WAL capacity of 2 x the RegionServer memstore size.
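      The formula above can be sketched in plain Java as follows. This is a minimal illustration, not the code from the patch: the class name, method name and the 128 MB HDFS block size are assumptions made for the example, and the result depends on the block size and roll size actually in effect, so it need not match any tabulated default.

```java
public class MaxLogsSketch {
    // Lower bound from the release note: the computed value never drops below 32.
    static final int MIN_MAX_LOGS = 32;

    // Hypothetical helper: maxLogs = max(32, heap * memstoreRatio * 2 / logRollSize).
    static int maxLogs(long heapBytes, double memstoreRatio, long logRollSizeBytes) {
        long computed = (long) (heapBytes * memstoreRatio * 2 / logRollSizeBytes);
        return (int) Math.max(MIN_MAX_LOGS, computed);
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;           // assumed HDFS block size
        long logRollSize = (long) (0.95 * blockSize);  // default roll size per the release note
        double memstoreRatio = 0.4;                    // hbase.regionserver.global.memstore.size

        long oneGb = 1024L * 1024 * 1024;
        for (long heapGb : new long[] {1, 2, 10, 20, 32}) {
            System.out.println(heapGb + "G -> " + maxLogs(heapGb * oneGb, memstoreRatio, logRollSize));
        }
    }
}
```

      For small heaps the lower bound of 32 wins; for larger heaps the value grows linearly with heap size.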

      Runaway WAL files.

      With the default log rolling period (1h), up to 2 x memstore size of data can accumulate in the WAL. For a 32G heap with all other settings at their defaults, this is ~26GB of data. Under heavy write load, the number of WAL files can increase dramatically. The RegionServer LogRoller archives old WALs periodically. To control runaway WALs, the user has three options: override the default hbase.regionserver.maxlogs, override (decrease) the default hbase.regionserver.logroll.period, or both.
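      The ~26GB figure follows directly from the 2 x memstore cap, assuming the 40% memstore ratio used in this note:

```latex
\[
  32\,\mathrm{GB} \times 0.4 \times 2 = 25.6\,\mathrm{GB} \approx 26\,\mathrm{GB}
\]
```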

      For systems with a bursty write load, hbase.regionserver.logroll.period can be decreased. In that case the maximum number of WAL files is determined by the total size of the memstore (unflushed data), not by hbase.regionserver.maxlogs. For the majority of applications, however, the defaults will cause no issues: data is flushed from the memstore periodically, the LogRoller archives old WAL files, and the system never reaches the new default for hbase.regionserver.maxlogs unless it is under extreme load for a prolonged period of time. Even then, decreasing hbase.regionserver.logroll.period keeps runaway WAL files under control.

      The following table gives the new default maximum number of log files for several RegionServer heap sizes:

      heap    memstore perc    maxLogs
      1G      40%              32
      2G      40%              32
      10G     40%              80
      20G     40%              160
      32G     40%              256



        

    Description

      The maximum number of log files was discussed in HBASE-14388. It was agreed that this number should be calculated in code, while still honoring the user's setting.

      The maximum number of log files is now calculated as follows:
      maxLogs = HEAP_SIZE * memstoreRatio * 2 / LogRollSize
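
      The "calculate in code, but honor the user's setting" behavior could look roughly like the sketch below. It is only an illustration under stated assumptions: a plain Map stands in for the real configuration object, the class and method names are hypothetical, and the computation follows the formula exactly as written in this description, without any lower bound.

```java
import java.util.HashMap;
import java.util.Map;

public class MaxLogsOverrideSketch {
    static final String MAX_LOGS_KEY = "hbase.regionserver.maxlogs";

    // Hypothetical helper: an explicit user setting wins; otherwise compute from heap size.
    static int resolveMaxLogs(Map<String, String> conf, long heapBytes,
                              double memstoreRatio, long logRollSizeBytes) {
        String userValue = conf.get(MAX_LOGS_KEY);
        if (userValue != null) {
            // The user defined the property, so it is honored as-is.
            return Integer.parseInt(userValue.trim());
        }
        // maxLogs = HEAP_SIZE * memstoreRatio * 2 / LogRollSize
        return (int) (heapBytes * memstoreRatio * 2 / logRollSizeBytes);
    }

    public static void main(String[] args) {
        long heap = 10L * 1024 * 1024 * 1024;   // 10G heap, chosen for illustration
        long logRollSize = 100L * 1024 * 1024;  // assumed roll size of 100 MB

        Map<String, String> defaults = new HashMap<>();
        System.out.println("computed: " + resolveMaxLogs(defaults, heap, 0.4, logRollSize));

        Map<String, String> overridden = new HashMap<>();
        overridden.put(MAX_LOGS_KEY, "64");
        System.out.println("user-defined: " + resolveMaxLogs(overridden, heap, 0.4, logRollSize));
    }
}
```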

      Attachments

        1. HBASE-14951-v2.patch
          3 kB
          Vladimir Rodionov
        2. HBASE-14951-v1.patch
          3 kB
          Vladimir Rodionov

          People

            Assignee: Vladimir Rodionov (vrodionov)
            Reporter: Vladimir Rodionov (vrodionov)
            Votes: 0
            Watchers: 11
