Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: regionserver
    • Labels: None

      Description

      Add the ability to throttle major compaction.
      For those use cases when a stop-the-world approach is not practical, it is useful to be able to throttle the impact that major compaction has on the cluster.

          Activity

          stack added a comment -

          As a workaround, we could run a script external to HBase that would first elicit the set of regions in a cluster and then, region by region, set a major compaction in motion, waiting for it to complete before moving to the next region. (The script could check HDFS and count the store files in the region to determine when that region's major compaction has completed.) The script could be run from cron or, as in the legend about painting the Golden Gate Bridge, once we'd gotten to the end of the bridge/table we would loop around and start again on the first region, in perpetuum.
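
          The loop described in this comment might be sketched roughly as follows. This is only an illustration, not an existing HBase tool: `list_regions`, `request_major_compaction`, and `count_store_files` are hypothetical callables that the caller would have to implement (for example on top of the HBase shell and an HDFS listing), and `target_files` assumes one store file per region after compaction, which only holds for a single column family.

```python
import time
from typing import Callable, Iterable, List


def rolling_major_compact(
    list_regions: Callable[[], Iterable[str]],
    request_major_compaction: Callable[[str], None],
    count_store_files: Callable[[str], int],
    target_files: int = 1,  # assumption: one column family per region
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Major-compact one region at a time; return regions in processing order."""
    finished = []
    for region in list_regions():
        # Hypothetical hook: ask the cluster to major-compact this region.
        request_major_compaction(region)
        # Poll the store file count until the region looks fully compacted,
        # so only one region is compacting at any moment.
        while count_store_files(region) > target_files:
            sleep(poll_seconds)
        finished.append(region)
    return finished
```

          Running this from cron gives one pass per invocation; wrapping the call in an endless loop would give the "start again on the first region" behaviour.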

          Lars George added a comment -

          +1 on a feature like that. We need some script/tool/thread that can run major compactions based on load and abort if the load goes over a certain threshold. Once load drops again, we can continue where we left off. Do this region by region, with a configurable degree of concurrency, e.g. one per cluster, one per node, and so on.

          We should also add a JMX/API call that returns the compaction status per server. It should list the various compaction queues, live compactions, their scope, and region/cf they work on. Maybe put this into the ServerInfo?
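
          The load-based stop-and-resume idea in this comment could look something like the sketch below. All names are hypothetical: `current_load` and `compact_region` stand in for whatever load metric and compaction call are available, and `max_load` is an arbitrary threshold. As a simplification, the load is only checked between regions, so an in-flight compaction is not aborted, and the per-cluster/per-node concurrency setting is not modelled.

```python
from typing import Callable, Sequence


def compact_while_idle(
    regions: Sequence[str],
    current_load: Callable[[], float],
    compact_region: Callable[[str], None],
    max_load: float,
    start_index: int = 0,
) -> int:
    """Compact regions[start_index:] until the load exceeds max_load.

    Returns the index of the next region to compact, which equals
    len(regions) once every region has been processed.
    """
    index = start_index
    while index < len(regions):
        # Stop before starting another region if the cluster is busy.
        if current_load() > max_load:
            break
        compact_region(regions[index])  # hypothetical blocking call
        index += 1
    return index
```

          Passing the returned index back in as `start_index` on a later run continues where the previous run left off.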

          Otis Gospodnetic added a comment -

          Is this issue still needed or did HBASE-5867 take care of compaction throttling?

          Jonathan Hsieh added a comment -

          This is essentially a dupe of HBASE-8329 and HBASE-5867, both of which are closed out.


            People

            • Assignee: Unassigned
            • Reporter: Joep Rottinghuis
            • Votes: 2
            • Watchers: 10
