Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 0.90.0
    • None
    • None
    • Reviewed

    Description

      We have a whole bunch of compaction awesome in our internal 0.89 branch. Porting this to 0.90:

      1) don't unconditionally compact 4 files. have a min threshold
      2) intelligently upgrade minors to majors
      3) new compaction algo (derived in HBASE-2462)

          Activity

            larsfrancke Lars Francke added a comment -

            This issue was closed as part of a bulk closing operation on 2015-11-20. All issues that have been resolved and where all fixVersions have been released have been closed (following discussions on the mailing list).

            stack Michael Stack added a comment -

            Thanks for the patch Nicolas. Committed.


            Message from: stack@duboce.net

            -----------------------------------------------------------
            This is an automatically generated e-mail. To reply, visit:
            http://review.cloudera.org/r/1192/#review1881
            -----------------------------------------------------------

            Ship it!

            k... let me commit.

            trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
            <http://review.cloudera.org/r/1192/#comment6110>

            Whitespace here – I can fix on commit.

            trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
            <http://review.cloudera.org/r/1192/#comment6113>

            Very cute

            • stack

            nspiegelberg Nicolas Spiegelberg added a comment -

            St^Ack_: nspiegelberg: what config. would I set so it favored less files and kept the old read performance?
            [3:36pm] nspiegelberg: you have 2 options
            [3:37pm] nspiegelberg: 1) set compactionThreshold == 2
            [3:38pm] nspiegelberg: 2) make minCompactSize configurable and set it high
            [3:39pm] nspiegelberg: basically, before this algo, we would unconditionally compact 4 files, but the compactionThreshold == 3
            [3:40pm] nspiegelberg: this means that we would never use the compaction algorithm unless our cluster was stressed out
            [3:41pm] jdcryans: it used to not be like that tho
            [3:42pm] jdcryans: it's a hack that we compact everything
            [3:42pm] nspiegelberg: the only downside to the current algorithm is that sum(storefiles) doesn't take dedupe into account, which can have a snowball effect of compacting too aggressively during load. this can be mitigated by lowering hbase.hstore.compaction.max
            [3:43pm] nspiegelberg: in reality, this hasn't proved to be an issue for us. lowering the max compact files will fix it. we can also add on some simple dedupe heuristics to fix this issue
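            For readers following the IRC log above, here is a rough sketch of how a size-based selection of this kind might behave. It is not the code committed to Store.java. The class name, method name, and the `ratio` parameter are assumptions made for illustration; `compactionThreshold`, `minCompactSize`, and the max-files cap (hbase.hstore.compaction.max) are the knobs named in the discussion. The sketch walks files from oldest to newest and stops at the first file that is either small (at most `minCompactSize`) or no larger than `ratio` times the summed size of the newer files in its window.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and structure are assumptions, not the committed patch. */
final class CompactionSelectionSketch {

    /**
     * Picks a contiguous run of files to compact.
     *
     * @param fileSizes store file sizes, ordered oldest to newest
     * @return {start, end} as a half-open index range; start == end means "compact nothing"
     */
    static int[] select(long[] fileSizes, int compactionThreshold,
                        int maxFilesToCompact, long minCompactSize, double ratio) {
        int n = fileSizes.length;

        // suffix[i] = total size of files i..n-1, so sums of "newer" files are cheap.
        long[] suffix = new long[n + 1];
        for (int i = n - 1; i >= 0; i--) {
            suffix[i] = suffix[i + 1] + fileSizes[i];
        }

        int start = 0;
        // Skip old files that are too large relative to the newer files,
        // as long as enough files remain to reach the threshold.
        while (n - start >= compactionThreshold) {
            int windowEnd = Math.min(n, start + maxFilesToCompact);
            long newer = suffix[start + 1] - suffix[windowEnd];
            long limit = Math.max(minCompactSize, (long) (newer * ratio));
            if (fileSizes[start] <= limit) {
                break; // first file that qualifies; newer files are assumed smaller
            }
            start++;
        }

        if (n - start < compactionThreshold) {
            return new int[] {start, start}; // too few candidates: nothing to compact
        }
        return new int[] {start, Math.min(n, start + maxFilesToCompact)};
    }

    public static void main(String[] args) {
        // Hypothetical sizes in MB: one large old file, then progressively smaller flushes.
        long[] sizes = {1000, 100, 12, 10, 10};
        int[] range = select(sizes, 3, 10, 15, 1.2);
        System.out.println(Arrays.toString(range)); // prints [2, 5]
    }
}
```

            In this sketch, the two options from the log show up directly: with two 50 MB files, a `compactionThreshold` of 3 selects nothing while a threshold of 2 selects both, and a very large `minCompactSize` makes even the oldest file qualify immediately. Lowering the max-files cap shortens the selected run, which is the mitigation suggested for the dedupe snowball effect.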

            Message from: "Nicolas" <nspiegelberg@facebook.com>

            -----------------------------------------------------------
            This is an automatically generated e-mail. To reply, visit:
            http://review.cloudera.org/r/1192/#review1878
            -----------------------------------------------------------

            trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
            <http://review.cloudera.org/r/1192/#comment6095>

            I know. I'm a whitespace ASBO. sorry

            trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
            <http://review.cloudera.org/r/1192/#comment6092>

            we can work on fine-tuning this. +50% was a safe cover for us and worked fine because we pre-split regions. Really, you probably don't need the pad because the summation algorithm will operate on at least 3 storefiles, so (assuming roughly even flush sizes) you get +100% pad from that.

            trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
            <http://review.cloudera.org/r/1192/#comment6091>

            s/are meet/have met/

            • Nicolas
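            The pad arithmetic in Nicolas's reply can be checked with a few lines. This is only a sketch: it assumes the +50% pad is applied to the memstore flush size to get the small-file cutoff (the review comment does not say so explicitly), and the class name, method names, and the 64 MB flush size are hypothetical.

```java
/** Illustrative arithmetic only; names and the flush size are assumptions. */
final class PadSketch {

    /** Small-file cutoff as the flush size plus a 50% pad. */
    static long paddedMinCompactSize(long flushSize) {
        return flushSize + flushSize / 2;
    }

    /** Combined size of the newer files when `files` equal-size flushes are considered together. */
    static long sumOfNewerFiles(long flushSize, int files) {
        return flushSize * (files - 1);
    }

    public static void main(String[] args) {
        long flush = 64L * 1024 * 1024; // hypothetical 64 MB flush size

        // With the pad, a file up to 1.5x the flush size counts as "small".
        System.out.println(paddedMinCompactSize(flush));

        // With at least 3 files in play, the two newer files already sum to
        // 2x the flush size, i.e. the flush size plus 100%.
        System.out.println(sumOfNewerFiles(flush, 3));
    }
}
```

            Under the even-flush-size assumption, the summation bound (2x) already exceeds the padded cutoff (1.5x), which is the reasoning behind "you probably don't need the pad".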

            Message from: "Nicolas" <nspiegelberg@facebook.com>

            -----------------------------------------------------------
            This is an automatically generated e-mail. To reply, visit:
            http://review.cloudera.org/r/1192/
            -----------------------------------------------------------

            Review request for hbase.

            Summary
            -------

            We have a whole bunch of compaction awesome in our internal 0.89 branch. Porting this to 0.90:

            1) don't unconditionally compact 4 files. have a min threshold
            2) intelligently upgrade minors to majors
            3) new compaction algo (derived in HBASE-2462)

            This addresses bug HBASE-3209.
            http://issues.apache.org/jira/browse/HBASE-3209

            Diffs


            trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1033278

            Diff: http://review.cloudera.org/r/1192/diff

            Testing
            -------

            Has been running on our primary cluster for the past couple weeks.

            Thanks,

            Nicolas


            nspiegelberg Nicolas Spiegelberg added a comment -

            Results from our cluster:

            Multiput Latency: 25 ms => 3 ms avg
            Sync Latency: 12 ms => 1.5 ms avg
            Compaction Queue: 3 => 0.08 avg
            Compaction Time: now 1-30 sec (note: our compaction time ods chart is off, looked manually at logs)

            Read Latency: 6 ms => 9 ms
            Files / Store: 2 => 2.6

            Note that the minor drop in read performance (6 ms => 9 ms latency) can be fixed by lowering compactionThreshold from 3 to 2. We just didn't need the improvement.

            People

              nspiegelberg Nicolas Spiegelberg
              nspiegelberg Nicolas Spiegelberg