HBase / HBASE-4163

Create Split Strategy for YCSB Benchmark

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.90.3, 0.92.0
    • Fix Version/s: 0.99.0
    • Component/s: util
    • Labels:

      Description

      Talked with Lars about how we can make it easier for users to run the YCSB benchmarks against HBase and get realistic results. Currently, HBase is optimized for the random/uniform read/write case, which is the YCSB load. The initial reason we perform badly when users test against us is that they do not pre-split regions and have the split ratio set really low. We need a one-line way for a user to create a table that is pre-split to 200 regions (or some decent number) by default and has splitting disabled. Realistically, this is how a uniform-load cluster should scale, so it's not a hack. This will also give us a good use case to point to for how users should pre-split regions.

          Activity

          Nicolas Spiegelberg added a comment -

          My initial thought is to use the existing RegionSplitter utility. We just need to create a custom SplitAlgorithm implementation class for the YCSB key specification & tell the users to run:

          bin/hbase org.apache.hadoop.hbase.util.RegionSplitter TABLE -c 200 -f FAMILY -D split.algorithm=YcsbSplit
          

          to pre-create a table with 200 regions. To prevent further splitting, we can either set hbase.hregion.max.filesize to a really high value or add a per-table split config option.
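As a rough illustration of what such a split algorithm would need to compute, here is a minimal Ruby sketch. The helper name `ycsb_split_points` and its defaults are hypothetical, not part of HBase or RegionSplitter. It assumes row keys of the form "user" followed by digits and that the four-digit prefixes 1000..9999 carry roughly equal load.

```ruby
# Hypothetical helper, not part of HBase: evenly spaced split keys for a
# table that should start with `num_regions` regions.
# Assumption: row keys look like "user<digits>" and the four-digit
# prefixes low..high are roughly equally loaded.
def ycsb_split_points(num_regions, low = 1000, high = 9999)
  raise ArgumentError, "need at least 2 regions" if num_regions < 2
  # n regions need n - 1 boundaries, all strictly between low and high.
  (1...num_regions).map do |i|
    "user#{low + i * (high - low) / num_regions}"
  end
end

points = ycsb_split_points(200)
puts points.length  # 199 boundaries for 200 regions
puts points.first   # user1044
puts points.last    # user9954
```

Because every generated key has the same length, the lexicographic order HBase uses for row keys matches the numeric order of the boundaries.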

          Jean-Daniel Cryans added a comment -

          That's pretty clever, guys.

          Luke Lu added a comment -

          Tried to figure this out for somebody today. Here is an HBase shell one-liner to save people some time before the feature is implemented:

          create 'usertable', 'family', {SPLITS => (1..200).map {|i| "user#{1000+i*(9999-1000)/200}"}, MAX_FILESIZE => 4*1024**3}
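The two Ruby expressions inside that one-liner can be evaluated on their own to see what the table will look like. The sketch below uses plain Ruby outside the HBase shell, and the variable names are only illustrative:

```ruby
# Same expressions as in the shell one-liner, evaluated in plain Ruby.
splits = (1..200).map { |i| "user#{1000 + i * (9999 - 1000) / 200}" }
max_filesize = 4 * 1024**3

puts splits.length  # 200 split keys, which yields 201 regions
puts splits.first   # user1044
puts splits.last    # user9999
puts max_filesize   # 4294967296 bytes (4 GiB)
```

Note that 200 split keys produce 201 regions rather than exactly 200, which is close enough for the "some decent number" goal in the description.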
          
          stack added a comment -

          Closing as fixed. I also added a note to our little YCSB section in the doc, which will show up the next time I push the site; it points at Luke's little script.

          Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #4715 (See https://builds.apache.org/job/HBase-TRUNK/4715/)
          Add note on how to presplit for ycsb from HBASE-4163 (stack: rev 1548760)

          • /hbase/trunk/src/main/docbkx/book.xml

            People

            • Assignee:
              Luke Lu
              Reporter:
              Nicolas Spiegelberg
            • Votes:
              4
              Watchers:
              8

              Dates

              • Created:
                Updated:
                Resolved: