HBASE-354

[hbase] Make region splits faster

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed

    Description

      HADOOP-1644, '[hbase] Compactions should take no longer than period between memcache flushes', is about making compactions run faster. This issue is about making splits faster. Currently a split reads a map file as input and, record by record, writes out two new mapfiles. This is too slow: ~30 seconds to split 120MB. Google hints in the Bigtable paper that splitting is very fast because they let the split children feed off the split parent. In primitive testing, splitting mapfiles using raw streams ran 3 to 4 times faster than splitting on mapfile keys.
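
The difference between the two approaches in the description can be sketched as follows. This is a minimal illustration only, not the code in the attached patches: it uses an in-memory `TreeMap` as a stand-in for a sorted mapfile, and the names `SplitSketch`, `Halves`, `copySplit`, and `referenceSplit` are hypothetical. `copySplit` mimics the current behaviour (touch every record and write it to one of two new maps), while `referenceSplit` lets the children be views that feed off the parent, so the cost of the split no longer grows with the number of records.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class SplitSketch {
    /** Holder for the two children of a split (hypothetical helper type). */
    static final class Halves {
        final SortedMap<String, String> bottom;
        final SortedMap<String, String> top;

        Halves(SortedMap<String, String> bottom, SortedMap<String, String> top) {
            this.bottom = bottom;
            this.top = top;
        }
    }

    /** Copying split: reads every parent record and writes it to one of two new maps. */
    static Halves copySplit(SortedMap<String, String> parent, String midKey) {
        SortedMap<String, String> bottom = new TreeMap<>();
        SortedMap<String, String> top = new TreeMap<>();
        for (Map.Entry<String, String> e : parent.entrySet()) {
            if (e.getKey().compareTo(midKey) < 0) {
                bottom.put(e.getKey(), e.getValue());
            } else {
                top.put(e.getKey(), e.getValue());
            }
        }
        return new Halves(bottom, top);
    }

    /** Reference split: the children are range views backed by the parent; nothing is copied. */
    static Halves referenceSplit(SortedMap<String, String> parent, String midKey) {
        // headMap covers keys strictly below midKey, tailMap covers midKey and above.
        return new Halves(parent.headMap(midKey), parent.tailMap(midKey));
    }

    public static void main(String[] args) {
        SortedMap<String, String> parent = new TreeMap<>();
        parent.put("row1", "a");
        parent.put("row2", "b");
        parent.put("row3", "c");
        parent.put("row4", "d");

        Halves copied = copySplit(parent, "row3");
        Halves referenced = referenceSplit(parent, "row3");

        // Both strategies expose the same records to readers of each child.
        System.out.println(copied.bottom.equals(referenced.bottom));
        System.out.println(copied.top.equals(referenced.top));
    }
}
```

In this sketch the reference children stay valid only while the parent is retained, which is the trade-off implied by letting children feed off the parent: the data can be rewritten into standalone files later, off the critical path of the split.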

      Attachments

        1. fastsplits.patch
          128 kB
          Michael Stack
        2. mapfile_split.patch
          8 kB
          Michael Stack
        3. splits-2.patch
          188 kB
          Michael Stack
        4. splits-v3.patch
          214 kB
          Michael Stack

            People

              Assignee: Unassigned
              Reporter: Michael Stack (stack)