HBase / HBASE-7253

Compaction Tool


Details

    • New Feature
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 0.95.2
    • 0.95.0
    • Compaction
    • None
      The CompactionTool works at the file-system level, so the table should be disabled before running it.

      The compaction process uses the same hbase-site.xml configuration properties as the server,
      such as "hbase.hstore.compactionThreshold" and related settings.

      You can compact a whole table, a single region, or a single family;
      the input of the CompactionTool is a file-system path.

      You can run the compaction as a MapReduce job or as a local process.
      Each family can be compacted in parallel if you use the -mapred option.

      To compact family "cf1" of region "e450da04b1a10099b618bec031e0f951" of "TestTable":
      bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951/cf1

      To compact all the families of region "e450da04b1a10099b618bec031e0f951":
      bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951

      To compact all regions and families of the table:
      bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred hdfs:///hbase/TestTable
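
The three invocations above differ only in how deep the path goes. As a rough sketch, the helper below builds the command line from a table name plus an optional region and family, and prints it instead of running it. The function name compaction_cmd and the HBASE_ROOT variable are hypothetical; the hdfs:///hbase root and the use of -mapred only for whole-table runs simply mirror the examples in this note and are assumptions, not requirements of the tool.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints a CompactionTool command, never runs it.
# HBASE_ROOT is an assumption taken from the example paths in this note.
HBASE_ROOT="${HBASE_ROOT:-hdfs:///hbase}"

compaction_cmd() {
  table="$1"; region="$2"; family="$3"
  if [ -z "$table" ]; then
    echo "usage: compaction_cmd TABLE [REGION [FAMILY]]" >&2
    return 1
  fi
  path="$HBASE_ROOT/$table"
  opts=""
  if [ -z "$region" ]; then
    # Whole table: mirror the -mapred example so families can run in parallel.
    opts=" -mapred"
  else
    path="$path/$region"
    if [ -n "$family" ]; then
      path="$path/$family"
    fi
  fi
  echo "bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool$opts $path"
}

# Example: print the command for one family of one region.
compaction_cmd TestTable e450da04b1a10099b618bec031e0f951 cf1
```

Printing the command rather than executing it leaves room to check that the table is disabled before anything touches the files.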

    Description

      A CompactionTool was added in HBASE-5616 as part of the compaction code refactor, but it has some issues:

      • The tool lives under test/.
      • Mockito is required, so the "test" scope must be removed from the pom.xml; otherwise the tool doesn't start.
      • The mock used by the tool mocks HRegion.getRegionInfo(), but some code in Store accesses HRegion.regionInfo directly (HStore.java#L2021, HStore.java#L1389, HStore.java#L1402), so the tool ends up with an NPE.
      • The mocked Store uses a dummy family, so the compacted files don't get the family properties that were specified (compression, encoding, ...).
      • At the end of the compaction (CompactionTool.java#L155), the file produced by the compaction is removed; this behavior is on by default. Note that the files that were compacted are already removed inside store.compact(), so you end up with an empty directory if you compact everything.

      I've fixed some stuff and added support to:

      • Run the compaction as a MR Job
      • Specify a Table (compact each region/family)
      • Specify a Region (compact each family)
      • Specify a Family (as before)
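
To illustrate the three granularities listed above, here is a minimal sketch that classifies a path by how many components it has below an assumed root directory. This is purely illustrative: the function name compaction_scope and the HBASE_ROOT variable are hypothetical, and the patch itself may tell table, region, and family directories apart in a different way.

```shell
#!/bin/sh
# Hypothetical classifier: guesses the compaction scope from path depth.
# Assumes the layout ROOT/table/region/family under an assumed root directory.
HBASE_ROOT="${HBASE_ROOT:-hdfs:///hbase}"

compaction_scope() {
  rel="${1#"$HBASE_ROOT"/}"      # strip the assumed root prefix
  if [ "$rel" = "$1" ]; then     # prefix did not match: not under the root
    echo "unknown"
    return 1
  fi
  rel="${rel%/}"                 # tolerate one trailing slash
  case "$rel" in
    ""|*//*|*/*/*/*) echo "unknown"; return 1 ;;
    */*/*) echo "family: compact this family" ;;
    */*)   echo "region: compact each family" ;;
    *)     echo "table: compact each region/family" ;;
  esac
}

# Example: classify a region-level path.
compaction_scope "hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951"
```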

      Attachments

        1. HBASE-7253-v0.patch
          31 kB
          Matteo Bertozzi
        2. HBASE-7253-v1.patch
          32 kB
          Matteo Bertozzi

          People

            mbertozzi Matteo Bertozzi