  1. HBase
  2. HBASE-21065

Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha-1, 2.5.0
    • Component/s: meta, Performance
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      Enables ROW_INDEX_V1 encoding on hbase:meta by default. Also enables blooms.

      Encoding and blooms will NOT be enabled on upgrade. Operators will need to enable them manually by editing the hbase:meta schema (or we provide a migration script to enable these configs; that is out of scope for this JIRA).

    Description

      Some users end up hitting meta hard. The bulk of that load is probably because our client goes to meta too often, and the real 'fix' for a saturated meta is splitting it, but the encoding that came in with HBASE-16213, ROW_INDEX_V1, could help in the near term. It adds an index on hfile blocks and helped improve random reads against user-space tables (fewer compares, since the index is used to go directly to the requested Cells rather than looking at each Cell in turn until we find what we want; see the release note on HBASE-16213 for the citation).
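
The "fewer compares" point can be illustrated with a small standalone sketch. It is not HBase code: the class and method names (`RowIndexSketch`, `seekSequential`, `seekIndexed`, `sampleBlock`) are hypothetical, and a sorted list of row keys stands in for one data block. It only contrasts walking the rows in turn with a binary search over row positions, which is the general idea behind a per-block row index.

```java
import java.util.ArrayList;
import java.util.List;

public class RowIndexSketch {
    /** Number of key comparisons made since the counter was last reset. */
    static int compares;

    static int compare(String a, String b) {
        compares++;
        return a.compareTo(b);
    }

    /** Sequential seek: look at each row in turn until the first row >= target. */
    static int seekSequential(List<String> rows, String target) {
        for (int i = 0; i < rows.size(); i++) {
            if (compare(rows.get(i), target) >= 0) {
                return i;
            }
        }
        return rows.size();
    }

    /** Indexed seek: binary search over row positions for the first row >= target. */
    static int seekIndexed(List<String> rows, String target) {
        int lo = 0;
        int hi = rows.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(rows.get(mid), target) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Builds a stand-in "block" of n sorted row keys (hypothetical data). */
    static List<String> sampleBlock(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(String.format("row-%05d", i));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = sampleBlock(1000);
        String target = "row-00900";

        compares = 0;
        int seqPos = seekSequential(rows, target);
        int seqCompares = compares;

        compares = 0;
        int idxPos = seekIndexed(rows, target);
        int idxCompares = compares;

        System.out.println("sequential: pos=" + seqPos + " compares=" + seqCompares);
        System.out.println("indexed:    pos=" + idxPos + " compares=" + idxCompares);
    }
}
```

Both seeks land on the same position; the sequential one pays a compare per row it passes, while the indexed one pays roughly log2 of the row count. Whether that shows up as a win on hbase:meta is what the perf comparison in this issue is meant to check.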

      While reading code I also noticed that we don't enable blooms on the hbase:meta table; enabling them could save some CPU and speed things up a bit too:

              // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
              .setBloomFilterType(BloomType.NONE)
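
As a rough illustration of why blooms can save CPU, here is a minimal Bloom filter sketch. It is not HBase's implementation: `BloomSketch`, its sizing, and its hashing scheme are all assumptions made for the example. The property that matters is that a "no" answer is always correct, so a reader can skip a file that definitely does not hold the key. A bloom only answers exact-membership questions, though, so it cannot help a "closest row at or before" style lookup.

```java
import java.util.BitSet;

public class BloomSketch {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    BloomSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    /** Derives the i-th bit position for a key using simple double hashing. */
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 15) * 0x9E3779B9 | 1; // forced odd, so never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    /** Records a key by setting all of its bit positions. */
    void add(String key) {
        for (int i = 0; i < hashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** Returns false only if the key was definitely never added; true means "maybe". */
    boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomSketch bloom = new BloomSketch(1024, 3);
        for (int i = 0; i < 100; i++) {
            bloom.add("region-" + i);
        }
        // Added keys always report "maybe"; absent keys usually report "no".
        System.out.println("region-42 maybe present: " + bloom.mightContain("region-42"));
        System.out.println("region-9999 maybe present: " + bloom.mightContain("region-9999"));
    }
}
```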
      

      This issue is about doing a perf comparison of the encoding on vs. the current default (and checking the difference in size of the indexed blocks).

      Meta access is mostly random-read, I believe (a review of one user's access showed this, so it holds at least for their workload). The nice addition from HBASE-19722, the meta query statistics metrics source, would help verify this if it saw some usage on a prod cluster.

      If all is good, I'd like to make a small patch, one that could be easily backported, with minimal changes in it.

      As is, it's all a little awkward because the meta table schema is hard-coded and meta is immutable (stuff we'll have to fix if we want to split meta), so in the meantime enabling the encoding requires a code change (and a backport of HBASE-16213; that patch is currently in 1.4.0 only, perhaps that is enough). The code change to enable it is small:

      diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
      index 28c7ec3c2f..8f08f94dc1 100644
      --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
      +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
      @@ -160,6 +160,7 @@ public class FSTableDescriptors implements TableDescriptors {
               .setScope(HConstants.REPLICATION_SCOPE_LOCAL)
               // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
               .setBloomFilterType(BloomType.NONE)
      +        .setDataBlockEncoding(org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.ROW_INDEX_V1)
               .build())
             .setColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(HConstants.TABLE_FAMILY)
               .setMaxVersions(conf.getInt(HConstants.HBASE_META_VERSIONS,
      

            People

              apurtell Andrew Kyle Purtell
              stack Michael Stack