HBase / HBASE-14906

Improvements on FlushLargeStoresPolicy

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note:
      In HBASE-14906 we use "hbase.hregion.memstore.flush.size / column_family_number" as the default threshold for memstore flush, instead of the fixed value set through the "hbase.hregion.percolumnfamilyflush.size.lower.bound" property. This lets the default threshold adapt to different use cases. We also introduce a new property named "hbase.hregion.percolumnfamilyflush.size.lower.bound.min", with 16M as the default value, to avoid small flushes in cases such as tables with hundreds of column families.

      After this change, setting "hbase.hregion.percolumnfamilyflush.size.lower.bound" in hbase-site.xml no longer takes effect, but expert users can still set this property in the table descriptor to override the default value, just as before.
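The threshold rule described in this note can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class name, method name, and parameters are hypothetical, and the 128 MB region flush size used in main is only an example value.

```java
import java.util.OptionalLong;

// Hypothetical sketch of the per-family flush threshold described in the release note.
public class FlushLowerBoundSketch {
    // 16M floor, matching the stated default of
    // hbase.hregion.percolumnfamilyflush.size.lower.bound.min
    public static final long DEFAULT_LOWER_BOUND_MIN = 16L * 1024 * 1024;

    /**
     * @param regionFlushSize value of hbase.hregion.memstore.flush.size, in bytes
     * @param familyCount     number of column families in the table
     * @param lowerBoundMin   floor applied to the computed default
     * @param tableOverride   lower bound set in the table descriptor, if any
     */
    public static long lowerBound(long regionFlushSize, int familyCount,
                                  long lowerBoundMin, OptionalLong tableOverride) {
        if (familyCount <= 0) {
            throw new IllegalArgumentException("familyCount must be positive");
        }
        // A value set in the table descriptor overrides the computed default.
        if (tableOverride.isPresent()) {
            return tableOverride.getAsLong();
        }
        // Default: flush size divided by the number of families, but never below the floor.
        long perFamily = regionFlushSize / familyCount;
        return Math.max(perFamily, lowerBoundMin);
    }

    public static void main(String[] args) {
        long flushSize = 128L * 1024 * 1024; // example value only
        // Few families: the per-family share is above the floor.
        System.out.println(lowerBound(flushSize, 4, DEFAULT_LOWER_BOUND_MIN, OptionalLong.empty()));
        // Hundreds of families: the floor takes over to avoid tiny flushes.
        System.out.println(lowerBound(flushSize, 200, DEFAULT_LOWER_BOUND_MIN, OptionalLong.empty()));
    }
}
```

With 4 families the sketch yields a 32 MB threshold; with 200 families the per-family share would be well under 1 MB, so the 16M floor applies instead.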

    Description

      While reviewing FlushLargeStoresPolicy, I found the following possible improvements:

      1. Currently selectStoresToFlush runs the selection regardless of how many column families the table actually has, which is unnecessary when there is only a single family.

      2. The default value of hbase.hregion.percolumnfamilyflush.size.lower.bound cannot fit all cases, and setting it properly requires the user to know implementation details. We propose using "hbase.hregion.memstore.flush.size / column_family_number" as the default instead:

        <property>
          <name>hbase.hregion.percolumnfamilyflush.size.lower.bound</name>
          <value>16777216</value>
          <description>
          If FlushLargeStoresPolicy is used and there are multiple column families,
          then every time that we hit the total memstore limit, we find out all the
          column families whose memstores exceed a "lower bound" and only flush them
          while retaining the others in memory. The "lower bound" will be
          "hbase.hregion.memstore.flush.size / column_family_number" by default
          unless value of this property is larger than that. If none of the families
          have their memstore size more than lower bound, all the memstores will be
          flushed (just as usual).
          </description>
        </property>
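The selection behavior in the property description, together with the single-family shortcut from point 1, could look roughly like the sketch below. It is an assumption-laden illustration rather than the actual FlushLargeStoresPolicy code: stores are modeled as a map from a family name to its memstore size in bytes, and the class and method signature are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the store selection described above.
public class SelectStoresSketch {

    /** Returns the names of the families whose memstores should be flushed. */
    public static List<String> selectStoresToFlush(Map<String, Long> memstoreSizes,
                                                   long lowerBound) {
        // Point 1: with a single family there is nothing to choose between,
        // so skip the per-family comparison entirely.
        if (memstoreSizes.size() <= 1) {
            return new ArrayList<>(memstoreSizes.keySet());
        }
        // Pick only the families whose memstore exceeds the lower bound.
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> entry : memstoreSizes.entrySet()) {
            if (entry.getValue() > lowerBound) {
                selected.add(entry.getKey());
            }
        }
        // If no family exceeds the bound, fall back to flushing everything.
        if (selected.isEmpty()) {
            return new ArrayList<>(memstoreSizes.keySet());
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("cf1", 100L);
        sizes.put("cf2", 10L);
        sizes.put("cf3", 60L);
        System.out.println(selectStoresToFlush(sizes, 50L));   // only the large families
        System.out.println(selectStoresToFlush(sizes, 1000L)); // none exceed: flush all
    }
}
```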
      

      Attachments

        1. HBASE-14906.patch
          7 kB
          Yu Li
        2. HBASE-14906.v2.patch
          8 kB
          Yu Li
        3. HBASE-14906.v3.patch
          8 kB
          Yu Li
        4. HBASE-14906.v4.patch
          10 kB
          Yu Li
        5. HBASE-14906.v4.patch
          10 kB
          Michael Stack

          People

            Assignee: Yu Li
            Reporter: Yu Li
            Votes: 0
            Watchers: 5
