HBase / HBASE-11364

[BlockCache] Add a flag to cache data blocks in L1 if multi-tier cache

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.99.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Adds a flag to HColumnDescriptor, cacheDataInL1. In shell, you set it as follows: hbase(main):003:0> create 't', {NAME => 't', CONFIGURATION => {CACHE_DATA_IN_L1 => 'true'}}

      Description

      This is a prerequisite for HBASE-11323, "BucketCache on all the time". It addresses a request from Lars Hofhansl: for some column families, we should be able to ask that even their data blocks be cached in the LruBlockCache L1 tier in a multi-tier deployment, as happens with BucketCache (CombinedBlockCache) setups.

          Activity

          stack added a comment -

          Adds to CacheConfig a flag cacheDataInL1. Adds override to BlockCache#cacheBlock that takes the setting of this flag. Acts on this flag in CombinedBlockCache. I set this flag on all system tables. I added a test that this flag has an effect. Also tried it here locally to prove it works.

          I went through LruBlockCache and the doc, fixing up IN_MEMORY and making another pass over the block cache section. When this edit goes in, I will close HBASE-9131 because its detail is covered here.
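
As a rough illustration of the routing this flag controls, here is a minimal, self-contained Java sketch. It is not the CombinedBlockCache code: the class `TierRoutingSketch`, its enums, and `chooseTier` are hypothetical names invented for this example. It assumes only what the thread states, namely that meta blocks (INDEX, BLOOM) stay in L1 and that DATA blocks go to L2 unless the column family sets CACHE_DATA_IN_L1.

```java
/** Toy model of two-tier block routing. Hypothetical names; not HBase source. */
public class TierRoutingSketch {

    /** Kinds of block mentioned in the thread. */
    enum BlockKind { DATA, INDEX, BLOOM }

    /** L1 is the on-heap LruBlockCache tier; L2 is the BucketCache tier. */
    enum Tier { L1, L2 }

    /**
     * Pick a tier for a block.
     * Assumption: meta blocks always go to L1; DATA blocks go to L2
     * unless the column family opted in via the cacheDataInL1 flag.
     */
    static Tier chooseTier(BlockKind kind, boolean cacheDataInL1) {
        boolean isMeta = kind != BlockKind.DATA;
        return (isMeta || cacheDataInL1) ? Tier.L1 : Tier.L2;
    }

    public static void main(String[] args) {
        // Print the chosen tier for every combination of block kind and flag.
        for (BlockKind kind : BlockKind.values()) {
            for (boolean flag : new boolean[] {false, true}) {
                System.out.println(kind + " cacheDataInL1=" + flag
                        + " -> " + chooseTier(kind, flag));
            }
        }
    }
}
```

The flag only changes the outcome for DATA blocks, which matches the description of it as a per-column-family opt-in.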

          stack added a comment -

          For a while I considered an abstract notion such as a 'priority' or 'hot' column family, implemented by HBase internally setting IN_MEMORY and CACHE_DATA_IN_L1, but put that aside as work for another day. All configs are fairly specific currently (CACHE_ON_WRITE, IN_MEMORY, etc.), each with a particular meaning. Here I am following the mold for now.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12650658/11364.txt
          against trunk revision .
          ATTACHMENT ID: 12650658

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + * <code>hbase(main):003:0> create 't', {NAME => 't', CONFIGURATION => {CACHE_DATA_IN_L1 => 'true'}}</code>
          + LruBlockCache, and BucketCache, and SlabCache, which are both (usually) offheap. This section
          + two tiers and how blocks move between them is done by <classname>DoubleBlockCache</classname>
          + <xref linkend="offheap.blockcache.slabcache" /> for more detail on how DoubleBlockCache works.
          + It keeps all DATA blocks in the BucketCache and meta blocks – INDEX and BLOOM blocks –
          + works so differently, it is difficult to do a fair comparison between BucketCache and SlabCache.
          + xlink:href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html" />.
          + <programlisting>HColumnDescriptor.setInMemory(true);</programlisting> if creating a table from java,
          + the shell: e.g. <programlisting>hbase(main):003:0> create 't', {NAME => 'f', IN_MEMORY => 'true'} </programlisting></para>
          + DoubleBlockCache is an abstraction layer that combines two caches, the smaller onHeapCache and the

          +1 site. The mvn site goal succeeds with this patch.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9779//console

          This message is automatically generated.

          stack added a comment -

          A review please.

          stack added a comment -

          @apurtell or larsh Any chance of a review? Thanks lads.

          Andrew Purtell added a comment -

          +1.

          Thanks for adding those bits of javadoc along with the manual update.

          stack added a comment -

          Committed to master. Thanks for the review, Andrew Purtell.

          Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #5216 (See https://builds.apache.org/job/HBase-TRUNK/5216/)
          HBASE-11364 [BlockCache] Add a flag to cache data blocks in L1 if multi-tier cache (stack: rev 3ed3c5513cc26a2158173caab8d36b6d7f544009)

          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
          • hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/bucket/TestBucketCache.java
          • src/main/docbkx/book.xml
          • hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheConfig.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
          • hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHeapMemoryManager.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
          • hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.java
          • hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessControlLists.java
          • hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java
          Lars Hofhansl added a comment -

          Belated +1

          So this implies that storage will be L2 by default and L1 is opt in?
          In the interest of least surprise we might want to reverse that...?

          stack added a comment -

          So this implies that storage will be L2 by default and L1 is opt in?

          Currently, if you enable BucketCache, DATA blocks of user tables are kept in L2 (this option lets you keep them in L1 instead, if you set it on your HColumnDescriptor).

          If we enable block cache as on by default, then yes, the default would be the above.

          In the interest of least surprise we might want to reverse that...?

          We could do that. It'd be strange though because we'd have block cache on but nothing would be using it, not unless folks change schema.... which they don't usually do. Would be better to just leave block cache off and encourage folks to enable it?

          stack added a comment -

          One other thought is that we'd untie the L1 and L2 caches and have them work independently of each other; i.e. not use CombinedBlockCache, which keeps meta blocks up in L1 and data down in L2. Rather, we'd just have evictions from L1 go to L2 and then out of L2. Double caching is possible; they'd work like the SlabCache setup, but that seemed to generate more GC'ing in test. I can try testing.
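
The "evictions from L1 go to L2 and then out of L2" idea can be sketched with two small LRU maps. This is a toy under stated assumptions, not how HBase implements any of its caches: `VictimCacheSketch` and `Lru` are hypothetical names, capacities are counted in blocks rather than bytes, and a hit in L2 is returned without promoting the block back to L1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy two-tier cache where L1 evictions spill into L2. Hypothetical; not HBase source. */
public class VictimCacheSketch {

    /** Access-ordered LRU map that hands evicted entries to an optional lower tier. */
    static class Lru extends LinkedHashMap<String, byte[]> {
        private final int capacity;
        private final Lru victim; // lower tier, or null if this is the last tier

        Lru(int capacity, Lru victim) {
            super(16, 0.75f, true); // true = access order, giving LRU behaviour
            this.capacity = capacity;
            this.victim = victim;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
            if (size() <= capacity) {
                return false;
            }
            if (victim != null) {
                // Spill the least recently used block down one tier.
                victim.put(eldest.getKey(), eldest.getValue());
            }
            return true; // drop it from this tier
        }
    }

    final Lru l2;
    final Lru l1;

    VictimCacheSketch(int l1Capacity, int l2Capacity) {
        this.l2 = new Lru(l2Capacity, null);
        this.l1 = new Lru(l1Capacity, l2);
    }

    /** New blocks always enter through L1. */
    void cacheBlock(String key, byte[] block) {
        l1.put(key, block);
    }

    /** Look in L1 first, then L2. Assumption: no promotion on an L2 hit. */
    byte[] getBlock(String key) {
        byte[] block = l1.get(key);
        return block != null ? block : l2.get(key);
    }

    public static void main(String[] args) {
        VictimCacheSketch cache = new VictimCacheSketch(2, 2);
        for (String key : new String[] {"a", "b", "c"}) {
            cache.cacheBlock(key, key.getBytes());
        }
        System.out.println("L1 keys: " + cache.l1.keySet());
        System.out.println("L2 keys: " + cache.l2.keySet());
    }
}
```

A variant that promoted blocks from L2 back to L1 on a hit would hold some blocks in both tiers for a while, which is the double-caching concern raised in this thread.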

          Lars Hofhansl added a comment -

          Currently, if you enable BucketCache, DATA blocks of user tables are kept in L2 (this option lets you keep them in L1 instead, if you set it on your HColumnDescriptor).

          I can see this both ways. Since off-heap caching is a separate option, your way is better: once the off-heap cache is enabled, data goes there by default.

          stack added a comment -

          Lars Hofhansl

          I can see this both ways. Since off-heap caching is a separate option, your way is better: once the off-heap cache is enabled, data goes there by default.

          Come again Lars. I don't follow. Options are:

          1. Do NOT enable offheap by default. Just talk it up as the way to go, underlining that it will make pure in-memory access slower (but you can peg some of your tables in memory if you want, using the flag added here). Upside: No surprise. Downside: Folks don't read manuals or change defaults.
          2. Enable offheap BucketCache using CombinedBlockCache. When folks upgrade, latency to user-level DATA blocks will go up. Upsides: less GC, more cached. Downside: those who notice added latency might get upset. Changing schema will require alter table.
          3. Enable offheap BucketCache but in additive mode where we just add in an L2 under the L1 LruBlockCache. Upside: Additive. Downside: Could make for more GC.

          Of the above, maybe 1. is the way to go? 2. may surprise in that perf and GC suddenly get better (this would be ok), but others may be surprised that their latencies have gone up for some key tables. 3. may actually make GC worse (at least that is the case with SlabCache, which is similar, and the L1/L2 layout has not reviewed well to date, going by HBASE-8894).

          Will test 3.

          stack added a comment -

          Moving the last comment to the more appropriate issue, HBASE-11323 bucketcache on all the time.

          Lars Hofhansl added a comment -

          I was saying the way you have it is good

          stack added a comment -

          Lars Hofhansl So you are for option #1 (unless option #3 shows well in testing I suppose).

          Lars Hofhansl added a comment -

          Yeah option #1.
          I.e.:

          • Have config in hbase-site.xml to enable the bucketcache (default off)
          • when bucket cache is enabled via config it is the default (no schema changes required)
          • folks can selectively pull tables into L1 only via a schema change

          We can also make the config option default on going forward. That'd be almost identical to your option #2, only that it could be disabled via a config. But maybe we do not want more configs?

          #3 will be incredibly hard to get right for all cases and lead to double caching and potentially even more GC as we churn blocks through L2 to L1 and back.

          stack added a comment -

          We can also make the config option default on going forward. That'd be almost identical to your option #2, only that it could be disabled via a config. But maybe we do not want more configs?

          No more configs!

          Maybe go #1 for a release then in 1.1, enable bucket cache as default. Thanks Lars Hofhansl

          Andrew Purtell added a comment -

          I vote for option #1 but with CACHE_DATA_IN_L1 defaulting to false if bucket cache is configured.

          Enis Soztutar added a comment -

          Closing this issue after 0.99.0 release.


            People

            • Assignee: stack
            • Reporter: stack
            • Votes: 0
            • Watchers: 7
