Issue Details (XML | Word | Printable)

Key: HBASE-288
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Trivial Trivial
Assignee: Tom White
Reporter: Jim Kellerman
Votes: 1
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Hadoop HBase

Add in-memory caching of data

Created: 21/May/07 06:27 PM   Updated: 27/Oct/08 11:12 PM
Return to search
Component/s: regionserver
Affects Version/s: None
Fix Version/s: 0.2.0

Time Tracking:
Not Specified

File Attachments:
  Size
Java Archive File Licensed for inclusion in ASF works commons-collections-3.2.jar 2008-01-18 05:16 PM Tom White 558 kB
Text File Licensed for inclusion in ASF works hadoop-blockcache-v2.patch 2008-01-16 01:58 PM Tom White 22 kB
Text File Licensed for inclusion in ASF works hadoop-blockcache-v3.patch 2008-01-18 05:14 PM Tom White 38 kB
Text File Licensed for inclusion in ASF works hadoop-blockcache-v4.1.patch 2008-01-21 03:08 PM Tom White 82 kB
Text File Licensed for inclusion in ASF works hadoop-blockcache-v4.patch 2008-01-21 11:48 AM Tom White 78 kB
Text File Licensed for inclusion in ASF works hadoop-blockcache-v5.patch 2008-02-01 02:45 PM Tom White 82 kB
Text File Licensed for inclusion in ASF works hadoop-blockcache-v6.patch 2008-02-04 03:26 PM Tom White 82 kB
Text File Licensed for inclusion in ASF works hadoop-blockcache-v7.patch 2008-02-05 04:03 PM Tom White 72 kB
Text File Licensed for inclusion in ASF works hadoop-blockcache.patch 2008-01-15 03:45 PM Tom White 21 kB
Issue Links:
Dependants
Duplicate
 
Incorporates
 
Reference
 

Resolution Date: 07/Feb/08 05:50 PM


 Description  « Hide
Bigtable provides two in-memory caches: one for row/column data and one for disk block caches.

The size of each cache should be configurable, data should be loaded lazily, and the cache managed by an LRU mechanism.

One complication of the block cache is that all data is read through a SequenceFile.Reader which ultimately reads data off of disk via a RPC proxy for ClientProtocol. This would imply that the block caching would have to be pushed down to either the DFSClient or SequenceFile.Reader



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Tom White added a comment - 15/Jan/08 03:45 PM
Here is an initial implementation - feedback would be much appreciated.

BlockFSInputStream reads a FSInputStream in a block-oriented manner, and caches blocks. There's also a BlockMapFile.Reader that uses a BlockFSInputStream to read the MapFile data. HStore uses a BlockMapFile.Reader to read the first HStoreFile - at startup and after compaction. New HStoreFiles produced after memcache flushes are read using a regular reader in order to keep memory use fixed. Currently block caching is configured by the hbase properties hbase.hstore.blockCache.maxSize (defaults to 0 - no cache) and hbase.hstore.blockCache.blockSize (defaults to 64k). (It would be desirable to make caches configurable on a per-column family basis - the current way is just a stop gap.)

I've also had to push details of the block caching implementation up to MapFile.Reader, which is undesirable. The problem is that the streams are opened in the constructor of SequenceFile.Reader, which is called by the constructor of MapFile.Reader, so there is no opportunity to set the final fields blockSize and maxBlockCacheSize on a subclass of MapFile.Reader before the stream is opened. I think the proper solution is to have an explicit open method on SequenceFile.Reader, but I'm not sure about the impact of this since it would be an incompatible change. Perhaps do in conjunction with HADOOP-2604?


stack added a comment - 15/Jan/08 08:04 PM
Patch looks great Tom.

You pass 'length' in the below but its not used:

+    protected FSDataInputStream openFile(FileSystem fs, Path file,
+        int bufferSize, long length) throws IOException {
+      return fs.open(file, bufferSize);

I presume you have plans for it later?

You have confidence in the LruMap class? You don't have unit tests (though these things are hard to test). I ask because though small, sometimes these kinds of classes can prove a little tricky....

Do you have any numbers for how it improves throughput when cached blocks are 'hot'? And you talked of a slight 'cost'. Do you have rough numbers for that too? (Playing on datanode adjusting the size of the CRC blocks, a similar type of blocking to what you have here, there was no discernable difference adjusting sizes).

What do we need to add to make it so its easy to enable/disable this feature on a per-column basis? Currently edits to column config. requires taking column offline. Changing this configuration looks safe-to-do while the column stays on line. Would you agree?


stack added a comment - 15/Jan/08 10:51 PM
Tom: Ignore comment above on LruMap. I just reread it.

Tom White added a comment - 16/Jan/08 01:58 PM

You pass 'length' in the below but its not used:

It is used in the subclass of SequenceFile.Reader by BlockFSInputStream.

Do you have any numbers for how it improves throughput when cached blocks are 'hot'?

I haven't got any numbers yet (working on them), but random reads will suffer in general since a whole 64KB block is retrieved to just read a single key/value. The Bigtable paper talks about reducing the block size to 8KB (see section 7).

What do we need to add to make it so its easy to enable/disable this feature on a per-column basis? Currently edits to column config. requires taking column offline. Changing this configuration looks safe-to-do while the column stays on line. Would you agree?

Agreed. I think that dynamically editing a column descriptor should go in a separate jira issue. For now, I was planning on just adding the new parameters to HColumnDescriptor. Does the version number need bumping in this case?


Tom White added a comment - 16/Jan/08 01:58 PM
A second patch with minimal changes to MapFile.Reader - there is a now a protected open() method for subclasses that wish to defer opening the streams until further initialization has been carried out.

Tom White added a comment - 17/Jan/08 09:45 PM
I'm trying to add a new parameter to HColumnDescriptor and would appreciate a little guidance. Do I need to worry about the version number? Is the order of the serialized fields important? It would be nice to group together the caching related ones if possible, so the block cache parameter would naturally sit next to the inMemory one. Ditto for the Thrift representation - how does it handle versioning? Thanks.

Jim Kellerman added a comment - 17/Jan/08 09:58 PM
Tom,

Yes, we need to start versioning everything that goes out to disk. And if we make an incompatible change, we either need to correct for it on the fly or augment the migration tool (hbase.util.Migrate.java)


Tom White added a comment - 18/Jan/08 05:14 PM
This version (v3) changes the cache to a memory sensitive cache, implemented using SoftReferences (http://commons.apache.org/collections/api-release/org/apache/commons/collections/map/ReferenceMap.html). See HADOOP-2624 for background.

Also, block caching can be enabled on a column-family basis. The size of the block is a system wide setting - this could be adjustable on a per-column basis in the future, if it were deemed necessary.

I'm still looking at a performance comparison.


Tom White added a comment - 18/Jan/08 05:16 PM
New dependency to go in src/contrib/hbase/lib/.

stack added a comment - 18/Jan/08 11:40 PM
(... continuing IRC discussion).

I didn't realize HColumnDescriptor was versioned. It doesn't seem to have been added by either Jim or I. Someone smarter no doubt. So, my comment that this change is incompatible doesn't hold since I see you have code to make it so HCD migrates itself. Nice.

In the below from HStoreFile, blockCacheEnabled method argument is not being passed to the MapFile constructors.

+  public synchronized MapFile.Reader getReader(final FileSystem fs,
+      final Filter bloomFilter, final boolean blockCacheEnabled)
+  throws IOException {
+    
+    if (isReference()) {
+      return new HStoreFile.HalfMapFileReader(fs,
+          getMapFilePath(reference).toString(), conf, 
+          reference.getFileRegion(), reference.getMidkey(), bloomFilter);
+    }
+    return new BloomFilterMapFile.Reader(fs, getMapFilePath().toString(),
+        conf, bloomFilter);
+  }

Out of interest, did you regenerate the thrift or hand-edit it? Changes look right – just wondering.

Default ReferenceMap constructor makes for hard keys and soft values. If value has been let go by the GC, does the corresponding key just stay in the Map?

Otherwise, patch looks great Tom.


Tom White added a comment - 21/Jan/08 11:48 AM - edited

In the below from HStoreFile, blockCacheEnabled method argument is not being passed to the MapFile constructors.

Thanks - this had the effect of never enabling the cache! I've fixed this.

Out of interest, did you regenerate the thrift or hand-edit it? Changes look right - just wondering.

I regenerated using the latest thrift trunk.

Default ReferenceMap constructor makes for hard keys and soft values. If value has been let go by the GC, does the corresponding key just stay in the Map?

No, both the key and the value are removed from the map - I checked the source.

This patch also includes changes to HBase Shell so you can alter a table to enable block caching.


Tom White added a comment - 21/Jan/08 03:08 PM
Fixing the v4 patch which was corrupt.

stack added a comment - 22/Jan/08 01:26 AM
Patch looks good Tom. I changed my mind since IRC this morning. Now I think hbase should align with the parent and not add new features since feature freeze untill after we make the 0.16 branch (Kick me on IRC if you think different).

Tom White added a comment - 22/Jan/08 12:37 PM
I ran some benchmarks of PerformanceEvaluation with and without block caching enabled. The setup was similar to that described in http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation, with three machines on EC2: one running the namenode and HBase master, one running a datanode and a region server, and one running a datanode and the PerformanceEvaluation program.

Number of operations per second:

Experiment Block cache disabled Block cache enabled
sequential reads 119 182
random reads 110 123

I've seen quite a lot of variation in the results of PerformanceEvaluation, so I'm reluctant to read too much into these figures. But I think we can say that the block cache doesn't seem to slow down the system.


Tom White added a comment - 01/Feb/08 02:45 PM
Regenerated patch that applies cleanly.

Hadoop QA added a comment - 01/Feb/08 04:23 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12374558/hadoop-blockcache-v5.patch
against trunk revision 616796.

@author +1. The patch does not contain any @author tags.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs -1. The patch appears to introduce 4 new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests +1. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1721/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1721/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1721/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1721/console

This message is automatically generated.


Hudson added a comment - 02/Feb/08 12:38 PM

Tom White added a comment - 04/Feb/08 03:26 PM
Fix for findbugs warnings. (Note that this has not been integrated yet - the previous comment by Hudson was generated since the collections jar has been checked in.)

Hadoop QA added a comment - 04/Feb/08 05:00 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12374684/hadoop-blockcache-v6.patch
against trunk revision 616796.

@author +1. The patch does not contain any @author tags.

javadoc +1. The javadoc tool did not generate any warning messages.

javac +1. The applied patch does not generate any new javac compiler warnings.

release audit +1. The applied patch does not generate any new release audit warnings.

findbugs +1. The patch does not introduce any new Findbugs warnings.

core tests +1. The patch passed core unit tests.

contrib tests -1. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1736/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1736/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1736/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1736/console

This message is automatically generated.


stack added a comment - 04/Feb/08 05:42 PM
Test passes on my machine. +1 on applying the patch after quick review of v6.

Tom White added a comment - 04/Feb/08 05:58 PM
I've just committed this.

stack added a comment - 04/Feb/08 09:08 PM
Tom, I backed out the hbase component of this patch temporarily. Notion is that we get hbase fixed up over in its new svn home, then we branch. We want the branch to go against hadoop-0.16.0. Once branch is done, we'll put this patch back into hbase TRUNK (With this patch in place, hbase requires post 0.16.0 hadoop).

Jim Kellerman added a comment - 05/Feb/08 12:02 AM
Reopening issue. Patch was not fully backed out of HBase.

Tom White added a comment - 05/Feb/08 04:03 PM
A new patch (v7) with just the HBase parts in it. I successfully ran the HBase unit tests with this patch by using a Hadoop Core 0.16 jar that had been patched with the MapFile and SequenceFile changes in core trunk.

This can be applied to trunk after the branch is created.

Jim/Stack/Bryan: Sorry about the extra work I caused you by committing too early!


stack added a comment - 07/Feb/08 05:50 PM
Committed. Thanks for the patch Tom.

Joonbok Lee added a comment - 30/Jun/08 08:27 AM
I'm trying to apply this patch to Hbase 1.3. v7 patch has only the HBase parts and seems not to be works with previous patch's hdfs part (sequenceFile and mapFile part). v7 patch use not MapFile.Reader( ..., blocksize, maxBlockCacheSize) but MapFile.Reader( ..., blockCacheEnabled) . Could you upload v7 patch file including hdfs parts?

stack added a comment - 01/Jul/08 03:51 AM
Joonbok: I thought this patch had already been applied (to hbase since 0.1 and the hadoop portion to hadoop since 0.16.x?) Is that why its not applying (Could be hard to figure it because a lot has changed in hbase since 02/07/08).