|
Patch looks great Tom.
You pass 'length' in the below but its not used: + protected FSDataInputStream openFile(FileSystem fs, Path file, + int bufferSize, long length) throws IOException { + return fs.open(file, bufferSize); I presume you have plans for it later? You have confidence in the LruMap class? You don't have unit tests (though these things are hard to test). I ask because though small, sometimes these kinds of classes can prove a little tricky.... Do you have any numbers for how it improves throughput when cached blocks are 'hot'? And you talked of a slight 'cost'. Do you have rough numbers for that too? (Playing on datanode adjusting the size of the CRC blocks, a similar type of blocking to what you have here, there was no discernable difference adjusting sizes). What do we need to add to make it so its easy to enable/disable this feature on a per-column basis? Currently edits to column config. requires taking column offline. Changing this configuration looks safe-to-do while the column stays on line. Would you agree?
It is used in the subclass of SequenceFile.Reader by BlockFSInputStream.
I haven't got any numbers yet (working on them), but random reads will suffer in general since a whole 64KB block is retrieved to just read a single key/value. The Bigtable paper talks about reducing the block size to 8KB (see section 7).
Agreed. I think that dynamically editing a column descriptor should go in a separate jira issue. For now, I was planning on just adding the new parameters to HColumnDescriptor. Does the version number need bumping in this case? I'm trying to add a new parameter to HColumnDescriptor and would appreciate a little guidance. Do I need to worry about the version number? Is the order of the serialized fields important? It would be nice to group together the caching related ones if possible, so the block cache parameter would naturally sit next to the inMemory one. Ditto for the Thrift representation - how does it handle versioning? Thanks.
Tom,
Yes, we need to start versioning everything that goes out to disk. And if we make an incompatible change, we either need to correct for it on the fly or augment the migration tool (hbase.util.Migrate.java) This version (v3) changes the cache to a memory sensitive cache, implemented using SoftReferences (http://commons.apache.org/collections/api-release/org/apache/commons/collections/map/ReferenceMap.html
Also, block caching can be enabled on a column-family basis. The size of the block is a system wide setting - this could be adjustable on a per-column basis in the future, if it were deemed necessary. I'm still looking at a performance comparison. (... continuing IRC discussion).
I didn't realize HColumnDescriptor was versioned. It doesn't seem to have been added by either Jim or I. Someone smarter no doubt. So, my comment that this change is incompatible doesn't hold since I see you have code to make it so HCD migrates itself. Nice. In the below from HStoreFile, blockCacheEnabled method argument is not being passed to the MapFile constructors. + public synchronized MapFile.Reader getReader(final FileSystem fs, + final Filter bloomFilter, final boolean blockCacheEnabled) + throws IOException { + + if (isReference()) { + return new HStoreFile.HalfMapFileReader(fs, + getMapFilePath(reference).toString(), conf, + reference.getFileRegion(), reference.getMidkey(), bloomFilter); + } + return new BloomFilterMapFile.Reader(fs, getMapFilePath().toString(), + conf, bloomFilter); + } Out of interest, did you regenerate the thrift or hand-edit it? Changes look right – just wondering. Default ReferenceMap constructor makes for hard keys and soft values. If value has been let go by the GC, does the corresponding key just stay in the Map? Otherwise, patch looks great Tom.
Thanks - this had the effect of never enabling the cache! I've fixed this.
I regenerated using the latest thrift trunk.
No, both the key and the value are removed from the map - I checked the source. This patch also includes changes to HBase Shell so you can alter a table to enable block caching. I ran some benchmarks of PerformanceEvaluation with and without block caching enabled. The setup was similar to that described in http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation
Number of operations per second:
I've seen quite a lot of variation in the results of PerformanceEvaluation, so I'm reluctant to read too much into these figures. But I think we can say that the block cache doesn't seem to slow down the system. -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12374558/hadoop-blockcache-v5.patch against trunk revision 616796. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs -1. The patch appears to introduce 4 new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1721/testReport/ This message is automatically generated. Integrated in Hadoop-trunk #387 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/387/
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12374684/hadoop-blockcache-v6.patch against trunk revision 616796. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1736/testReport/ This message is automatically generated. Tom, I backed out the hbase component of this patch temporarily. Notion is that we get hbase fixed up over in its new svn home, then we branch. We want the branch to go against hadoop-0.16.0. Once branch is done, we'll put this patch back into hbase TRUNK (With this patch in place, hbase requires post 0.16.0 hadoop).
Reopening issue. Patch was not fully backed out of HBase.
A new patch (v7) with just the HBase parts in it. I successfully ran the HBase unit tests with this patch by using a Hadoop Core 0.16 jar that had been patched with the MapFile and SequenceFile changes in core trunk.
This can be applied to trunk after the branch is created. Jim/Stack/Bryan: Sorry about the extra work I caused you by committing too early! I'm trying to apply this patch to Hbase 1.3. v7 patch has only the HBase parts and seems not to be works with previous patch's hdfs part (sequenceFile and mapFile part). v7 patch use not MapFile.Reader( ..., blocksize, maxBlockCacheSize) but MapFile.Reader( ..., blockCacheEnabled) . Could you upload v7 patch file including hdfs parts?
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
BlockFSInputStream reads a FSInputStream in a block-oriented manner, and caches blocks. There's also a BlockMapFile.Reader that uses a BlockFSInputStream to read the MapFile data. HStore uses a BlockMapFile.Reader to read the first HStoreFile - at startup and after compaction. New HStoreFiles produced after memcache flushes are read using a regular reader in order to keep memory use fixed. Currently block caching is configured by the hbase properties hbase.hstore.blockCache.maxSize (defaults to 0 - no cache) and hbase.hstore.blockCache.blockSize (defaults to 64k). (It would be desirable to make caches configurable on a per-column family basis - the current way is just a stop gap.)
I've also had to push details of the block caching implementation up to MapFile.Reader, which is undesirable. The problem is that the streams are opened in the constructor of SequenceFile.Reader, which is called by the constructor of MapFile.Reader, so there is no opportunity to set the final fields blockSize and maxBlockCacheSize on a subclass of MapFile.Reader before the stream is opened. I think the proper solution is to have an explicit open method on SequenceFile.Reader, but I'm not sure about the impact of this since it would be an incompatible change. Perhaps do in conjunction with HADOOP-2604?