Hadoop HDFS / HDFS-7435

PB encoding of block reports is very inefficient

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha, 3.0.0-alpha1
    • Fix Version/s: 2.7.0
    • Component/s: datanode, namenode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Block reports are encoded as a PB repeating long. Repeating fields use an ArrayList with a default capacity of 10. A block report containing tens or hundreds of thousands of longs (3 for each replica) is extremely expensive since the ArrayList must realloc many times. Also, decoding repeating fields will box the primitive longs, which must then be unboxed.
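
      To make the cost concrete, here is a minimal sketch of what decoding a repeated uint64 field into a primitive array looks like with the protobuf-java generated API (message and accessor names are illustrative, not the actual HDFS protos):

        // Hypothetical accessor generated for "repeated uint64 blocks":
        // every primitive long is boxed into a Long, and the backing
        // ArrayList (default capacity 10) reallocs repeatedly as it grows.
        List<Long> boxed = reportProto.getBlocksList();
        long[] longs = new long[boxed.size()];
        for (int i = 0; i < boxed.size(); i++) {
          longs[i] = boxed.get(i);   // unboxed yet again on the way out
        }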

      Attachments

        1. HDFS-7435.patch
          75 kB
          Daryn Sharp
        2. HDFS-7435.patch
          75 kB
          Daryn Sharp
        3. HDFS-7435.patch
          74 kB
          Daryn Sharp
        4. HDFS-7435.patch
          75 kB
          Daryn Sharp
        5. HDFS-7435.patch
          67 kB
          Daryn Sharp
        6. HDFS-7435.patch
          67 kB
          Daryn Sharp
        7. HDFS-7435.patch
          64 kB
          Daryn Sharp
        8. HDFS-7435.patch
          54 kB
          Daryn Sharp
        9. HDFS-7435.002.patch
          63 kB
          Jing Zhao
        10. HDFS-7435.001.patch
          59 kB
          Jing Zhao
        11. HDFS-7435.000.patch
          59 kB
          Jing Zhao
        12. HDFS-7435.patch
          16 kB
          Daryn Sharp

          Activity

            daryn Daryn Sharp added a comment -

            Adds a backwards-compatible encoding that is more efficient for the primitive long array. Both the array re-allocs and the unnecessary boxing/unboxing of the longs are eliminated.

            Datanode detects the namenode version to determine if it should use the current repeating field or a new, more efficient buffer. CodedOutputStream is used to explicitly encode the longs into a buffer with the # of longs followed by varint64-encoded longs.

            Namenode detects if the datanode has sent the old or new format. If it is the new format, it can immediately allocate a long[] of the correct size and decode straight into it, which eliminates both the re-alloc & unboxing overhead.
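
            A minimal sketch of that scheme, assuming the stock protobuf-java CodedOutputStream/CodedInputStream APIs (field plumbing and error handling omitted):

              // Encode (datanode): # of longs, then each long as a varint64.
              ByteString.Output out = ByteString.newOutput();
              CodedOutputStream cos = CodedOutputStream.newInstance(out);
              cos.writeUInt32NoTag(blocks.length);
              for (long b : blocks) {
                cos.writeUInt64NoTag(b);
              }
              cos.flush();
              ByteString buffer = out.toByteString();   // shipped as a bytes field

              // Decode (namenode): size the long[] up front; no ArrayList, no boxing.
              CodedInputStream cis = buffer.newCodedInput();
              long[] decoded = new long[cis.readUInt32()];
              for (int i = 0; i < decoded.length; i++) {
                decoded[i] = cis.readUInt64();
              }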

            jingzhao Jing Zhao added a comment -

            Can we also use some blocked arrays (each block with a size limit) instead of a contiguous array for the full block report? In practice we've seen big full block reports cause NameNode GC pressure. This is a similar idea to HDFS-4879.

            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12683375/HDFS-7435.patch
            against trunk revision 555fa2d.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 1 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8821//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8821//console

            This message is automatically generated.

            daryn Daryn Sharp added a comment -

            We've also seen large block reports aggravate GC, not due to decoded size, but due to the spew of garbage from the re-allocs and (un)boxing. This patch is basically taking us back to pre-PB performance & behavior.

            The block report is ~44,000 blocks/MB, which at our scale is a few megs per report. HDFS-4879 chunked a list of blocks to delete because 25M entries consumed ~400MB. I don't foresee blocks/node approaching that level anytime soon. In that light, it's probably premature to introduce chunking?
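
            (For reference, the ~44,000 figure follows from the decoded representation of 3 longs, 8 bytes each, per replica: 1 MB / 24 bytes ≈ 43,700 blocks/MB.)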

            jingzhao Jing Zhao added a comment -

            I've seen some clusters generating block reports each containing >1,000,000 blocks, i.e., 20~30MB. Considering we're handling block reports from many hundreds of DataNodes, I guess this can still cause issues, but I'm not sure if this is the main cause of the gc in that cluster. But I agree we can make the current optimization first and do chunking separately if necessary.

            sureshms Suresh Srinivas added a comment -

            jingzhao and daryn, I think chunking the block reports into a set of arrays is a good change to avoid possible GC promotion failures due to not finding contiguous memory in the old generation heap. We should consider doing that along with this change to avoid an unnecessary layout change.

            daryn Daryn Sharp added a comment -

            While I agree it's a questionably nice-to-have feature - see below - decoding the block report into a contiguous or non-contiguous array is an implementation detail that can be changed in another jira w/o updating the layout version.

            This patch simply undoes the damage caused by moving from Writables to PBs. I changed it internally from long[] -> dynamically growing ArrayList<Long> -> PB -> dynamically growing ArrayList<Long> -> long[]; to simply fixed long[] -> PB -> fixed long[]. Further changes to segment the list in ways not previously done are feature creep.

            If 20-30MB is going to cause a promotion failure in a namenode servicing hundreds of millions of blocks - it's already game over. 2.x easily generates over 1GB of garbage/sec at a mere ~20k ops/sec. That's an average of ~54KB/request. This is why I'm submitting GC-friendly changes.

            sureshms Suresh Srinivas added a comment - edited

            While I agree it's a questionably nice-to-have feature

            Not sure why you think it is a questionably nice-to-have feature...

            If 20-30MB is going to cause a promotion failure in a namenode servicing hundreds of millions of blocks - it's already game over. 2.x easily generates over 1GB of garbage/sec at a mere ~20k ops/sec.

            Just so that we are on the same page, Java arrays require contiguous space in memory. In many installs, when the namenode becomes unresponsive and datanodes end up sending block reports, these block reports get promoted to the old generation (because the namenode is processing them slowly). Since the old generation may be fragmented, promotions can fail when large arrays need to be promoted.

            That said, I am fine doing it in a subsequent jira. It will end up touching the same parts or replacing the code that you are adding.

            daryn Daryn Sharp added a comment -

            We're definitely on the same page. I meant questionably in the sense that the garbage generation rate is so high that CMS (you mention fragmentation) will be slow, bloated with free lists, and not keeping up. Now granted, a 20M allocation is likely to be prematurely tenured - along with the IPC protobuf containing the large report. One solution/workaround to both is reducing the size of the individual BR via multiple storages. The storages don't have to be individual drives but just subdirs.

            Segmenting shouldn't be an outright replacement. The decode will emit a long[][], which requires updating StorageBlockReport and BlockListAsLongs. Similar changes to the datanode, although not required for the namenode changes, will be more complex. I already considered segmenting. I found the benefit, weighed against the complexity and time, to be much lower than for improvements to other subsystems.

            jingzhao Jing Zhao added a comment -

            Uploading a demo patch (based on Daryn's patch) for chunking. Still need to do more testing.

            hitliuyi Yi Liu added a comment - edited

            Hi guys, I think this is a good improvement.
            I also found a similar issue related to this while doing Hadoop RPC-related optimization in my local branch.
            As we all know, the block reports from DNs may become very large in a big cluster, and there is a chance of causing a full GC if there is not enough contiguous space in the old generation.
            We try to reuse available connections for RPC calls, but when we process each rpc in the same connection, we allocate a fresh heap byte buffer to store the rpc bytes data. The rpc message may be very large, so it will cause the same issue.
            My thought is to reuse the data buffer in the connection; I will open a new JIRA to track it.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12683743/HDFS-7435.000.patch
            against trunk revision a655973.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 10 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.namenode.TestStartup
            org.apache.hadoop.hdfs.server.namenode.TestFileLimit
            org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
            org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
            org.apache.hadoop.hdfs.TestDecommission
            org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
            org.apache.hadoop.fs.TestSymlinkHdfsFileContext
            org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
            org.apache.hadoop.hdfs.server.namenode.TestFSEditLogLoader
            org.apache.hadoop.hdfs.TestSetrepDecreasing
            org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyBlockManagement
            org.apache.hadoop.hdfs.server.namenode.TestProcessCorruptBlocks
            org.apache.hadoop.hdfs.TestReplication
            org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
            org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot
            org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks
            org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
            org.apache.hadoop.hdfs.server.namenode.TestHostsFiles
            org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
            org.apache.hadoop.hdfs.server.namenode.TestINodeFile
            org.apache.hadoop.hdfs.TestEncryptedTransfer

            The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages
            org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing
            org.apache.hadoop.hdfs.server.namenode.ha.TestHAAppend
            org.apache.hadoop.hdfs.server.namenode.TestFsck

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8842//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8842//console

            This message is automatically generated.

            hitliuyi Yi Liu added a comment -

            I opened HADOOP-11339 to reuse the buffer for Hadoop RPC as stated in my comment above.

            daryn Daryn Sharp added a comment -

            Upon a quick scan, it appears that the chunk size is encoded in the buffer? If yes, I think each side should be able to choose an appropriate chunk size - as an implementation detail - which is not leaked into the PB encoding. If we can agree on that point, can we move the segmentation to a separate jira?

            jingzhao Jing Zhao added a comment -

            In the current demo patch the max chunk size is not encoded in the buffer. It is currently simply determined by the DataNode based on configuration. The segmentation is done by the DataNode, and the NameNode does not need to know the max chunk size. For simplicity each chunk still follows the same format as the original long[] in BlockListAsLongs (i.e., it still encodes the number of finalized blocks and the number of uc replicas in the first two elements).

            I guess letting the DataNode be the only side doing segmentation can avoid the NameNode still allocating a big contiguous array before chunking. Thus I have to change the optional bytes blocksBuffer into repeated bytes blocksBuffers. Maybe we can use repeated bytes blocksBuffers here but assume the number of buffers is always 1, then move the real segmentation change into a separate jira?
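
            A rough sketch of the DataNode-side segmentation described here (illustrative only: segmentIntoChunks is a hypothetical helper, and addAllBlocksBuffers is the accessor protoc would generate for a repeated bytes blocksBuffers field):

              // Each chunk is encoded independently in the legacy layout, so it
              // leads with its own finalized/uc counts and the NameNode never
              // needs to know the DataNode's chunk size.
              List<ByteString> buffers = new ArrayList<ByteString>();
              for (long[] chunk : segmentIntoChunks(blockList, maxChunkSize)) {
                ByteString.Output out = ByteString.newOutput();
                CodedOutputStream cos = CodedOutputStream.newInstance(out);
                for (long l : chunk) {
                  cos.writeUInt64NoTag(l);
                }
                cos.flush();
                buffers.add(out.toByteString());
              }
              reportBuilder.addAllBlocksBuffers(buffers);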

            jingzhao Jing Zhao added a comment -

            Fix a bug in the demo patch.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12683918/HDFS-7435.001.patch
            against trunk revision 8f1454c.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 10 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
            org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

            The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.TestDFSClientRetries

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8851//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8851//console

            This message is automatically generated.

            daryn Daryn Sharp added a comment -

            Apologies if I wasn't clear. My original patch added a blocksBuffer PB field with an encoding of size : long[]. The demo patch for chunking changes this new field into a repeating field. The resulting encoding is chunk-size : long[], chunk-size : long[], etc.

            I'm fine with chunking, but I don't think the tail should wag the dog. The in-memory representation of the blocks list is merely an implementation detail that should not influence the PB encoding. I.e., leave blocksBuffer as a non-repeating field, and let the NN chunk the block list however it sees fit.

            sureshms Suresh Srinivas added a comment -

            The in-memory representation of the blocks list is merely an implementation detail that should not influence the PB encoding. I.e., leave blocksBuffer as a non-repeating field, and let the NN chunk the block list however it sees fit.

            I think the chunked buffers should be sent all the way from the DataNode to the NameNode. The NameNode can store it in a different way that fits its needs (though if it is chunked, I do not see the NameNode storing it differently). This is important because the problem of large contiguous arrays is also a problem for the DataNode, since there are deployments that use 60 disks in a single node with more than 10 million blocks in a single DataNode.

            kihwal Kihwal Lee added a comment -

            This is important because the problem of large contiguous arrays is also a problem for the DataNode, since there are deployments that use 60 disks in a single node with more than 10 million blocks in a single DataNode.

            Just to be clear, the datanode and namenode require the same amount of contiguous memory today for the backing arrays of the ArrayLists used internally by protobuf. So this patch is not making the situation any worse. But I agree that chunking is a good idea and is better done along with this feature.

            daryn Daryn Sharp added a comment -

            I agree that it makes sense for both the DN and NN to chunk the block list for their internal data structures. All I'm saying is that the wire protocol may not need to be built around that implementation detail - but I think a middle ground is { block-count, chunk-count, blocks[], chunk-count, blocks[], ... }. However I think it can be done in a simpler and less invasive fashion. I'll toss another patch up this afternoon.

            On a side note: we're having some DN OOM issues too during block reports + heavy load. In your use case, I'm presuming each disk is a storage. If yes, each unencoded storage report will be ~4MB, ~1.5MB encoded. The DN heap must be >64GB, easily more for HA, so a handful of MB pales in comparison to how much memory the DN wastes building the reports. Again, there are default-capacity ArrayLists used to build the finalized & UC lists! I'm going to make minimal changes to reduce the gross inefficiency here too.

            sureshms Suresh Srinivas added a comment -

            kihwal, agree with you. We are on the same page.

            daryn, some comments:

            All I'm saying is that the wire protocol may not need to be built around that implementation detail - but I think a middle ground is { block-count, chunk-count, blocks[], chunk-count, blocks[], ... }

            Have you looked at Jing's patch? Can you post comments on why it cannot be enhanced to add chunk-count? I believe it's close to being ready with additional changes. Is there a need to do a brand new patch?

            In your use case, I'm presuming each disk is a storage. If yes, each unencoded storage report will be ~4MB, ~1.5MB encoded. The DN heap must be >64GB, easily more for HA, so a handful of MB pales in comparison to how much memory the DN wastes building the reports.

            I do not follow these numbers well. BTW, for very large nodes (< 200 TB), Datanodes can be run with just 8G of heap space. Not sure about the > 64GB number you are mentioning.

            daryn Daryn Sharp added a comment -

            Apologies for the delay; I've been dealing with production issues, including block reports, which are becoming a very big issue.

            I've studied the way PB is implemented, and I'm not sure fragmenting the buffer will add value. A proper encoding (mine is buggy) will not allocate a full buffer but uses a tree to hold fragments of the byte string. Decoding the PB will use the full byte array of the PB as the backing store for slices (a ref to the full array + offset/length). We've already paid the price for a large allocation of the full PB, carried it around in the Call object, etc., so extracting the field is essentially "free". Whether it's one field or more is irrelevant.

            I'm trying to performance test a patch that internally segments the BlockListAsLongs and correctly outputs the byte buffer.

            jingzhao Jing Zhao added a comment -

            Rebase the patch.

            I'm trying to performance test a patch that internally segments the BlockListAsLongs and correctly outputs the byte buffer.

            daryn, do you still plan to post your updated patch and performance test results? I do not know the details of your new proposal here. But it sounds like you will also segment the BlockListAsLongs?

            sureshms Suresh Srinivas added a comment -

            jingzhao, if you have a patch, can you please post it? In clusters with a lot of small files and clusters with high-density nodes, we have already seen issues related to large block reports. We can improve upon your patch in subsequent jiras.

            jingzhao Jing Zhao added a comment -

            Yeah, the 002 patch is the new rebased patch.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12699754/HDFS-7435.002.patch
            against trunk revision d49ae72.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 11 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.TestMiniDFSCluster
            org.apache.hadoop.hdfs.tools.TestDFSHAAdminMiniCluster
            org.apache.hadoop.hdfs.TestBalancerBandwidth
            org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9622//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9622//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12699754/HDFS-7435.002.patch
            against trunk revision c0d9b93.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 11 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.web.TestWebHDFS
            org.apache.hadoop.hdfs.tools.TestDFSHAAdminMiniCluster
            org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9625//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9625//console

            This message is automatically generated.

            daryn Daryn Sharp added a comment -

            Let me get back to this patch this afternoon. I'll post perf numbers too.

            sureshms Suresh Srinivas added a comment -

            daryn, can you please comment on jingzhao's patch and why it does not address the issue? It would be great to get this done early.

            daryn Daryn Sharp added a comment -

            Just about to post a zero-copy block report impl via iterators, so we don't need to worry about segmenting at all.

            daryn Daryn Sharp added a comment -

            Do I have the luxury of bumping the DN's minimum NN version? That would greatly simplify the implementation. It's easy for the NN to use the presence of the protobuf fields to determine old vs. new. However, the prior patches illustrate that it's not so easy for the DN to auto-detect.

            I believe the standard upgrade procedure is: upgrade the NN, then rolling-upgrade the DNs. Per above, the upgraded NN supports both old and new reports from DNs. The only scenario in which a "problem" can occur is when the cluster is fully or partially upgraded and the NN is downgraded. The new DNs won't be able to communicate with the old NN, hence why I'd like to bump the minimum version, so the DN doesn't continue to send block reports that appear to be empty to the old NN. I'd argue that if the NN is downgraded, there's going to be downtime, so you might as well roll back the DNs too.

            Thoughts?

            jingzhao Jing Zhao added a comment -

            Thanks for sharing the thoughts, daryn. Why not post your current patch first so that we can also have a better understanding of why bumping the DN's min NN version is necessary?

            daryn Daryn Sharp added a comment -

            I don't need to bump the DN's min NN version. I leveraged the auto-detection I did before.

            So here's the deal. As wasteful as PBs can be in some cases, they are smart in other cases about avoiding unnecessary copying. I've leveraged those APIs to their maximum. I've also optimized BlockListAsLongs to remove intermediate encodings.

            Instead of a pre-allocated array, a ByteString.Output stream can be used with a custom-specified buffer size. Internally the ByteString is building up a balanced tree of rope buffers of the given size. Extracting the buffer via Output#toByteString doesn't really copy anything. So on the DN we have cheaply built up a blocks buffer of ByteString that is internally segmented.

            I relented to sending the blocks buffer in a repeating field. ByteString provides no direct API for internally accessing its roped strings. However, we can slice it up with ByteString#substring. It prefers to create bounded instances (offset+len) much like String did in earlier JDKs. Alas, a wasted instantiation. However a substring that aligns with an entire rope buffer will get the actual buffer reference – not wrapped with a bounded instance. So I slice it up that way.

            On the NN, if the new repeating field is not present, it decodes as old-style format. Otherwise it uses ByteString#concat to reassemble the blocks buffer fragments. Concat doesn't copy, but again builds a balanced tree for later decoding.

            The best part is the encoding/decoding. BlockListAsLongs directly encodes the Replicas into a ByteString, not into a wasted intermediate long[]. To support the unlikely and surely short-lived case of a new DN reporting to an old NN, the PB layer will decode the ByteString and re-encode into old-style longs. Not efficient, but it's the least ugly way to handle something that will probably never happen.

            BlockListAsLongs doesn’t decode the buffer into a wasted and intermediate long[]. Instead, the iterator decodes on-demand.

            I also changed the format of the encoded buffer. Might as well while the patient is on the table. It's documented in the code, but finalized and uc blocks are no longer treated specially. The old layout - finalized and uc counts, followed by 3-long finalized blocks, a 3-long delimiter, and 4-long uc blocks - is now just a total block count followed by 4-long blocks. Every block encodes its ReplicaState. One compelling reason is the capability to encode additional bits into the state field. One anticipated use is encoding the pinned-block status.
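
            Pulling those pieces together, a condensed sketch of the approach (a simplification, not the actual patch; the Replica accessors and ReplicaState#getValue are assumed from the existing HDFS code):

              // DN side: encode replicas straight into a rope-backed ByteString,
              // leading with the total block count, then 4 longs per block.
              ByteString.Output out = ByteString.newOutput(64 * 1024);
              CodedOutputStream cos = CodedOutputStream.newInstance(out);
              cos.writeUInt64NoTag(replicas.size());            // total block count
              for (Replica r : replicas) {
                cos.writeUInt64NoTag(r.getBlockId());
                cos.writeUInt64NoTag(r.getBytesOnDisk());
                cos.writeUInt64NoTag(r.getGenerationStamp());
                cos.writeUInt64NoTag(r.getState().getValue());  // room for extra bits
              }
              cos.flush();
              ByteString blocksBuf = out.toByteString();        // balanced tree, no copy

              // NN side: reassemble the repeated bytes field without copying,
              // then decode blocks on demand from an iterator.
              ByteString buffer = ByteString.EMPTY;
              for (ByteString chunk : proto.getBlocksBuffersList()) {
                buffer = buffer.concat(chunk);                  // tree concat, no copy
              }
              CodedInputStream cis = buffer.newCodedInput();    // iterator reads from here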

            Phew. I need to write some new tests to verify old/new compatibility. I’ll also manually test on a perf cluster.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12700906/HDFS-7435.patch
            against trunk revision caa42ad.

            -1 patch. The patch command could not apply the patch.

            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9668//console

            This message is automatically generated.

            clamb Charles Lamb added a comment -

            @daryn,

            This looks really good. I like the new approach and your current patch does a pre-emptive strike on several of the comments I was going to make on the .002 patch.

            I really only have nits.

            The patch needs to be rebased. There was one .rej when I applied it (obviously I worked past that for my review).

            BlockListAsLongs.java:

            BlockListAsLongs(Collection) needs an @param for the javadoc.

            In #BlockListAsLongs(Collection), the ReplicaState is being written as a varint64. I realize it's a varint, but since it's really only a single byte in the implementation, it seems a little heavy-handed to write it to the cos as a varint64. I also realize that it will need to be a long on the way back out for the uc long[]. If you don't want to change it from being a varint64 in the cos, then perhaps just add a comment saying that you know it's a byte (actually an int) in the impl, but for consistency you're using a varint64?

            Since you throw UnsupportedOperationException from multiple #remove methods, you might want to add the class name to the message. e.g. "Sorry. remove not implemented for BlockReportListIterator". Along a similar vein, would it be appropriate to add a message to BlockReportReplica.getVisibleLength, getStorageUuid, and isOnTransientStorage's UnsupportedOperationException?

            BlockReportTestBase.java:

            getBlockReports has one line that exceeds the 80 char width.

            DatanodeProtocolClientSideTranslatorPB.java:

            the import of NameNodeLayoutVersion is unused.

            the decl of useBlocksBuffer busts the 80 char width.

            DatanodeProtocolServerSideTranslatorPB.java:

            import ByteString is unused.

            FsDatasetImpl.java:

            in #getBlockReports, the line under case RUR busts the 80 char limit.

            NameNodeLayoutVersion.java:

            Perhaps s/Protobuf optimized/Optimized protobuf/

            NNThroughputBenchmark.java:

            Thanks for fixing the formatting in here.

            TestBlockHasMultipleReplicasOnSameDN.java:

            blocks.add(... busts the 80 char limit.

            daryn Daryn Sharp added a comment -

            Thanks for the review, Charles. I think I've addressed your comments. I mildly disagree with overly verbose messages for UnsupportedOperationExceptions since the JDK rarely uses messages and the class & method are in the stack trace.

            The ReplicaState, although a varint64, is actually just a byte on the wire. Using a varint64 just allows for future use of the bits.

            I'll note that I must backpedal a bit on some of my claims. The ByteString.Output creates new overflow buffers with larger capacity. That means substring returns bounded instances, which is not that big a deal unless we assume the internal implementation of PBs. Even then, it returns bounded instances when it shouldn't. Multiple buffers are required before it might return a non-bounded buffer. Therefore I optimized the single small buffer case to not substring.
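
            For the curious, the single-buffer happy path looks roughly like this; a minimal sketch assuming the ByteString.Output and CodedOutputStream APIs, with an illustrative 64K initial capacity:

              import com.google.protobuf.ByteString;
              import com.google.protobuf.CodedOutputStream;
              import java.io.IOException;

              public class ByteStringOutputDemo {
                public static void main(String[] args) throws IOException {
                  // Encode a handful of longs into a ByteString. If everything
                  // fits in the initial buffer, no overflow buffers are created
                  // and (per the above) no substring should be needed.
                  ByteString.Output out = ByteString.newOutput(64 * 1024);
                  CodedOutputStream cos = CodedOutputStream.newInstance(out);
                  cos.writeUInt64NoTag(4);  // number of longs to follow
                  for (long v : new long[] {1L, 2L, 3L, 4L}) {
                    cos.writeUInt64NoTag(v);
                  }
                  cos.flush();
                  ByteString buf = out.toByteString();
                  System.out.println(buf.size());  // 5: each small varint is 1 byte
                }
              }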

            I think the whole fragmented over-the-wire buffer implementation is over-engineered. The average encoded block is ~15 bytes. The majority of clusters aren't likely to exceed one buffer (I reduced it from 1MB to 64K) per storage, and if they do, multiple little allocations are more work for GC. Removing the intermediate long[] conversions is the real problem. But I digress.

            I added comprehensive tests for client auto-detection of NN support; encoding & decoding as buffers and longs.

            clamb Charles Lamb added a comment -

            I mildly disagree with overly verbose messages for UnsupportedOperationExceptions since the JDK rarely uses messages and the class & method are in the stack trace.

            Yeah, I figured there was a reason for not having a message in the UOE, so it won't cause me any heartburn if you don't put them in.

            I think the whole fragmented over-the-wire buffer implementation is over-engineered.

            Oh, I kind of liked it.

            I'll try to take a look at the new patch soon. Thanks for the reply.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12701196/HDFS-7435.patch
            against trunk revision bfbf076.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 10 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            -1 findbugs. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
            org.apache.hadoop.hdfs.server.namenode.TestFSImage

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9673//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9673//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9673//console

            This message is automatically generated.

            daryn Daryn Sharp added a comment -

            Fixed findbugs by adding equals and reverting the switch. Fixed the fsimage test that used a checked-in image by starting the minicluster with upgrade. Fixed the oev test by compensating for the converted output having the latest version so we don't have to keep updating the checked-in binary.

            daryn Daryn Sharp added a comment -

            sureshms, jingzhao, could you please review this morning so we may hopefully incorporate this into our builds?

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12701376/HDFS-7435.patch
            against trunk revision a979f3b.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 12 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            -1 findbugs. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9683//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9683//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9683//console

            This message is automatically generated.

            daryn Daryn Sharp added a comment -

            Would have been nice if findbugs told me I needed equals and hashCode at the same time. :/

            jingzhao Jing Zhao added a comment -

            Thanks for updating the patch, Daryn. I will review the patch.

            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12701436/HDFS-7435.patch
            against trunk revision 01a1621.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 12 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9686//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9686//console

            This message is automatically generated.

            clamb Charles Lamb added a comment -

            Hi @daryn,

            The new patch looks pretty good to me. Just a few nits.

            FsDatasetImpl still has one line that exceeds the 80-char limit, and there are a couple of unused imports in the new test TestBlockListAsLongs. Also in that test, IWBNI you could use the specific Mockito static imports that are needed rather than the * import.

            jingzhao Jing Zhao added a comment -

            Thanks for working on this, Daryn. The patch looks good to me in general. Some comments below:

            1. Please add javadoc for BlockListAsLongs.
            2. I think we'd better not include the replica state for finalized replicas. Though it is encoded using a varint, considering that most of the blocks in the report are finalized, this can still increase the total size of the report. How about not including the finalized state and writing an extra number in the header to indicate the total number of finalized replicas?
            3. For BlockReportLongIterator, it would be cleaner to keep only one of iter and currentBlockIndex.
            4. Now the iterator becomes an internal field of a BlockListAsLongs. I guess the main motivation here is to support both the new and old encodings while avoiding having both a ByteString and a long[] inside of BlockListAsLongs? But in the meantime, this means there can be only one user of iterator() in the whole lifetime of a BlockListAsLongs? This usage pattern is true in the current code, but is it good for BlockListAsLongs as a pure data structure?
            5. Also in getBlockReportIterator, if iter is null, we're actually creating a new BlockListAsLongs instance and returning its iterator. Though this may not be hit by the normal code path, it is not very clean.
            6. To address the above two comments, maybe what we can do is:
              1) declare both the ByteString and the long[] in BlockListAsLongs and build the iterator on top, or
              2) keep the old BlockListAsLongs class (but annotate it as deprecated) and use it for handling the old-style encoding. In this way we can have a cleaner new BlockListAsLongs, and we can delete the old one in the future.
            daryn Daryn Sharp added a comment -

            Thanks Jing & Charles. Addressing comments, will post later today.

            daryn Daryn Sharp added a comment -
            1. Added javadoc.
            2. There are compelling reasons I must include the state. See below.
            3. iter.hasNext is insufficient to replace currentBlockIndex, i.e. the iter running dry doesn't mean all the expected blocks were read.
            4. Yes, the iterator is/was one-shot for decoding. I added multi-iterator support although I think it's over-engineering for a currently unneeded use case.
            5. Gone by virtue of #4.
            6. Better yet: use an abstract factory pattern

            Kihwal expressed offline concern at bumping the layout version. I added a capabilities long to the version request. It's currently expected to act as a bitmask, but we can change that later. If we find that compression of block reports, per Colin, is viable, then we can add another bit here so the DN can conditionally compress.
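
            A minimal sketch of the bitmask idea follows; the enum shape mirrors the Capability snippet quoted in Jing's comment below, but the standalone check is illustrative, not the patch itself:

              // Sketch of a capability bitmask, modeled on the quoted patch code.
              enum Capability {
                UNKNOWN,                       // ordinal 0 -> mask 0
                STORAGE_BLOCK_REPORT_BUFFERS;  // ordinal 1 -> mask 1L << 0

                private final long mask;

                Capability() {
                  int bits = ordinal() - 1;
                  mask = (bits < 0) ? 0 : (1L << bits);
                }

                long getMask() { return mask; }
              }

              class CapabilityDemo {
                // The DN would test the capabilities long advertised by the NN.
                static boolean isSupported(long nnCapabilities, Capability c) {
                  return (nnCapabilities & c.getMask()) != 0;
                }

                public static void main(String[] args) {
                  long advertised = Capability.STORAGE_BLOCK_REPORT_BUFFERS.getMask();
                  System.out.println(isSupported(advertised,
                      Capability.STORAGE_BLOCK_REPORT_BUFFERS));  // true
                }
              }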

            Regarding always including the 1-byte state. The format, among other reasons, was intended as a precursor to improved block report creation on the DN. Currently, the DN does 3 passes over the blocks: one to filter finalized vs uc; a second for BlockListAsLongs to encode into a List; a third to transfer into the PB. I initially reduced it to 2 passes by removing the filtering, but intended to reduce it to 1 in the future, which I did last night for this patch. 1-pass precludes the ability to build a buffer of blocks with varying fields.
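
            Roughly what a 1-pass encode looks like: a minimal sketch assuming the per-replica layout described in this thread (id, length, genstamp, then the state varint), with a hypothetical Replica holder standing in for the DN's replica map:

              import com.google.protobuf.ByteString;
              import com.google.protobuf.CodedOutputStream;
              import java.io.IOException;

              class OnePassEncodeDemo {
                // Hypothetical replica view; the real code walks the DN's replicas.
                static class Replica {
                  final long id, numBytes, genStamp;
                  final int state;  // small enum ordinal, one varint byte
                  Replica(long id, long numBytes, long genStamp, int state) {
                    this.id = id; this.numBytes = numBytes;
                    this.genStamp = genStamp; this.state = state;
                  }
                }

                // One pass: every replica, finalized or not, is appended as it
                // is seen, which is why a fixed per-replica layout matters.
                static ByteString encode(Iterable<Replica> replicas)
                    throws IOException {
                  ByteString.Output out = ByteString.newOutput(64 * 1024);
                  CodedOutputStream cos = CodedOutputStream.newInstance(out);
                  for (Replica r : replicas) {
                    cos.writeUInt64NoTag(r.id);
                    cos.writeUInt64NoTag(r.numBytes);
                    cos.writeUInt64NoTag(r.genStamp);
                    cos.writeUInt64NoTag(r.state);  // present even for finalized
                  }
                  cos.flush();
                  return out.toByteString();
                }
              }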

            The extra bits in the 1-byte state can be used for other purposes. Kihwal wants to pass back preferred block "stickiness" via this field. I thought about robbing the upper bits of the block size. However, that increases the size of that varint by many bytes.

            As performance-conscious as we are with our clusters, which is why I'm doing this, a few tens of KB per storage report isn't even a concern. The size of the reports is not the problem so much as the many hundreds of ms of processing time and the garbage generation. This patch will greatly reduce the garbage on both the NN and DN, reduce the expense of encoding/decoding, and add flexibility for the cost of 1 byte. If compression is viable, then it becomes a moot point.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12702905/HDFS-7435.patch
            against trunk revision 95bfd08.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 12 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.datanode.TestReadOnlySharedStorage
            org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
            org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer
            org.apache.hadoop.hdfs.TestSetrepIncreasing
            org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
            org.apache.hadoop.hdfs.TestDecommission
            org.apache.hadoop.hdfs.server.balancer.TestBalancer
            org.apache.hadoop.hdfs.server.datanode.TestSimulatedFSDataset

            The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes
            org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager
            org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9776//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9776//console

            This message is automatically generated.

            daryn Daryn Sharp added a comment -

            Test failed because I stubbed the simulated dataset to not return reports... Fixed.

            jingzhao, please review. We'd like to add this to our internal builds to help alleviate BR processing issues. We also want to leverage this change to speed up rolling upgrades by dumping/reading the encoded BR to/from disk, which this makes trivial to do.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12703930/HDFS-7435.patch
            against trunk revision 30c428a.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 12 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

            The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager
            org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9838//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9838//console

            This message is automatically generated.

            daryn Daryn Sharp added a comment -

            Tests are passing for me. The grumbling about edit log corruption should have nothing to do with this patch.

            jingzhao Jing Zhao added a comment -

            Thanks for updating the patch, Daryn. The latest patch looks good to me, and I think the new capabilities field in NamespaceInfo is a good idea. Some minors:

            1. Need to fix javadoc for decodeBuffer, encode, and getBlocksBuffer in BlockListAsLongs.
            2. Here "older" I guess you mean the current version compared with a future version. Maybe we should make it more clear in the comment.
                  // reserve upper bits for future use. decoding masks off these bits to
                  // allow compatibility for older namenodes
                  private static long NUM_BYTES_MASK = (-1L) >>> (64 - 48);
                  private static long REPLICA_STATE_MASK = (-1L) >>> (64 - 4);
              
            3. Looks like the following code can be simplified and we actually do not need "isSupported"?
              +    Capability(boolean isSupported) {
              +      int bits = ordinal() - 1;
              +      mask = (bits < 0) ? 0 : (1L << bits);
              +      if (isSupported) {
              +        CAPABILITIES_SUPPORTED |= mask;
              +      }
              +    }
              
            daryn Daryn Sharp added a comment -

            Yes, "older" is with relation to the future NN. I tried to clarify it in this patch. Updated the javadocs. The isSupported is intended for being able to flip off unsupported/deprecated capabilities so they aren't advertised to the DN.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12704033/HDFS-7435.patch
            against trunk revision 7a346bc.

            -1 patch. Trunk compilation may be broken.

            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9843//console

            This message is automatically generated.

            jingzhao Jing Zhao added a comment -

            nit-pick:

            1. s/blocksBuffer/blocksBuf/ in the javadoc
                /**
                 * Prepare an instance to in-place decode the given ByteString buffer
                 * @param numBlocks - blocks in the buffer
                 * @param blocksBuffer - ByteString encoded varints
                 * @return BlockListAsLongs
                 */
                public static BlockListAsLongs decodeBuffer(final int numBlocks,
                    final ByteString blocksBuf) {
                  return new BufferDecoder(numBlocks, blocksBuf);
                }
              

            +1 after fixing it.

            daryn Daryn Sharp added a comment -

            Done! Thanks Jing.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12704194/HDFS-7435.patch
            against trunk revision 06ce1d9.

            -1 patch. Trunk compilation may be broken.

            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9853//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12704194/HDFS-7435.patch
            against trunk revision 387f271.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 12 new or modified test files.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 javadoc. There were no new javadoc warning messages.

            +1 eclipse:eclipse. The patch built with eclipse:eclipse.

            +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA

            The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:

            org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

            Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9874//testReport/
            Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9874//console

            This message is automatically generated.

            daryn Daryn Sharp added a comment -

            The retry cache test failure is HDFS-7524. I cannot reproduce the DN manager timeout, and it's impossible to tell from pre-commit why it failed (no logs). The test normally runs in ~10s, but took ~3min during a prior successful run, so maybe the build machine is overwhelmed and on the edge of timing out.

            kihwal Kihwal Lee added a comment -

            +1 I also ran TestDatanodeManager multiple times on my machine with this patch without any failure.

            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-trunk-Commit #7318 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7318/)
            HDFS-7435. PB encoding of block reports is very inefficient. Contributed by Daryn Sharp. (kihwal: rev d324164a51a43d72c02567248bd9f0f12b244a40)

            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeRegistration.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDnRespectsBlockReportSplitThreshold.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/BlockReportTestBase.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/StorageBlockReport.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
            kihwal Kihwal Lee added a comment -

            Thanks everyone for reviews and Daryn for working on the patch.

            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #132 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/132/)
            HDFS-7435. PB encoding of block reports is very inefficient. Contributed by Daryn Sharp. (kihwal: rev d324164a51a43d72c02567248bd9f0f12b244a40)

            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/BlockReportTestBase.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDnRespectsBlockReportSplitThreshold.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/StorageBlockReport.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeRegistration.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
            hudson Hudson added a comment -

            SUCCESS: Integrated in Hadoop-Yarn-trunk #866 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/866/)
            HDFS-7435. PB encoding of block reports is very inefficient. Contributed by Daryn Sharp. (kihwal: rev d324164a51a43d72c02567248bd9f0f12b244a40)

            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/StorageBlockReport.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/BlockReportTestBase.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDnRespectsBlockReportSplitThreshold.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeRegistration.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Hdfs-trunk #2064 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2064/)
            HDFS-7435. PB encoding of block reports is very inefficient. Contributed by Daryn Sharp. (kihwal: rev d324164a51a43d72c02567248bd9f0f12b244a40)

            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeRegistration.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDnRespectsBlockReportSplitThreshold.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/StorageBlockReport.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/BlockReportTestBase.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #123 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/123/)
            HDFS-7435. PB encoding of block reports is very inefficient. Contributed by Daryn Sharp. (kihwal: rev d324164a51a43d72c02567248bd9f0f12b244a40)

            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDnRespectsBlockReportSplitThreshold.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/BlockReportTestBase.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/StorageBlockReport.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeRegistration.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #132 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/132/)
            HDFS-7435. PB encoding of block reports is very inefficient. Contributed by Daryn Sharp. (kihwal: rev d324164a51a43d72c02567248bd9f0f12b244a40)

            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/BlockReportTestBase.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/StorageBlockReport.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeRegistration.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDnRespectsBlockReportSplitThreshold.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
            hudson Hudson added a comment -

            SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2082 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2082/)
            HDFS-7435. PB encoding of block reports is very inefficient. Contributed by Daryn Sharp. (kihwal: rev d324164a51a43d72c02567248bd9f0f12b244a40)

            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/BlockReportTestBase.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeRegistration.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImage.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDnRespectsBlockReportSplitThreshold.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
            • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/StorageBlockReport.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
            • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java
            • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
            liuml07 Mingliang Liu added a comment -

            In the NNThroughputBenchmark class, the change is:

                 void formBlockReport() {
                   // fill remaining slots with blocks that do not exist
            -      for(int idx = blocks.size()-1; idx >= nrBlocks; idx--)
            -        blocks.set(idx, new Block(blocks.size() - idx, 0, 0));
            -      blockReportList = new BlockListAsLongs(blocks).getBlockListAsLongs();
            +      for (int idx = blocks.size()-1; idx >= nrBlocks; idx--) {
            +        Block block = new Block(blocks.size() - idx, 0, 0);
            +        blocks.set(idx, new BlockReportReplica(block));
            +      }
            +      blockReportList = BlockListAsLongs.EMPTY;
            

            Why doesn't the last line build blockReportList from the blocks field, as follows?

                  blockReportList = BlockListAsLongs.encode(blocks);
            

            I may need more context to understand this patch; my immediate goal is to work on NNThroughputBenchmark. Thanks.
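
            For context, the corrected method would look roughly like this — a sketch based on the diff above (blocks and nrBlocks are the surrounding NNThroughputBenchmark fields, and BlockReportReplica is the wrapper type introduced by this patch):

                 void formBlockReport() {
                   // fill remaining slots with blocks that do not exist
                   for (int idx = blocks.size() - 1; idx >= nrBlocks; idx--) {
                     Block block = new Block(blocks.size() - idx, 0, 0);
                     blocks.set(idx, new BlockReportReplica(block));
                   }
                   // encode the populated replica list instead of sending an empty report
                   blockReportList = BlockListAsLongs.encode(blocks);
                 }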


            shv Konstantin Shvachko added a comment -

            I agree this broke the NNThroughputBenchmark.BlockReportStats operation, as it makes DNs always send an empty block report no matter what the parameters of the benchmark are.
            Good catch, Mingliang. Let's file a jira to fix this. daryn, are you available to fix this?

            daryn Daryn Sharp added a comment -

            liuml07, good catch! Not sure how I accidentally did that, and I don't know why the test still passes. I think your proposed fix is correct. I know extremely little about the benchmark tool, so I'd be unable to add a test (for the test!) in a timely manner; if you'd like the credit for the bug, I'm fine with you taking the jira.

            liuml07 Mingliang Liu added a comment -

            Thanks for your comment, daryn and shv. I started working on NNThroughputBenchmark just recently and knew nothing about it before that. If the empty block report list in this patch was not intentional, I think I don't need more context on this patch. Sure, I'd like to make the change as we discussed above. The jira is HDFS-9484; let's continue further discussion on the fix there.

            The reason the unit tests could pass may be that TestNNThroughputBenchmark is more of a driver that runs the benchmark with default parameters than a real unit test asserting expected behavior for different scenarios. If we need a sophisticated unit test, perhaps we can address it separately.
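
            A real unit test could assert directly on the size of the encoded report. A minimal, self-contained sketch (the class name BlockReportEncodeCheck is hypothetical; it assumes the public BlockReportReplica constructor and the BlockListAsLongs.encode / getNumberOfBlocks API introduced by this patch):

                 import java.util.Arrays;
                 import java.util.List;
                 import org.apache.hadoop.hdfs.protocol.Block;
                 import org.apache.hadoop.hdfs.protocol.BlockListAsLongs;
                 import org.apache.hadoop.hdfs.protocol.BlockListAsLongs.BlockReportReplica;

                 public class BlockReportEncodeCheck {
                   public static void main(String[] args) {
                     // Encode two known replicas; an empty report here would reproduce the bug.
                     List<BlockReportReplica> blocks = Arrays.asList(
                         new BlockReportReplica(new Block(1, 0, 0)),
                         new BlockReportReplica(new Block(2, 0, 0)));
                     BlockListAsLongs report = BlockListAsLongs.encode(blocks);
                     if (report.getNumberOfBlocks() != blocks.size()) {
                       throw new AssertionError("block report dropped replicas");
                     }
                   }
                 }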


            People

              Assignee: Daryn Sharp
              Reporter: Daryn Sharp
              Votes: 0
              Watchers: 24