[HADOOP-10975] org.apache.hadoop.util.DataChecksum should support native checksum calculation - ASF JIRA

James Thomas added a comment - 19/Jun/14 18:57

It seems like the best approach here would be to add some buffering to FSOutputSummer. I did some testing and it seems like going to native code is strictly faster than using the incremental summer as long as flushes happen no less often than every 100 bytes. More specifically, if we use the native code, which has no option for incremental checksumming (it could be added in, but I don't think it's necessary), if we compute a checksum of a partial chunk, we will need to recompute the entire checksum (rather than just the checksum of the newly added bytes) when the chunk either 1) fills out or 2) is incremented a bit and then a flush occurs. But the native checksum on 512 bytes (the chunk size and the maximum possible size of a partial chunk) takes at most 7 microseconds on my machine, and the incremental Java checksum requires around 7 microseconds for 100 bytes. So as long as it's reasonable to assume flushes won't happen more often than every 100 bytes, I propose that we eliminate the incremental summer from FSOutputSummer and always use the native code for byte arrays.

Buffering FSOutputSummer will mean that writes on the wire are more bursty and data is kept in client memory longer, but Todd mentioned that many clients already wrap DFSOutputStream in a BufferedWriter anyway. To achieve maximum performance in the native code, we need the buffer to be at least a few thousand bytes. The larger the buffer, the better, since we can amortize away the cost of crossing the JNI boundary (which seems to be quite small actually, on the order of a microsecond at most).

What do people think about this?

James Thomas added a comment - 19/Jun/14 18:57 It seems like the best approach here would be to add some buffering to FSOutputSummer. I did some testing and it seems like going to native code is strictly faster than using the incremental summer as long as flushes happen no less often than every 100 bytes. More specifically, if we use the native code, which has no option for incremental checksumming (it could be added in, but I don't think it's necessary), if we compute a checksum of a partial chunk, we will need to recompute the entire checksum (rather than just the checksum of the newly added bytes) when the chunk either 1) fills out or 2) is incremented a bit and then a flush occurs. But the native checksum on 512 bytes (the chunk size and the maximum possible size of a partial chunk) takes at most 7 microseconds on my machine, and the incremental Java checksum requires around 7 microseconds for 100 bytes. So as long as it's reasonable to assume flushes won't happen more often than every 100 bytes, I propose that we eliminate the incremental summer from FSOutputSummer and always use the native code for byte arrays. Buffering FSOutputSummer will mean that writes on the wire are more bursty and data is kept in client memory longer, but Todd mentioned that many clients already wrap DFSOutputStream in a BufferedWriter anyway. To achieve maximum performance in the native code, we need the buffer to be at least a few thousand bytes. The larger the buffer, the better, since we can amortize away the cost of crossing the JNI boundary (which seems to be quite small actually, on the order of a microsecond at most). What do people think about this?

Andrew Wang added a comment - 24/Jun/14 00:32

+1 sounds good to me, pretty sure that flushes are normally bigger than 100B. Not really a usecase we're optimized for anyway.

Andrew Wang added a comment - 24/Jun/14 00:32 +1 sounds good to me, pretty sure that flushes are normally bigger than 100B. Not really a usecase we're optimized for anyway.

James Thomas added a comment - 11/Jul/14 01:37

Patch attached.

Was finally able to get some perf stat output comparing a 1 GB put operation with this change + ~~HDFS-6560~~ and without. I ran the put against a MiniDFSCluster so that everything was in a single process and we could easily measure cycle savings across the cluster. We see about a 10% reduction in CPU cycles and a 4% reduction in wall clock time. These measurements were taken on my personal computer (SSD, quad core i7).

***WITHOUT CHANGES***
james@james-ThinkPad-T530:~/Desktop/hadoop-common$ perf stat -r 20 java org.junit.runner.JUnitCore org.apache.hadoop.hdfs.TestRWCycles

Performance counter stats for 'java org.junit.runner.JUnitCore org.apache.hadoop.hdfs.TestWriteCycles' (20 runs):

14848.579075 task-clock (msec) # 1.637 CPUs utilized ( +- 0.37% )
75,748 context-switches # 0.005 M/sec ( +- 0.29% )
3,855 cpu-migrations # 0.260 K/sec ( +- 1.66% )
75,120 page-faults # 0.005 M/sec ( +- 2.15% )
50,376,887,490 cycles # 3.393 GHz ( +- 0.40% )
31,427,207,136 stalled-cycles-frontend # 62.38% frontend cycles idle ( +- 0.62% )
<not supported> stalled-cycles-backend
46,918,968,840 instructions # 0.93 insns per cycle

0.67 stalled cycles per insn ( +- 0.26% )
6,523,666,030 branches # 439.346 M/sec ( +- 0.39% )
304,740,080 branch-misses # 4.67% of all branches ( +- 0.19% )

9.068159220 seconds time elapsed ( +- 0.33% )

***WITH ~~HDFS-6560~~ and ~~HDFS-6561~~ (client-side checksum calculation and DN checksum verification both nativized)***
james@james-ThinkPad-T530:~/Desktop/hadoop-common$ perf stat -r 20 java org.junit.runner.JUnitCore org.apache.hadoop.hdfs.TestWriteCycles

Performance counter stats for 'java org.junit.runner.JUnitCore org.apache.hadoop.hdfs.TestRWCycles' (20 runs):

13703.187333 task-clock (msec) # 1.575 CPUs utilized ( +- 0.74% )
79,256 context-switches # 0.006 M/sec ( +- 0.27% )
3,161 cpu-migrations # 0.231 K/sec ( +- 2.45% )
72,138 page-faults # 0.005 M/sec ( +- 2.35% )
46,553,899,994 cycles # 3.397 GHz ( +- 0.83% )
30,670,949,579 stalled-cycles-frontend # 65.88% frontend cycles idle ( +- 1.08% )
<not supported> stalled-cycles-backend
35,616,207,543 instructions # 0.77 insns per cycle

0.86 stalled cycles per insn ( +- 0.73% )
6,614,919,487 branches # 482.729 M/sec ( +- 0.73% )
313,683,166 branch-misses # 4.74% of all branches ( +- 0.56% )

8.699152006 seconds time elapsed ( +- 0.44% )

James Thomas added a comment - 11/Jul/14 01:37 Patch attached. Was finally able to get some perf stat output comparing a 1 GB put operation with this change + HDFS-6560 and without. I ran the put against a MiniDFSCluster so that everything was in a single process and we could easily measure cycle savings across the cluster. We see about a 10% reduction in CPU cycles and a 4% reduction in wall clock time. These measurements were taken on my personal computer (SSD, quad core i7). *** WITHOUT CHANGES *** james@james-ThinkPad-T530:~/Desktop/hadoop-common$ perf stat -r 20 java org.junit.runner.JUnitCore org.apache.hadoop.hdfs.TestRWCycles Performance counter stats for 'java org.junit.runner.JUnitCore org.apache.hadoop.hdfs.TestWriteCycles' (20 runs): 14848.579075 task-clock (msec) # 1.637 CPUs utilized ( +- 0.37% ) 75,748 context-switches # 0.005 M/sec ( +- 0.29% ) 3,855 cpu-migrations # 0.260 K/sec ( +- 1.66% ) 75,120 page-faults # 0.005 M/sec ( +- 2.15% ) 50,376,887,490 cycles # 3.393 GHz ( +- 0.40% ) 31,427,207,136 stalled-cycles-frontend # 62.38% frontend cycles idle ( +- 0.62% ) <not supported> stalled-cycles-backend 46,918,968,840 instructions # 0.93 insns per cycle 0.67 stalled cycles per insn ( +- 0.26% ) 6,523,666,030 branches # 439.346 M/sec ( +- 0.39% ) 304,740,080 branch-misses # 4.67% of all branches ( +- 0.19% ) 9.068159220 seconds time elapsed ( +- 0.33% ) *** WITH HDFS-6560 and HDFS-6561 (client-side checksum calculation and DN checksum verification both nativized) *** james@james-ThinkPad-T530:~/Desktop/hadoop-common$ perf stat -r 20 java org.junit.runner.JUnitCore org.apache.hadoop.hdfs.TestWriteCycles Performance counter stats for 'java org.junit.runner.JUnitCore org.apache.hadoop.hdfs.TestRWCycles' (20 runs): 13703.187333 task-clock (msec) # 1.575 CPUs utilized ( +- 0.74% ) 79,256 context-switches # 0.006 M/sec ( +- 0.27% ) 3,161 cpu-migrations # 0.231 K/sec ( +- 2.45% ) 72,138 page-faults # 0.005 M/sec ( +- 2.35% ) 46,553,899,994 cycles # 3.397 GHz ( +- 0.83% ) 30,670,949,579 stalled-cycles-frontend # 65.88% frontend cycles idle ( +- 1.08% ) <not supported> stalled-cycles-backend 35,616,207,543 instructions # 0.77 insns per cycle 0.86 stalled cycles per insn ( +- 0.73% ) 6,614,919,487 branches # 482.729 M/sec ( +- 0.73% ) 313,683,166 branch-misses # 4.74% of all branches ( +- 0.56% ) 8.699152006 seconds time elapsed ( +- 0.44% )

Hadoop QA added a comment - 11/Jul/14 01:41

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12655130/HDFS-6561.patch
against trunk revision .

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7328//console

This message is automatically generated.

Hadoop QA added a comment - 11/Jul/14 01:41 -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12655130/HDFS-6561.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7328//console This message is automatically generated.

James Thomas added a comment - 11/Jul/14 01:44

Need to consolidate this patch with the one in ~~HDFS-6560~~ for it to apply ... will wait for review first.

James Thomas added a comment - 11/Jul/14 01:44 Need to consolidate this patch with the one in HDFS-6560 for it to apply ... will wait for review first.

James Thomas added a comment - 18/Jul/14 01:53

Updated patch based on changes to ~~HADOOP-10838~~.

James Thomas added a comment - 18/Jul/14 01:53 Updated patch based on changes to HADOOP-10838 .

Hadoop QA added a comment - 18/Jul/14 02:02

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12656413/HDFS-6561.2.patch
against trunk revision .

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7377//console

This message is automatically generated.

Hadoop QA added a comment - 18/Jul/14 02:02 -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12656413/HDFS-6561.2.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7377//console This message is automatically generated.

James Thomas added a comment - 18/Jul/14 18:09

Updated due to changes in ~~HADOOP-10838~~.

James Thomas added a comment - 18/Jul/14 18:09 Updated due to changes in HADOOP-10838 .

Hadoop QA added a comment - 18/Jul/14 18:40

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12656547/HDFS-6561.3.patch
against trunk revision .

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7387//console

This message is automatically generated.

Hadoop QA added a comment - 18/Jul/14 18:40 -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12656547/HDFS-6561.3.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7387//console This message is automatically generated.

Todd Lipcon added a comment - 13/Aug/14 23:03

Hey James. I looked over this patch and wrote/ran some performance tests.

At first, I was a little concerned that the way you consolidated the code for bulk_crc32 would slow things down. In particular, there's now a new branch per chunk to determine whether to store or verify the CRC. I was worried that, if this branch were mispredicted, we'd pay an extra 15-20 cycles for every 512-byte chunk (which at 0.13 cycles/byte only takes ~66 cycles). That would represent a close to 20% performance regression.

So, I wrote a simple test which approximates exactly the HDFS usage of these APIs – ie 512 byte chunks and a reasonable amount of data. In this test, I found that the above concern was unwarranted - probably because the branch prediction unit does a very good job with the simple branch pattern here. I'll attach a version of your patch which includes the benchmark that I wrote in case anyone else wants to run it.

Here are my average timings for 512MB of 512-byte-chunked checksums (on my Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz)

Before:
Calculate: 401275us (1.28GB/sec)
Verify: 41184us (12.43GB/sec

After:
Calculate: 41808us (12.25GB/sec)
Verify: 41604us (12.31GB/sec)

These seem to match earlier results you've posted elsewhere - just wanted to confirm on my machine and make sure that the existing verify code path didn't regress due to the new functionality.

For ease of review, I also think it makes sense to split this patch up a little further, and make this JIRA only do the changes to the checksumming code to allow for native calculation. The changes to FSOutputSummer, DFSOutputStream, etc, are a bit more complex and probably should be reviewed separately. I took the liberty of removing those chunks from the patch as I was testing it, so I'll upload that here and you can take a look.

Given the above, I only reviewed the portion related to checksumming and didn't yet look in detail at the outputsummer, etc, changes.

Todd Lipcon added a comment - 13/Aug/14 23:03 Hey James. I looked over this patch and wrote/ran some performance tests. At first, I was a little concerned that the way you consolidated the code for bulk_crc32 would slow things down. In particular, there's now a new branch per chunk to determine whether to store or verify the CRC. I was worried that, if this branch were mispredicted, we'd pay an extra 15-20 cycles for every 512-byte chunk (which at 0.13 cycles/byte only takes ~66 cycles). That would represent a close to 20% performance regression. So, I wrote a simple test which approximates exactly the HDFS usage of these APIs – ie 512 byte chunks and a reasonable amount of data. In this test, I found that the above concern was unwarranted - probably because the branch prediction unit does a very good job with the simple branch pattern here. I'll attach a version of your patch which includes the benchmark that I wrote in case anyone else wants to run it. Here are my average timings for 512MB of 512-byte-chunked checksums (on my Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz) Before : Calculate: 401275us (1.28GB/sec) Verify: 41184us (12.43GB/sec After : Calculate: 41808us (12.25GB/sec) Verify: 41604us (12.31GB/sec) These seem to match earlier results you've posted elsewhere - just wanted to confirm on my machine and make sure that the existing verify code path didn't regress due to the new functionality. For ease of review, I also think it makes sense to split this patch up a little further, and make this JIRA only do the changes to the checksumming code to allow for native calculation. The changes to FSOutputSummer, DFSOutputStream, etc, are a bit more complex and probably should be reviewed separately. I took the liberty of removing those chunks from the patch as I was testing it, so I'll upload that here and you can take a look. Given the above, I only reviewed the portion related to checksumming and didn't yet look in detail at the outputsummer, etc, changes.

Todd Lipcon added a comment - 13/Aug/14 23:10

Here's James's patch with the following changes:

only the hadoop side changes, not the HDFS changes yet (which we can break out elsewhere since they need more careful review)
adds a perf test for our common use case
fixed one trivial whitespace error in james's patch (bad indentation)

I'm +1 on this, but since I made a couple changes/additions to the patch, we should probably have another committer take a quick look as well.

Todd Lipcon added a comment - 13/Aug/14 23:10 Here's James's patch with the following changes: only the hadoop side changes, not the HDFS changes yet (which we can break out elsewhere since they need more careful review) adds a perf test for our common use case fixed one trivial whitespace error in james's patch (bad indentation) I'm +1 on this, but since I made a couple changes/additions to the patch, we should probably have another committer take a quick look as well.

Colin McCabe added a comment - 14/Aug/14 00:47

Great idea.

The hdfs-6561-just-hadoop-changes.txt patch needs to be rebased... it didn't apply cleanly for me against trunk.

JNIEXPORT void JNICALL Java_org_apache_hadoop_util_NativeCrc32_nativeComputeChunkedSums
  (JNIEnv *env, jclass clazz,
    jint bytes_per_checksum, jint j_crc_type,
    jobject j_sums, jint sums_offset,
    jobject j_data, jint data_offset, jint data_len,
    jstring j_filename, jlong base_pos, jboolean verify)

Later, you use an if(likely) on the verify boolean. Rather than do this, why not just have a utility function that both nativeComputeChunkedSumsByteArray and nativeVerifyChunkedSums call?

-#include <stdint.h>
+#include <stdbool.h>

Please, no. There are a lot of older C compilers floating around out there that will choke on this. Plus we still need stdint.h, since we're using uint32_t, etc. etc. I don't think the C99 _Bool stuff adds a lot of type safety anyway, since any non-struct type can implicitly be converted to a bool, and a bool can be used as in int in many contexts.

Colin McCabe added a comment - 14/Aug/14 00:47 Great idea. The hdfs-6561-just-hadoop-changes.txt patch needs to be rebased... it didn't apply cleanly for me against trunk. JNIEXPORT void JNICALL Java_org_apache_hadoop_util_NativeCrc32_nativeComputeChunkedSums (JNIEnv *env, jclass clazz, jint bytes_per_checksum, jint j_crc_type, jobject j_sums, jint sums_offset, jobject j_data, jint data_offset, jint data_len, jstring j_filename, jlong base_pos, jboolean verify) Later, you use an if(likely) on the verify boolean. Rather than do this, why not just have a utility function that both nativeComputeChunkedSumsByteArray and nativeVerifyChunkedSums call? -#include <stdint.h> +#include <stdbool.h> Please, no. There are a lot of older C compilers floating around out there that will choke on this. Plus we still need stdint.h , since we're using uint32_t , etc. etc. I don't think the C99 _Bool stuff adds a lot of type safety anyway, since any non-struct type can implicitly be converted to a bool, and a bool can be used as in int in many contexts.

Hadoop QA added a comment - 14/Aug/14 01:10

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12661574/hdfs-6561-just-hadoop-changes.txt
against trunk revision .

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7631//console

This message is automatically generated.

Hadoop QA added a comment - 14/Aug/14 01:10 -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661574/hdfs-6561-just-hadoop-changes.txt against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7631//console This message is automatically generated.

Todd Lipcon added a comment - 14/Aug/14 04:24

This patch doesn't apply because it's dependent on ~~HADOOP-10838~~. James, want to revise the patch based on Colin's comments above?

Todd Lipcon added a comment - 14/Aug/14 04:24 This patch doesn't apply because it's dependent on HADOOP-10838 . James, want to revise the patch based on Colin's comments above?

James Thomas added a comment - 14/Aug/14 23:03

Todd's patch with the second change suggested by Colin. cmccabe, as we discussed, I think it will be difficult to split up nativeComputeChunkedSums* into separate functions for calculate and verify and still share code between them. And as Todd noted above, the branch does not cause much performance overhead.

James Thomas added a comment - 14/Aug/14 23:03 Todd's patch with the second change suggested by Colin. cmccabe , as we discussed, I think it will be difficult to split up nativeComputeChunkedSums* into separate functions for calculate and verify and still share code between them. And as Todd noted above, the branch does not cause much performance overhead.

Hadoop QA added a comment - 14/Aug/14 23:24

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12661927/HDFS-6561.4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test files.

-1 javac. The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7638//console

This message is automatically generated.

Hadoop QA added a comment - 14/Aug/14 23:24 -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661927/HDFS-6561.4.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. -1 javac . The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7638//console This message is automatically generated.

Colin McCabe added a comment - 14/Aug/14 23:26

So, I think that we could refactor this to be nicer. I think we really don't need the boolean everywhere, but we could refactor things into utility functions and two separate code paths. This would also gain back that tiny bit of performance.

But, that might involve a lot more work, so I'm ok with combining the code paths for now. In view of Todd's results, there's shouldn't be any significant performance regression. We can always split them later.

-    if (unlikely(crc != *sums)) {
+    if (unlikely(!store_or_verify(sums, crc, is_verify)))
       goto return_crc_error;
-    }

Let's not get rid of the braces here.

+1 once that's addressed. Thanks, James.

Colin McCabe added a comment - 14/Aug/14 23:26 So, I think that we could refactor this to be nicer. I think we really don't need the boolean everywhere, but we could refactor things into utility functions and two separate code paths. This would also gain back that tiny bit of performance. But, that might involve a lot more work, so I'm ok with combining the code paths for now. In view of Todd's results, there's shouldn't be any significant performance regression. We can always split them later. - if (unlikely(crc != *sums)) { + if (unlikely(!store_or_verify(sums, crc, is_verify))) goto return_crc_error; - } Let's not get rid of the braces here. +1 once that's addressed. Thanks, James.

James Thomas added a comment - 14/Aug/14 23:45

Thanks, Colin. Posted updated patch.

James Thomas added a comment - 14/Aug/14 23:45 Thanks, Colin. Posted updated patch.

Hadoop QA added a comment - 15/Aug/14 00:38

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12661935/HDFS-6561.5.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test files.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 javadoc. There were no new javadoc warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7640//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7640//console

This message is automatically generated.

Hadoop QA added a comment - 15/Aug/14 00:38 +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661935/HDFS-6561.5.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 2 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7640//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7640//console This message is automatically generated.

Todd Lipcon added a comment - 18/Aug/14 18:00

Cool, I'll commit this momentarily. Thanks for the extra +1 Colin.

Todd Lipcon added a comment - 18/Aug/14 18:00 Cool, I'll commit this momentarily. Thanks for the extra +1 Colin.

Todd Lipcon added a comment - 18/Aug/14 18:03

(moved to HADOOP project since this is only the Hadoop side changes)

Todd Lipcon added a comment - 18/Aug/14 18:03 (moved to HADOOP project since this is only the Hadoop side changes)

Colin McCabe added a comment - 18/Aug/14 18:05

Hmm. I just committed this under the old HDFS name. I also tried to change it to HADOOP, but found it couldn't be done without making it no longer a subtask.

Colin McCabe added a comment - 18/Aug/14 18:05 Hmm. I just committed this under the old HDFS name. I also tried to change it to HADOOP, but found it couldn't be done without making it no longer a subtask.

Todd Lipcon added a comment - 18/Aug/14 18:11

yea, Colin and I had a race condition. I was renaming the JIRA while he was committing it under the old name. I updated CHANGES.txt to refer to the new name.

Anyway, thanks for the contribution, James

Todd Lipcon added a comment - 18/Aug/14 18:11 yea, Colin and I had a race condition. I was renaming the JIRA while he was committing it under the old name. I updated CHANGES.txt to refer to the new name. Anyway, thanks for the contribution, James

Colin McCabe added a comment - 18/Aug/14 18:22

thanks, Todd and James.

Colin McCabe added a comment - 18/Aug/14 18:22 thanks, Todd and James.

Hudson added a comment - 19/Aug/14 00:16

FAILURE: Integrated in Hadoop-trunk-Commit #6086 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6086/)
Update CHANGES.txt for ~~HDFS-6561~~ which was renamed to ~~HADOOP-10975~~. (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686)

/hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 00:16 FAILURE: Integrated in Hadoop-trunk-Commit #6086 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6086/ ) Update CHANGES.txt for HDFS-6561 which was renamed to HADOOP-10975 . (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 00:16

FAILURE: Integrated in Hadoop-trunk-Commit #6086 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6086/)
Update CHANGES.txt for ~~HDFS-6561~~ which was renamed to ~~HADOOP-10975~~. (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686)

/hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 00:16 FAILURE: Integrated in Hadoop-trunk-Commit #6086 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6086/ ) Update CHANGES.txt for HDFS-6561 which was renamed to HADOOP-10975 . (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 16:20

FAILURE: Integrated in Hadoop-Yarn-trunk #651 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/651/)
Update CHANGES.txt for ~~HDFS-6561~~ which was renamed to ~~HADOOP-10975~~. (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686)

/hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 16:20 FAILURE: Integrated in Hadoop-Yarn-trunk #651 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/651/ ) Update CHANGES.txt for HDFS-6561 which was renamed to HADOOP-10975 . (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 16:20

FAILURE: Integrated in Hadoop-Yarn-trunk #651 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/651/)
Update CHANGES.txt for ~~HDFS-6561~~ which was renamed to ~~HADOOP-10975~~. (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686)

/hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 16:20 FAILURE: Integrated in Hadoop-Yarn-trunk #651 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/651/ ) Update CHANGES.txt for HDFS-6561 which was renamed to HADOOP-10975 . (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 18:56

FAILURE: Integrated in Hadoop-Hdfs-trunk #1842 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1842/)
Update CHANGES.txt for ~~HDFS-6561~~ which was renamed to ~~HADOOP-10975~~. (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686)

/hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 18:56 FAILURE: Integrated in Hadoop-Hdfs-trunk #1842 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1842/ ) Update CHANGES.txt for HDFS-6561 which was renamed to HADOOP-10975 . (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 18:56

FAILURE: Integrated in Hadoop-Hdfs-trunk #1842 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1842/)
Update CHANGES.txt for ~~HDFS-6561~~ which was renamed to ~~HADOOP-10975~~. (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686)

/hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 19/Aug/14 18:56 FAILURE: Integrated in Hadoop-Hdfs-trunk #1842 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1842/ ) Update CHANGES.txt for HDFS-6561 which was renamed to HADOOP-10975 . (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 20/Aug/14 00:01

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1868 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1868/)
Update CHANGES.txt for ~~HDFS-6561~~ which was renamed to ~~HADOOP-10975~~. (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686)

/hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 20/Aug/14 00:01 FAILURE: Integrated in Hadoop-Mapreduce-trunk #1868 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1868/ ) Update CHANGES.txt for HDFS-6561 which was renamed to HADOOP-10975 . (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 20/Aug/14 00:01

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1868 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1868/)
Update CHANGES.txt for ~~HDFS-6561~~ which was renamed to ~~HADOOP-10975~~. (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686)

/hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hudson added a comment - 20/Aug/14 00:01 FAILURE: Integrated in Hadoop-Mapreduce-trunk #1868 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1868/ ) Update CHANGES.txt for HDFS-6561 which was renamed to HADOOP-10975 . (todd: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618686 ) /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt

Hadoop Common

org.apache.hadoop.util.DataChecksum should support native checksum calculation

Details

Attachments

Attachments

Issue Links

Activity

People

Dates