Description
CBlock tests fail because CBlock does not generate a unique trace ID for each operation.
java.lang.AssertionError: expected:<0> but was:<1051>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.junit.Assert.assertEquals(Assert.java:542)
    at org.apache.hadoop.cblock.TestBufferManager.testRepeatedBlockWrites(TestBufferManager.java:448)
This failure is caused by the following error.
2017-08-02 17:50:34,569 [Cache Block Writer Thread #4] ERROR scm.XceiverClientHandler (XceiverClientHandler.java:sendCommandAsync(134)) - Command with Trace already exists. Ignoring this command. . Previous Command: java.util.concurrent.CompletableFuture@7847fc2d[Not completed, 1 dependents]
2017-08-02 17:50:34,569 [Cache Block Writer Thread #4] ERROR jscsiHelper.ContainerCacheFlusher (BlockWriterTask.java:run(108)) - Writing of block:44 failed, We have attempted to write this block 7 times to the container container2483304118.Trace ID:
java.lang.IllegalStateException: Duplicate trace ID. Command with this trace ID is already executing. Please ensure that trace IDs are not reused. ID:
    at org.apache.hadoop.scm.XceiverClientHandler.sendCommandAsync(XceiverClientHandler.java:139)
    at org.apache.hadoop.scm.XceiverClientHandler.sendCommand(XceiverClientHandler.java:114)
    at org.apache.hadoop.scm.XceiverClient.sendCommand(XceiverClient.java:132)
    at org.apache.hadoop.scm.storage.ContainerProtocolCalls.writeSmallFile(ContainerProtocolCalls.java:225)
    at org.apache.hadoop.cblock.jscsiHelper.BlockWriterTask.run(BlockWriterTask.java:97)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
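To illustrate the failure mode, here is a minimal sketch, not the actual Hadoop code. It assumes the client handler keeps in-flight commands in a map keyed by trace ID (the log above only shows that a previous CompletableFuture exists for the same ID). The class names InFlightCommands and TraceIds, and the prefix:blockId:counter layout, are hypothetical and chosen for illustration only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of a handler that tracks in-flight commands by trace ID. */
final class InFlightCommands {
  private final ConcurrentMap<String, CompletableFuture<String>> pending =
      new ConcurrentHashMap<>();

  /** Registers a command; rejects a trace ID that is still in flight. */
  CompletableFuture<String> send(String traceId) {
    CompletableFuture<String> future = new CompletableFuture<>();
    CompletableFuture<String> previous = pending.putIfAbsent(traceId, future);
    if (previous != null) {
      throw new IllegalStateException("Duplicate trace ID: " + traceId);
    }
    return future;
  }

  /** Completes the command and frees its trace ID for later reuse. */
  void complete(String traceId, String response) {
    CompletableFuture<String> future = pending.remove(traceId);
    if (future != null) {
      future.complete(response);
    }
  }
}

/** Hypothetical generator: a fixed prefix plus a monotonically increasing counter. */
final class TraceIds {
  private final String prefix;
  private final AtomicLong counter = new AtomicLong();

  TraceIds(String prefix) {
    this.prefix = prefix;
  }

  /** Returns a new ID on every call, even for repeated writes of the same block. */
  String next(long blockId) {
    return prefix + ":" + blockId + ":" + counter.incrementAndGet();
  }
}
```

With a generator like this, a retried write of block 44 gets a fresh ID on each attempt, so it would not collide with an earlier attempt that is still pending.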
Attachments
- HDFS-12255-HDFS-7240.003.patch
- 9 kB
- Mukul Kumar Singh
- HDFS-12255-HDFS-7240.002.patch
- 8 kB
- Mukul Kumar Singh
- HDFS-12255-HDFS-7240.001.patch
- 7 kB
- Mukul Kumar Singh
Issue Links
- duplicates HDFS-11744 Ozone: Implement the trace ID generator (Resolved)
Activity
Thanks msingh for taking care of this! TestBufferManager did pass in my local run after applying the patch.
One comment though: can we rename the function getHostIP to getTraceIDPrefix or something similar? It does not always return an IP address; it can return a UUID instead. Also, should we log a warning when an UnknownHostException occurs?
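The suggestion above could look roughly like the following sketch. It is an assumption about the shape of the helper, not the code in the patch: the class name TraceIdPrefix is hypothetical, and java.util.logging is used only to keep the example self-contained (the Hadoop code base uses its own logging setup).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.UUID;
import java.util.logging.Logger;

/** Hypothetical helper showing the renamed method and the suggested warning. */
final class TraceIdPrefix {
  private static final Logger LOG =
      Logger.getLogger(TraceIdPrefix.class.getName());

  private TraceIdPrefix() {
  }

  /**
   * Returns the local host address when it can be resolved,
   * otherwise a random UUID, so the result is never empty.
   */
  static String getTraceIDPrefix() {
    try {
      return InetAddress.getLocalHost().getHostAddress();
    } catch (UnknownHostException ex) {
      // Warn so that the fallback is visible in the logs.
      LOG.warning("Could not resolve the local host address; "
          + "using a random UUID as the trace ID prefix: " + ex);
      return UUID.randomUUID().toString();
    }
  }
}
```

The name getTraceIDPrefix reflects that callers should treat the value as an opaque prefix rather than as an IP address.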
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 14s | Docker mode activated. |
Prechecks | |||
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
-1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
HDFS-7240 Compile Tests | |||
+1 | mvninstall | 15m 46s | HDFS-7240 passed |
+1 | compile | 0m 52s | HDFS-7240 passed |
+1 | checkstyle | 0m 38s | HDFS-7240 passed |
+1 | mvnsite | 0m 59s | HDFS-7240 passed |
+1 | findbugs | 1m 57s | HDFS-7240 passed |
+1 | javadoc | 0m 57s | HDFS-7240 passed |
Patch Compile Tests | |||
+1 | mvninstall | 1m 5s | the patch passed |
+1 | compile | 1m 2s | the patch passed |
+1 | javac | 1m 2s | the patch passed |
-0 | checkstyle | 0m 41s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 8 unchanged - 0 fixed = 12 total (was 8) |
+1 | mvnsite | 1m 5s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | findbugs | 2m 15s | the patch passed |
+1 | javadoc | 0m 51s | the patch passed |
Other Tests | |||
-1 | unit | 77m 34s | hadoop-hdfs in the patch failed. |
+1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
107m 39s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
| | hadoop.ozone.web.client.TestKeys |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
Timed out junit tests | org.apache.hadoop.ozone.web.client.TestKeysRatis |
| | org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainerRatis |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:14b5c93 |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12881283/HDFS-12255-HDFS-7240.001.patch |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux 4511f87887e1 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | HDFS-7240 / 0e32bf1 |
Default Java | 1.8.0_131 |
findbugs | v3.1.0-RC1 |
checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/20636/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
unit | https://builds.apache.org/job/PreCommit-HDFS-Build/20636/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20636/testReport/ |
modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20636/console |
Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 14s | Docker mode activated. |
Prechecks | |||
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
-1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
HDFS-7240 Compile Tests | |||
+1 | mvninstall | 14m 32s | HDFS-7240 passed |
+1 | compile | 0m 51s | HDFS-7240 passed |
+1 | checkstyle | 0m 33s | HDFS-7240 passed |
+1 | mvnsite | 0m 55s | HDFS-7240 passed |
+1 | findbugs | 2m 43s | HDFS-7240 passed |
+1 | javadoc | 0m 52s | HDFS-7240 passed |
Patch Compile Tests | |||
+1 | mvninstall | 0m 55s | the patch passed |
+1 | compile | 0m 50s | the patch passed |
+1 | javac | 0m 50s | the patch passed |
-0 | checkstyle | 0m 33s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 8 unchanged - 0 fixed = 12 total (was 8) |
+1 | mvnsite | 0m 57s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | findbugs | 2m 2s | the patch passed |
+1 | javadoc | 0m 49s | the patch passed |
Other Tests | |||
-1 | unit | 62m 34s | hadoop-hdfs in the patch failed. |
+1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
90m 54s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
| | hadoop.ozone.web.client.TestKeys |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
Timed out junit tests | org.apache.hadoop.hdfs.TestFileChecksum |
| | org.apache.hadoop.ozone.web.client.TestKeysRatis |
| | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
| | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
| | org.apache.hadoop.hdfs.TestDFSFinalize |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:14b5c93 |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12881297/HDFS-12255-HDFS-7240.002.patch |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux ea3a9224a802 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | HDFS-7240 / 0e32bf1 |
Default Java | 1.8.0_131 |
findbugs | v3.1.0-RC1 |
checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/20639/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
unit | https://builds.apache.org/job/PreCommit-HDFS-Build/20639/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20639/testReport/ |
modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20639/console |
Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
msingh, +1 from me. I will commit after vagarychen's comments are addressed.
Also, log a warning when UnknownHostException ex happens?
If we decide to log this warning, can we make sure we warn only once, or at most a few times? Otherwise, for a client where this lookup fails, the log file will be overrun with this warning. So while it might be a good idea to warn, we might want to restrict the number of times we warn. We use a similar pattern on the data-node side: if we are not able to communicate with SCM, we don't warn on each try, but only at a selected frequency, and we log how many times the call has failed.
Another option is to put this as a trace message so that it does not get to the log unless we are debugging.
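The throttled-warning pattern described above can be sketched as follows. This is an illustration under assumptions, not the data-node code: the class name ThrottledWarning, the interval parameter, and the message format are all hypothetical, and java.util.logging stands in for the project's logger.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

/** Hypothetical helper that logs a recurring failure only at a fixed frequency. */
final class ThrottledWarning {
  private static final Logger LOG =
      Logger.getLogger(ThrottledWarning.class.getName());

  private final AtomicLong failures = new AtomicLong();
  private final long interval;

  ThrottledWarning(long interval) {
    if (interval <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }
    this.interval = interval;
  }

  /** Records one failure; returns true if a warning was logged for it. */
  boolean recordFailure(String what) {
    long count = failures.incrementAndGet();
    // Log the first failure, then every interval-th failure,
    // and include the running total in the message.
    if (count == 1 || count % interval == 0) {
      LOG.warning(what + " has failed " + count + " time(s)");
      return true;
    }
    return false;
  }
}
```

A caller would invoke recordFailure in the catch block of the failing lookup; the log then grows with the number of intervals elapsed rather than with the number of attempts.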
vagarychen I have addressed the comments in the v3 patch. Please have a look.
anu, the error is logged in a function that is called from the constructor of ContainerCacheFlusher.
So this error will be generated only once for each flusher initialization.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 17s | Docker mode activated. |
Prechecks | |||
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
-1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
HDFS-7240 Compile Tests | |||
+1 | mvninstall | 17m 58s | HDFS-7240 passed |
+1 | compile | 1m 2s | HDFS-7240 passed |
+1 | checkstyle | 0m 43s | HDFS-7240 passed |
+1 | mvnsite | 1m 10s | HDFS-7240 passed |
+1 | findbugs | 2m 13s | HDFS-7240 passed |
+1 | javadoc | 0m 54s | HDFS-7240 passed |
Patch Compile Tests | |||
+1 | mvninstall | 1m 5s | the patch passed |
+1 | compile | 0m 59s | the patch passed |
+1 | javac | 0m 59s | the patch passed |
+1 | checkstyle | 0m 41s | the patch passed |
+1 | mvnsite | 1m 9s | the patch passed |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | findbugs | 2m 25s | the patch passed |
+1 | javadoc | 0m 58s | the patch passed |
Other Tests | |||
-1 | unit | 77m 3s | hadoop-hdfs in the patch failed. |
+1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
110m 30s |
Reason | Tests |
---|---|
Failed junit tests | hadoop.ozone.web.client.TestKeys |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| | hadoop.hdfs.server.namenode.ha.TestHAAppend |
| | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
| | hadoop.ozone.container.ozoneimpl.TestRatisManager |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
Timed out junit tests | org.apache.hadoop.ozone.web.client.TestKeysRatis |
Subsystem | Report/Notes |
---|---|
Docker | Image:yetus/hadoop:14b5c93 |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12881387/HDFS-12255-HDFS-7240.003.patch |
Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
uname | Linux eda56e073270 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
Build tool | maven |
Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
git revision | HDFS-7240 / 0e32bf1 |
Default Java | 1.8.0_144 |
findbugs | v3.1.0-RC1 |
unit | https://builds.apache.org/job/PreCommit-HDFS-Build/20656/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/20656/testReport/ |
modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20656/console |
Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
Failed tests are unrelated. Committed to the feature, thanks msingh for the contribution!
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14057 (See https://builds.apache.org/job/Hadoop-trunk-Commit/14057/)
HDFS-12255. Block Storage: Cblock should generated unique trace ID for (cliang: rev bfc49a4b2d719b6a3451883319708841ba589fa0)
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/cblock/jscsiHelper/cache/impl/AsyncBlockWriter.java
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/cblock/jscsiHelper/ContainerCacheFlusher.java
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/cblock/jscsiHelper/BlockWriterTask.java
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/cblock/jscsiHelper/cache/impl/CBlockLocalCache.java
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14070 (See https://builds.apache.org/job/Hadoop-trunk-Commit/14070/)
HDFS-12255. Block Storage: Cblock should generated unique trace ID for (omalley: rev 6a16d7c7ab531668e2c0000fc30bcdd241a81cbd)
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/cblock/jscsiHelper/cache/impl/CBlockLocalCache.java
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/cblock/jscsiHelper/ContainerCacheFlusher.java
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/cblock/jscsiHelper/BlockWriterTask.java
- (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/cblock/jscsiHelper/cache/impl/AsyncBlockWriter.java
Thanks for filing this. If you are not working on it, feel free to assign this issue to me.