Hadoop HDFS / HDFS-10178

Permanent write failures can happen if pipeline recoveries occur for the first packet

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      We have observed that a write fails permanently if the first packet doesn't go through properly and pipeline recovery happens. If the write op creates a pipeline, but the actual data packet does not reach one or more datanodes in time, the pipeline recovery is done against the 0-byte partial block.

      If additional datanodes are added, the block is transferred to the new nodes. After the transfer, each node will have a meta file containing only the header and a 0-length block data file. The pipeline recovery seems to work correctly up to this point, but the write fails when the actual data packet is resent.

      Attachments

      1. HDFS-10178.patch
        8 kB
        Kihwal Lee
      2. HDFS-10178.v2.patch
        8 kB
        Kihwal Lee
      3. HDFS-10178.v3.patch
        9 kB
        Kihwal Lee
      4. HDFS-10178.v4.patch
        6 kB
        Kihwal Lee
      5. HDFS-10178.v5.patch
        5 kB
        Kihwal Lee

        Issue Links

          Activity

          Kihwal Lee added a comment - edited

          Datanodes log something like this:

          java.io.IOException: Invalid checksum length: received length is 504 but expected length is 0
                  at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:586)
                  at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
                  at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:804)
                  at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
                  at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
                  at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251)
                  at java.lang.Thread.run(Thread.java:745)
          

          This causes a permanent write failure.

          The problem is in BlockSender. When transferring a block, BlockSender gets the checksum type from the on-disk meta file.

            if (metaIn.getLength() > BlockMetadataHeader.getHeaderSize()) {
              ...
              csum = BlockMetadataHeader.readDataChecksum(checksumIn, block);
              ...
            }
          ...
                if (csum == null) {
                  csum = DataChecksum.newDataChecksum(DataChecksum.Type.NULL, 512);
                }
          

          Since the code sets the checksum type to NULL when the on-disk meta file contains only the header portion, the checksum type during a block transfer is set incorrectly. When a data packet arrives with a checksum, the datanode checks whether it has received the correct amount of checksum data.

                final int checksumLen = diskChecksum.getChecksumSize(len);
                final int checksumReceivedLen = checksumBuf.capacity();
          
                if (checksumReceivedLen > 0 && checksumReceivedLen != checksumLen) {
                  throw new IOException("Invalid checksum length: received length is "
                      + checksumReceivedLen + " but expected length is " + checksumLen);
                }
          

          The getChecksumSize() method of the NULL checksum type returns 0, so this check fails.
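
          To make the arithmetic concrete, here is a minimal standalone sketch (illustration only, not code from the patch) of that mismatch. It assumes the figures from the log above: 512-byte checksum chunks and 4-byte checksums (CRC32C is assumed; CRC32 behaves the same, both are 4 bytes per chunk), so the 504 received checksum bytes correspond to 126 chunks of data.

          import org.apache.hadoop.util.DataChecksum;

          // Illustration only: why the receiver-side check compares 504 against 0.
          public class ChecksumLenMismatch {
            public static void main(String[] args) {
              int len = 126 * 512;  // data bytes in the packet; 504 / 4 = 126 chunks

              // What the client actually sent: 4-byte checksums over 512-byte chunks.
              DataChecksum sent =
                  DataChecksum.newDataChecksum(DataChecksum.Type.CRC32C, 512);
              int checksumReceivedLen = sent.getChecksumSize(len);  // 126 * 4 = 504

              // What the receiving datanode infers from the header-only meta file:
              // BlockSender fell back to the NULL checksum type during the transfer.
              DataChecksum disk =
                  DataChecksum.newDataChecksum(DataChecksum.Type.NULL, 512);
              int checksumLen = disk.getChecksumSize(len);  // 0

              // Mirrors the BlockReceiver check quoted above.
              if (checksumReceivedLen > 0 && checksumReceivedLen != checksumLen) {
                System.out.println("Invalid checksum length: received length is "
                    + checksumReceivedLen + " but expected length is " + checksumLen);
              }
            }
          }

          With these numbers the check trips and prints the same message as the exception in the log.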

          Kihwal Lee added a comment -

          The following is from BlockSender, added by HDFS-6934.

          // The meta file will contain only the header if the NULL checksum
          // type was used, or if the replica was written to transient storage.
          // Checksum verification is not performed for replicas on transient
          // storage.  The header is important for determining the checksum
          // type later when lazy persistence copies the block to non-transient
          // storage and computes the checksum.
          if (metaIn.getLength() > BlockMetadataHeader.getHeaderSize()) {
          

          The code in BlockSender makes a wrong assumption. If I simply change > to >=, my test passes, but some of the lazy persist test cases fail. So I added another argument to the constructor.

          Chris Nauroth, can you take a look at my patch? I am not familiar with the lazy persist feature. There might be a better way.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 11m 8s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 7m 6s trunk passed
          +1 compile 0m 51s trunk passed with JDK v1.8.0_74
          +1 compile 0m 43s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 24s trunk passed
          +1 mvnsite 0m 53s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 1m 57s trunk passed
          +1 javadoc 1m 14s trunk passed with JDK v1.8.0_74
          +1 javadoc 1m 54s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 48s the patch passed
          +1 compile 0m 49s the patch passed with JDK v1.8.0_74
          -1 javac 6m 52s hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74 with JDK v1.8.0_74 generated 2 new + 33 unchanged - 0 fixed = 35 total (was 33)
          +1 javac 0m 49s the patch passed
          +1 compile 0m 41s the patch passed with JDK v1.7.0_95
          -1 javac 7m 34s hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95 with JDK v1.7.0_95 generated 2 new + 35 unchanged - 0 fixed = 37 total (was 35)
          +1 javac 0m 41s the patch passed
          -1 checkstyle 0m 21s hadoop-hdfs-project/hadoop-hdfs: patch generated 3 new + 347 unchanged - 3 fixed = 350 total (was 350)
          +1 mvnsite 0m 51s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 2m 9s the patch passed
          +1 javadoc 1m 12s the patch passed with JDK v1.8.0_74
          +1 javadoc 1m 49s the patch passed with JDK v1.7.0_95
          -1 unit 71m 50s hadoop-hdfs in the patch failed with JDK v1.8.0_74.
          -1 unit 76m 6s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 24s Patch does not generate ASF License warnings.
          185m 46s



          Reason Tests
          JDK v1.8.0_74 Failed junit tests hadoop.hdfs.server.namenode.TestEditLog
            hadoop.hdfs.server.datanode.TestDataNodeLifeline
            hadoop.hdfs.server.balancer.TestBalancer
            hadoop.hdfs.TestClientProtocolForPipelineRecovery
          JDK v1.7.0_95 Failed junit tests hadoop.hdfs.server.namenode.TestEditLog
            hadoop.hdfs.server.namenode.TestDecommissioningStatus
            hadoop.hdfs.server.namenode.TestReconstructStripedBlocks
            hadoop.hdfs.server.blockmanagement.TestReplicationPolicy
            hadoop.hdfs.server.namenode.TestCacheDirectives
            hadoop.hdfs.TestClientProtocolForPipelineRecovery
            hadoop.metrics2.sink.TestRollingFileSystemSinkWithSecureHdfs



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12794081/HDFS-10178.patch
          JIRA Issue HDFS-10178
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 73042f929ae6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / dc951e6
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          javac hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74: https://builds.apache.org/job/PreCommit-HDFS-Build/14854/artifact/patchprocess/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
          javac hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95: https://builds.apache.org/job/PreCommit-HDFS-Build/14854/artifact/patchprocess/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/14854/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/14854/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/14854/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/14854/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt https://builds.apache.org/job/PreCommit-HDFS-Build/14854/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/14854/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/14854/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Kihwal Lee added a comment -

          I should probably make BlockScanner/VolumeScanner follow the old way when creating a BlockSender. Otherwise all lazy-persisted blocks will be reported as bad when scanned.

          Kihwal Lee added a comment -

          Made VolumeScanner retain the old behavior. Fixed javac warnings.

          Kihwal Lee added a comment -

          For those who are wondering how critical this bug is: we hit this because of a transient network issue, which caused multiple timeouts while writing the first packet. After all original nodes were replaced, none of the nodes in the pipeline had a replica with a valid checksum type. The network issue went away, but the write could not continue because of this bug and permanently failed after 5 pipeline recovery attempts. I don't think it is common, but when it happens the client cannot recover.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 10s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 38s trunk passed
          +1 compile 0m 39s trunk passed with JDK v1.8.0_74
          +1 compile 0m 41s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 27s trunk passed
          +1 mvnsite 0m 54s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 2m 6s trunk passed
          +1 javadoc 1m 10s trunk passed with JDK v1.8.0_74
          +1 javadoc 1m 54s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 51s the patch passed
          +1 compile 0m 46s the patch passed with JDK v1.8.0_74
          +1 javac 0m 46s the patch passed
          +1 compile 0m 41s the patch passed with JDK v1.7.0_95
          +1 javac 0m 41s the patch passed
          -1 checkstyle 0m 24s hadoop-hdfs-project/hadoop-hdfs: patch generated 3 new + 347 unchanged - 3 fixed = 350 total (was 350)
          +1 mvnsite 0m 53s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 2m 15s the patch passed
          +1 javadoc 1m 8s the patch passed with JDK v1.8.0_74
          +1 javadoc 1m 51s the patch passed with JDK v1.7.0_95
          -1 unit 61m 33s hadoop-hdfs in the patch failed with JDK v1.8.0_74.
          -1 unit 59m 57s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 26s Patch does not generate ASF License warnings.
          148m 1s



          Reason Tests
          JDK v1.8.0_74 Failed junit tests hadoop.hdfs.TestClientProtocolForPipelineRecovery
            hadoop.hdfs.TestFileAppend
            hadoop.hdfs.server.balancer.TestBalancer
            hadoop.hdfs.server.namenode.TestEditLog
            hadoop.hdfs.TestHFlush
          JDK v1.7.0_95 Failed junit tests hadoop.hdfs.TestClientProtocolForPipelineRecovery
            hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
            hadoop.hdfs.TestHFlush



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:0ca8df7
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12794188/HDFS-10178.v2.patch
          JIRA Issue HDFS-10178
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 60346f9d15d5 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / dc951e6
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/14865/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/14865/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/14865/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/14865/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt https://builds.apache.org/job/PreCommit-HDFS-Build/14865/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/14865/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/14865/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Kihwal Lee added a comment -

          Oops. I reused a fault injection method and caused test failures. I am adding a new one.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 8m 48s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 8m 35s trunk passed
          +1 compile 1m 0s trunk passed with JDK v1.8.0_74
          +1 compile 0m 51s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 25s trunk passed
          +1 mvnsite 1m 0s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 2m 28s trunk passed
          +1 javadoc 1m 36s trunk passed with JDK v1.8.0_74
          +1 javadoc 2m 16s trunk passed with JDK v1.7.0_95
          +1 mvninstall 1m 4s the patch passed
          +1 compile 1m 10s the patch passed with JDK v1.8.0_74
          +1 javac 1m 10s the patch passed
          +1 compile 0m 55s the patch passed with JDK v1.7.0_95
          +1 javac 0m 55s the patch passed
          -1 checkstyle 0m 26s hadoop-hdfs-project/hadoop-hdfs: patch generated 3 new + 348 unchanged - 3 fixed = 351 total (was 351)
          +1 mvnsite 1m 12s the patch passed
          +1 mvneclipse 0m 15s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 2m 49s the patch passed
          +1 javadoc 1m 36s the patch passed with JDK v1.8.0_74
          +1 javadoc 2m 23s the patch passed with JDK v1.7.0_95
          -1 unit 76m 32s hadoop-hdfs in the patch failed with JDK v1.8.0_74.
          -1 unit 70m 18s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
          -1 asflicense 0m 24s Patch generated 1 ASF License warnings.
          189m 15s



          Reason Tests
          JDK v1.8.0_74 Failed junit tests hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes
            hadoop.hdfs.server.datanode.TestDirectoryScanner
            hadoop.hdfs.TestEncryptionZones
            hadoop.hdfs.server.namenode.TestEditLog
            hadoop.hdfs.server.namenode.TestDecommissioningStatus
            hadoop.hdfs.server.namenode.ha.TestEditLogTailer
            hadoop.hdfs.security.TestDelegationTokenForProxyUser
            hadoop.hdfs.TestLeaseRecovery
            hadoop.hdfs.TestFileAppend
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
          JDK v1.8.0_74 Timed out junit tests org.apache.hadoop.hdfs.TestHDFSFileSystemContract
            org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage
            org.apache.hadoop.hdfs.TestDecommission
            org.apache.hadoop.hdfs.TestAppendSnapshotTruncate
          JDK v1.7.0_95 Failed junit tests hadoop.hdfs.qjournal.TestSecureNNWithQJM
            hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:fbe3e86
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12794234/HDFS-10178.v3.patch
          JIRA Issue HDFS-10178
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 6c1bd0b6f56c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fbe3e86
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/14868/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/14868/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/14868/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/14868/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt https://builds.apache.org/job/PreCommit-HDFS-Build/14868/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/14868/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/14868/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/14868/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Akira Ajisaka added a comment -

          Mostly looks good to me. Minor nit:

                cluster = new MiniDFSCluster.Builder(conf).numDataNodes((int)3).build();
          

          (int) is unnecessary.
          Hi Chris Nauroth and Arpit Agarwal, would you please review the patch?

          Arpit Agarwal added a comment - edited

          Hi Kihwal, I think we can check replica.isOnTransientStorage() instead of passing the new flag. Something like this should work in BlockSender.

                      if (!replica.isOnTransientStorage() &&
                          metaIn.getLength() >= BlockMetadataHeader.getHeaderSize()) {
          
          Kihwal Lee added a comment -

          That's what I was looking for! Thanks, Arpit Agarwal. I will update the patch.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 12s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 36s trunk passed
          +1 compile 0m 41s trunk passed with JDK v1.8.0_74
          +1 compile 0m 41s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 21s trunk passed
          +1 mvnsite 0m 51s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 1m 56s trunk passed
          +1 javadoc 1m 5s trunk passed with JDK v1.8.0_74
          +1 javadoc 1m 44s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 46s the patch passed
          +1 compile 0m 38s the patch passed with JDK v1.8.0_74
          +1 javac 0m 38s the patch passed
          +1 compile 0m 38s the patch passed with JDK v1.7.0_95
          +1 javac 0m 38s the patch passed
          -1 checkstyle 0m 19s hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 99 unchanged - 2 fixed = 101 total (was 101)
          +1 mvnsite 0m 50s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 1s Patch has no whitespace issues.
          +1 findbugs 2m 9s the patch passed
          +1 javadoc 1m 4s the patch passed with JDK v1.8.0_74
          +1 javadoc 1m 39s the patch passed with JDK v1.7.0_95
          -1 unit 57m 11s hadoop-hdfs in the patch failed with JDK v1.8.0_74.
          -1 unit 53m 41s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 21s Patch does not generate ASF License warnings.
          135m 48s



          Reason Tests
          JDK v1.8.0_74 Failed junit tests hadoop.TestRefreshCallQueue
            hadoop.hdfs.server.datanode.TestDataNodeMetrics
          JDK v1.7.0_95 Failed junit tests hadoop.TestRefreshCallQueue
            hadoop.hdfs.shortcircuit.TestShortCircuitCache



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:fbe3e86
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12796602/HDFS-10178.v4.patch
          JIRA Issue HDFS-10178
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 3c724ec8d11d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 256c82f
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15042/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15042/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15042/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15042/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt https://builds.apache.org/job/PreCommit-HDFS-Build/15042/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15042/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15042/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Masatake Iwasaki added a comment -
          // For testing. Delay sending packet downstream
          if (DataNodeFaultInjector.get().stopSendingPacketDownstream()) {
            try {
              Thread.sleep(60000);
            } catch (InterruptedException ie) {
              throw new IOException("Interrupted while sleeping. Bailing out.");
            }
          }
          

          Should the test logic be encapsulated in the DataNodeFaultInjector's method? For example:

              DataNodeFaultInjector dnFaultInjector = new DataNodeFaultInjector() {
                int tries = 1;
                @Override
                public void stopSendingPacketDownstream() throws IOException {
                  if (tries > 0) {
                    tries--;
                    try {
                      Thread.sleep(60000);
                    } catch (InterruptedException ie) {
                      throw new IOException("Interrupted while sleeping. Bailing out.");
                    }
                  }
                }
              };
          
          Arpit Agarwal added a comment -

          +1 from me. Not committing it since Masatake has an open question.

          Kihwal Lee added a comment -

          Should the test logic be encapsulated in the DataNodeFaultInjector's method?

          Good point. That's where it logically belongs and the code will look cleaner.

          Vinayakumar B added a comment -

          Hi Kihwal Lee, I am trying to understand the issue. Maybe my understanding of packet sending/receiving is wrong.

          If the packet header is sent out, but the data portion of the packet does not reach one or more datanodes in time

          How can only the header be sent out without any data/datalen? The header carries the payload length, so PacketReceiver should fail or wait to receive the entire packet (not just the header portion), right?
          Or is the payload length in the incoming packet corrupted to 0?

          PacketReceiver.java#doRead(..)
              // Each packet looks like:
              //   PLEN    HLEN      HEADER     CHECKSUMS  DATA
              //   32-bit  16-bit   <protobuf>  <variable length>
              //
              // PLEN:      Payload length
              //            = length(PLEN) + length(CHECKSUMS) + length(DATA)
              //            This length includes its own encoded length in
              //            the sum for historical reasons.
              //
              // HLEN:      Header length
              //            = length(HEADER)
              //
              // HEADER:    the actual packet header fields, encoded in protobuf
              // CHECKSUMS: the crcs for the data chunk. May be missing if
              //            checksums were not requested
              // DATA       the actual block data
          
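          As a side note, here is a small standalone sketch of how the lengths in that layout relate (not from the patch; the 512-byte chunk size and 4-byte checksum size are assumptions matching the numbers elsewhere in this issue).

          // Illustrative arithmetic for the packet layout quoted above.
          public class PacketLengths {
            public static void main(String[] args) {
              int bytesPerChecksum = 512;
              int checksumSize = 4;      // CRC32/CRC32C checksum is 4 bytes per chunk
              int dataLen = 126 * 512;   // 126 full chunks, matching the 504-byte case above

              int chunks = (dataLen + bytesPerChecksum - 1) / bytesPerChecksum;
              int checksumsLen = chunks * checksumSize;   // length(CHECKSUMS) = 504
              // PLEN includes its own 4-byte (32-bit) encoded length plus checksums and data.
              int plen = 4 + checksumsLen + dataLen;

              System.out.println("CHECKSUMS=" + checksumsLen
                  + " DATA=" + dataLen + " PLEN=" + plen);
            }
          }

          For a 126-chunk packet this gives CHECKSUMS=504, DATA=64512 and PLEN=65020.
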
          Kihwal Lee added a comment -

          Vinayakumar B, sorry, I gave a confusing description of the problem. I was mixing up the meta file header and the non-payload protobuf fields. After a connection is made and the command is parsed, a BlockReceiver is created and createRbw() is called before getting to the packet. This creates a meta file with only the header. If this replica is used for transferring the block, the checksum type is lost.

          Kihwal Lee added a comment -

          Fifth time's a charm. Moved the fault injection code. Verified it still fails without the fix and passes with the fix.

          Arpit Agarwal added a comment -

          +1 pending Jenkins.

          Masatake Iwasaki added a comment -

          Thanks for the update, Kihwal Lee. +1 too on v5.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 10s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 49s trunk passed
          +1 compile 0m 43s trunk passed with JDK v1.8.0_77
          +1 compile 0m 41s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 21s trunk passed
          +1 mvnsite 0m 52s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 1m 55s trunk passed
          +1 javadoc 1m 7s trunk passed with JDK v1.8.0_77
          +1 javadoc 1m 42s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 46s the patch passed
          +1 compile 0m 39s the patch passed with JDK v1.8.0_77
          +1 javac 0m 39s the patch passed
          +1 compile 0m 39s the patch passed with JDK v1.7.0_95
          +1 javac 0m 39s the patch passed
          -1 checkstyle 0m 19s hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 99 unchanged - 2 fixed = 101 total (was 101)
          +1 mvnsite 0m 50s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 2m 10s the patch passed
          +1 javadoc 1m 1s the patch passed with JDK v1.8.0_77
          +1 javadoc 1m 43s the patch passed with JDK v1.7.0_95
          -1 unit 58m 42s hadoop-hdfs in the patch failed with JDK v1.8.0_77.
          -1 unit 53m 59s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 27s Patch does not generate ASF License warnings.
          137m 58s



          Reason Tests
          JDK v1.8.0_77 Failed junit tests hadoop.hdfs.TestDFSClientRetries
            hadoop.hdfs.TestReplication
            hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
          JDK v1.7.0_95 Failed junit tests hadoop.hdfs.TestHFlush



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:fbe3e86
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12796873/HDFS-10178.v5.patch
          JIRA Issue HDFS-10178
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 47eac507384a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 7280550
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15056/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15056/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15056/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15056/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_77.txt https://builds.apache.org/job/PreCommit-HDFS-Build/15056/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15056/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15056/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Hide
          kihwal Kihwal Lee added a comment -

          TestHFlush: HDFS-2043. Will review the patch.
          The JDK8 failures don't have logs, so they are hard to debug.
          TestDFSClientRetries: timed out. It tried to restart the namenode, but that also timed out. Without seeing the log, it is hard to know what went wrong.
          TestReplication: timed out. Datanode shutdown hung during the netty shutdown:

          java.lang.Thread.State: RUNNABLE
                  at sun.nio.ch.EPollArrayWrapper.interrupt(Native Method)
                  at sun.nio.ch.EPollArrayWrapper.interrupt(EPollArrayWrapper.java:317)
                  at sun.nio.ch.EPollSelectorImpl.wakeup(EPollSelectorImpl.java:207)
                  at io.netty.channel.nio.NioEventLoop.wakeup(NioEventLoop.java:590)
                  at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:503)
                  at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:160)
                  at io.netty.util.concurrent.AbstractEventExecutorGroup.shutdownGracefully(AbstractEventExecutorGroup.java:70)
                  at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.close(DatanodeHttpServer.java:249)
                  at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1863)
          

          TestBlockTokenWithDFS: the datanode was restarted and hit a bind exception; the old port was still in use.

          The test failures are not related to this patch; they pass when run on my machine.
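
          For reference, the bind-exception failure above is the usual restart-on-the-same-port race seen in minicluster tests. Below is a minimal sketch of that pattern, assuming only a MiniDFSCluster datanode restart that keeps its old port; it is illustrative scaffolding, not the failing test itself.

          import java.io.IOException;
          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hdfs.MiniDFSCluster;

          public class RestartOnSamePortSketch {
            public static void main(String[] args) throws IOException {
              Configuration conf = new Configuration();
              MiniDFSCluster cluster =
                  new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
              try {
                cluster.waitActive();
                // Stop the datanode, then restart it on the same port (keepPort = true).
                // If the old listening socket has not been released yet, the restart
                // can fail with a BindException -- the flakiness described above.
                MiniDFSCluster.DataNodeProperties dnProps = cluster.stopDataNode(0);
                cluster.restartDataNode(dnProps, true);
                cluster.waitActive();
              } finally {
                cluster.shutdown();
              }
            }
          }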

          Hide
          kihwal Kihwal Lee added a comment -

          Committed to trunk through branch-2.7. The 2.7 cherry-pick was clean, but the test was modified to use DFSConfigKeys instead of HdfsClientConfigKeys. Thanks for the reviews and comments, Akira, Arpit, Masataki and Vinay.
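
          For context on the cherry-pick note above: on trunk and branch-2 the client-side configuration constants live in HdfsClientConfigKeys, while branch-2.7 still exposes them through DFSConfigKeys. A minimal sketch of the kind of one-line difference involved; the specific key constant below is purely illustrative and is not claimed to be the one the test actually sets.

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hdfs.DFSConfigKeys;
          import org.apache.hadoop.hdfs.client.HdfsClientConfigKeys;

          public class BackportConfigKeySketch {
            static Configuration trunkStyle() {
              Configuration conf = new Configuration();
              // trunk / branch-2: client keys are referenced via HdfsClientConfigKeys
              conf.setInt(HdfsClientConfigKeys.DFS_CLIENT_SOCKET_TIMEOUT_KEY, 5000);
              return conf;
            }

            static Configuration branch27Style() {
              Configuration conf = new Configuration();
              // branch-2.7 cherry-pick: the same key is still referenced via DFSConfigKeys
              conf.setInt(DFSConfigKeys.DFS_CLIENT_SOCKET_TIMEOUT_KEY, 5000);
              return conf;
            }
          }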

          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #9552 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9552/)
          HDFS-10178. Permanent write failures can happen if pipeline recoveries (kihwal: rev a7d1fb0cd2fdbf830602eb4dbbd9bbe62f4d5584)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java
          Hide
          vinayrpet Vinayakumar B added a comment -

          Thanks Kihwal Lee for clarifying the issue for me. Nice find and nice fix.

          Hide
          ctrezzo Chris Trezzo added a comment -

          Adding 2.6.5 to the target versions with the intention of backporting this to branch-2.6. Please let me know if you think otherwise. Thanks!

          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Closing the JIRA as part of 2.7.3 release.

          Hide
          sjlee0 Sangjin Lee added a comment -

          Cherry-picked it to 2.6.5 (trivial).

          Hide
          ctrezzo Chris Trezzo added a comment -

          Thanks!


            People

            • Assignee:
              Kihwal Lee
            • Reporter:
              Kihwal Lee
            • Votes:
              0
            • Watchers:
              18
