Hadoop HDFS / HDFS-11741

Long running balancer may fail due to expired DataEncryptionKey

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
    • Component/s: balancer & mover
    • Labels:
      None
    • Environment:

      CDH5.8.2, Kerberos, Data transfer encryption enabled. Balancer login using keytab

      Description

      We found that a long-running balancer may fail despite logging in with a keytab, because KeyManager returns an expired DataEncryptionKey. The transfer then fails with the following exception:

      2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher (Dispatcher.java:dispatch(325)) - Failed to move blk_1067352712_3913241 with size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK through 10.0.0.134:50010
      org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: Can't re-compute encryption key for nonce, since the required block key (keyID=1005215027) doesn't exist. Current key: 1005215030
              at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
              at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
              at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
              at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
              at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
              at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
              at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
              at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
              at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      

      This bug is similar in nature to HDFS-10609. While the balancer's KeyManager actively synchronizes its block keys with the NameNode, it does not update the DataEncryptionKey accordingly.

      In one cluster, with a Kerberos ticket lifetime of 10 hours and the default block token lifetime of 10 hours, a long-running balancer failed after 20 to 30 hours.

      1. block keys.png
        9 kB
        Wei-Chiu Chuang
      2. HDFS-11741.001.patch
        5 kB
        Wei-Chiu Chuang
      3. HDFS-11741.002.patch
        13 kB
        Wei-Chiu Chuang
      4. HDFS-11741.003.patch
        16 kB
        Wei-Chiu Chuang
      5. HDFS-11741.004.patch
        16 kB
        Wei-Chiu Chuang
      6. HDFS-11741.005.patch
        15 kB
        Wei-Chiu Chuang
      7. HDFS-11741.06.patch
        14 kB
        Xiao Chen
      8. HDFS-11741.07.patch
        11 kB
        Xiao Chen
      9. HDFS-11741.08.patch
        12 kB
        Xiao Chen
      10. HDFS-11741.branch-2.01.patch
        12 kB
        Xiao Chen

          Activity

          jojochuang Wei-Chiu Chuang added a comment -

          Patch rev 001: a very simple fix with a unit test. The fix detects whether the encryption key has expired and, if so, generates a new one.

          With the fix removed, the test fails; with the fix restored, the test succeeds.
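A minimal sketch of the "detect expiry, then regenerate" idea described above. This is not the patch itself: `SketchKey` and `ExpiringKeyCache` are hypothetical names, and the clock and key factory are injected purely for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical stand-in for a data encryption key; only the expiry matters here. */
final class SketchKey {
  final int keyId;
  final long expiryMillis;

  SketchKey(int keyId, long expiryMillis) {
    this.keyId = keyId;
    this.expiryMillis = expiryMillis;
  }
}

/** Caches one key and regenerates it once the clock reaches its expiry. */
final class ExpiringKeyCache {
  private final LongSupplier clock;          // injected so that tests can control time
  private final Supplier<SketchKey> factory; // hypothetical key generator
  private SketchKey cached;

  ExpiringKeyCache(LongSupplier clock, Supplier<SketchKey> factory) {
    this.clock = clock;
    this.factory = factory;
  }

  synchronized SketchKey get() {
    // Generate a key on first use, or replace it when it is no longer valid.
    if (cached == null || clock.getAsLong() >= cached.expiryMillis) {
      cached = factory.get();
    }
    return cached;
  }
}
```

Injecting the clock keeps the expiry check testable without sleeping, at the cost of one extra constructor argument.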

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 14m 1s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 17m 19s trunk passed
          +1 compile 1m 3s trunk passed
          +1 checkstyle 0m 40s trunk passed
          +1 mvnsite 1m 8s trunk passed
          +1 mvneclipse 0m 18s trunk passed
          -1 findbugs 2m 3s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 0m 47s trunk passed
          +1 mvninstall 1m 4s the patch passed
          +1 compile 1m 1s the patch passed
          +1 javac 1m 1s the patch passed
          +1 checkstyle 0m 40s the patch passed
          +1 mvnsite 1m 7s the patch passed
          +1 mvneclipse 0m 15s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 14s the patch passed
          +1 javadoc 0m 45s the patch passed
          -1 unit 113m 1s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 26s The patch does not generate ASF License warnings.
          159m 45s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.namenode.TestStartup
            hadoop.hdfs.server.namenode.TestMetadataVersionOutput
            hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency
            hadoop.hdfs.server.datanode.TestDirectoryScanner
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12866019/HDFS-11741.001.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 1fc5f3e01f5c 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / dcc292d
          Default Java 1.8.0_121
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19274/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19274/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19274/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19274/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          andrew.wang Andrew Wang added a comment -

          Hi Wei-Chiu, the patch looks good overall. A few quick questions:

          • For other tokens, we renew before the token expires, e.g. after half the token lifetime has elapsed. This handles clock skew and TOCTOU issues. Should we do this here too?
          • Is it possible to write a unit test using a FakeTimer rather than using Thread.sleep?
          • The test uses the JUnit 3 asserts; please use JUnit 4's asserts instead.
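The first two points can be sketched together: a renewal rule that fires once half the lifetime has elapsed, driven by a manually advanced clock instead of Thread.sleep. `ManualClock` and `EarlyRenewalPolicy` are hypothetical names written for this illustration; they are not the Hadoop FakeTimer class or any existing HDFS API.

```java
/** Hypothetical manually advanced clock, in the spirit of the FakeTimer suggestion. */
final class ManualClock {
  private long nowMillis;

  long now() {
    return nowMillis;
  }

  void advance(long millis) {
    nowMillis += millis;
  }
}

/** Renews once at least half of the lifetime has elapsed since the key was issued. */
final class EarlyRenewalPolicy {
  private final long lifetimeMillis;

  EarlyRenewalPolicy(long lifetimeMillis) {
    this.lifetimeMillis = lifetimeMillis;
  }

  boolean shouldRenew(long issuedAtMillis, long nowMillis) {
    return nowMillis - issuedAtMillis >= lifetimeMillis / 2;
  }
}
```

Renewing at the halfway point leaves the remaining half of the lifetime as a margin for clock skew between hosts.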
          jojochuang Wei-Chiu Chuang added a comment -

          Hi Andrew, thanks for the review!

          I just realized a client side BlockTokenSecretManager generates DataEncryptionKey expiration time using now + token life time. I am not sure if that's intended, as I would have assumed the key expiration time equals the current BlockKey expiration time (which is determined by NameNode).

          So it is entirely possible that the balancer holds an unexpired DataEncryptionKey that corresponds to an expired BlockKey. When it talks to the other side, the expired BlockKey would fail the connection. Therefore my rev 01 patch would not fix all the problems, because of this mismatch.

          There are two potential fixes:

          • Change BlockTokenSecretManager so that DEK expiration is based on current BlockKey expiration.
          • Change Balancer to catch InvalidEncryptionKeyException, generate a new DEK and repeat the connection.

          I feel the first fix is the right one, but it changes every participant in HDFS, so I want to double-check here.

          jojochuang Wei-Chiu Chuang added a comment -

          Posting my rev 002 patch to address the other comments:

          Is it possible to write a unit test using a FakeTimer rather than using Thread.sleep?

          Good idea.

          Test is using the JUnit 3 assert, please use JUnit 4's asserts instead.

          Good catch!

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 37s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 15m 12s trunk passed
          +1 compile 0m 49s trunk passed
          +1 checkstyle 0m 36s trunk passed
          +1 mvnsite 0m 52s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          -1 findbugs 1m 39s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 0m 40s trunk passed
          +1 mvninstall 0m 48s the patch passed
          +1 compile 0m 45s the patch passed
          +1 javac 0m 45s the patch passed
          -0 checkstyle 0m 33s hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 22 unchanged - 1 fixed = 25 total (was 23)
          +1 mvnsite 0m 49s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 43s the patch passed
          +1 javadoc 0m 37s the patch passed
          -1 unit 88m 36s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          116m 19s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.namenode.TestMetadataVersionOutput
            hadoop.hdfs.server.namenode.TestStartup
            hadoop.hdfs.server.namenode.TestDecommissioningStatus



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12866234/HDFS-11741.002.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 928953f4eb07 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / d4631e4
          Default Java 1.8.0_121
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19301/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19301/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19301/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19301/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19301/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jojochuang Wei-Chiu Chuang added a comment -

          Rev03. I worked on the DEK/BK expiration mismatch issue for a whole day. While more elegant solutions are available, I ended up with a quick fix.

          In this patch, if the balancer encounters an InvalidEncryptionKeyException, it clears the KeyManager's DEK cache so that a new DEK is generated the next time one is needed. The existing balancer fault-tolerance mechanism should then let it retry a few more times with a valid DEK.
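The clear-and-retry behaviour described above can be sketched as follows. All names here (`StaleKeyException`, `DekCacheSketch`, `MoveRetrySketch`) are hypothetical stand-ins, and the explicit loop is an assumption that takes the place of the balancer's existing fault-tolerance mechanism; the real patch may be structured quite differently.

```java
import java.util.function.IntSupplier;

/** Hypothetical stand-in for InvalidEncryptionKeyException. */
final class StaleKeyException extends Exception {
  StaleKeyException(String message) {
    super(message);
  }
}

/** Hypothetical DEK cache: clear() forces regeneration on the next get(). */
final class DekCacheSketch {
  private final IntSupplier currentKeyId; // assumption: returns the latest valid key id
  private Integer cached;

  DekCacheSketch(IntSupplier currentKeyId) {
    this.currentKeyId = currentKeyId;
  }

  synchronized int get() {
    if (cached == null) {
      cached = currentKeyId.getAsInt();
    }
    return cached;
  }

  synchronized void clear() {
    cached = null;
  }
}

final class MoveRetrySketch {
  /** Hypothetical transfer step that rejects a stale key. */
  interface Transfer {
    void send(int keyId) throws StaleKeyException;
  }

  /** Returns the number of attempts used, or -1 if every attempt failed. */
  static int moveWithRetry(DekCacheSketch cache, Transfer transfer, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        transfer.send(cache.get());
        return attempt;
      } catch (StaleKeyException e) {
        cache.clear(); // drop the stale key so that the next attempt regenerates it
      }
    }
    return -1;
  }
}
```

Clearing the cache on failure is reactive: the first attempt after a key rollover still fails, and only the retry succeeds.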

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 13m 32s trunk passed
          +1 compile 0m 40s trunk passed
          +1 checkstyle 0m 30s trunk passed
          +1 mvnsite 0m 50s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          -1 findbugs 1m 36s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 43s the patch passed
          +1 compile 0m 40s the patch passed
          +1 javac 0m 40s the patch passed
          -0 checkstyle 0m 31s hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 70 unchanged - 1 fixed = 73 total (was 71)
          +1 mvnsite 0m 47s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 35s the patch passed
          +1 javadoc 0m 36s the patch passed
          -1 unit 62m 44s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          87m 35s



          Reason Tests
          Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12866274/HDFS-11741.003.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux a0846454202a 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / fd5cb2c
          Default Java 1.8.0_121
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19307/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19307/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19307/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19307/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19307/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jojochuang Wei-Chiu Chuang added a comment -

          Zhe Zhang, Rushabh S Shah, would you mind chiming in on this observation?

          I just realized a client side BlockTokenSecretManager generates DataEncryptionKey expiration time using now + token life time. I am not sure if that's intended, as I would have assumed the key expiration time equals the current BlockKey expiration time (which is determined by NameNode).

          So it is entirely possible that balancer has an unexpired DataEncryptionKey, corresponding to an expired BlockKey. When it talks to the other side, the expired BlockKey would fail the connection. Therefore my rev 01 patch would not fix all the problems because of this mismatch.

          Thanks!

          yzhangal Yongjun Zhang added a comment -

          Hi Wei-Chiu Chuang,

          Thanks for your work here. It seems the patch doesn't apply anymore. Would you please update it?

          Thanks.

          jojochuang Wei-Chiu Chuang added a comment -

          Hi Yongjun. The patch should still apply with git apply -3.
          I posted a new patch for a little update.

          Thinking about it again, I am able to answer the previous question. The DataEncryptionKey (DEK) expires after one token lifetime, whereas the current block key (BK) expires after 2 * key update interval + token lifetime, so in most cases the DEK shouldn't expire before the current BK expires.
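The two expiry rules stated above can be checked with a little arithmetic. This is a sketch only: the method names are hypothetical, the 10-hour token lifetime comes from the issue description, and the 10-hour key update interval is an assumption made for the example.

```java
final class KeyLifetimeSketch {
  static final long HOUR_MILLIS = 3_600_000L;

  /** Expiry of a key that lives for exactly one token lifetime after creation. */
  static long expiryFromTokenLifetime(long createdAtMillis, long tokenLifetimeMillis) {
    return createdAtMillis + tokenLifetimeMillis;
  }

  /** Expiry of the current block key: 2 * key update interval + token lifetime. */
  static long blockKeyExpiry(long createdAtMillis,
                             long keyUpdateIntervalMillis,
                             long tokenLifetimeMillis) {
    return createdAtMillis + 2 * keyUpdateIntervalMillis + tokenLifetimeMillis;
  }
}
```

With both settings at 10 hours, a key created together with the block key at time 0 expires at 10 hours, well before the block key's 30 hours. A key created from that same block key at hour 21, however, would nominally last until hour 31 and so outlive it, which is the "most cases" caveat.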

          I wanted to add unit tests for the code in Dispatcher#dispatch that catches InvalidEncryptionKeyException, but the dispatcher code is quite monolithic and I haven't found a good way to write a unit test. Even an integration test is not trivial.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 16s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 13m 29s trunk passed
          +1 compile 0m 49s trunk passed
          +1 checkstyle 0m 40s trunk passed
          +1 mvnsite 0m 54s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 1m 41s trunk passed
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 51s the patch passed
          +1 compile 0m 48s the patch passed
          +1 javac 0m 48s the patch passed
          -0 checkstyle 0m 36s hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 71 unchanged - 0 fixed = 73 total (was 71)
          +1 mvnsite 0m 52s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 46s the patch passed
          +1 javadoc 0m 38s the patch passed
          -1 unit 70m 51s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 25s The patch does not generate ASF License warnings.
          97m 4s



          Reason Tests
          Failed junit tests hadoop.hdfs.web.TestWebHDFS
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12869326/HDFS-11741.004.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux d658dd0b9b9e 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 9cab42c
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19545/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19545/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19545/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19545/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          xiaochen Xiao Chen added a comment -

          Thanks Wei-Chiu Chuang for reporting the issue and working on the fix, and others for reviewing.

          Just want to make sure I understand correctly: the problem is that the KeyManager instance in the Dispatcher uses a version of encryptionKey that is associated with a BlockKey more than 2 * keyUpdateInterval + tokenLifetime old. So the balancer side of BlockTokenSecretManager cannot find that BlockKey, and this is because the encryptionKey object isn't updated.

          If the above is correct, can we have KeyManager's BlockKeyUpdater (or a new EKUpdater) also update the encryptionKey periodically (say, every tokenLifetime / 2 or / 4)? I think this is more future-proof because KeyManager is associated with NameNodeConnector. The Dispatcher seems to be the only place that retrieves this KeyManager, but I feel the problem exists with NameNodeConnector itself.

          jojochuang Wei-Chiu Chuang added a comment -

          Thanks Xiao Chen.
          I like your suggestion. I initially just wanted to maintain parity with DFSClient#newDataEncryptionKey. But that's actually not needed: DFSClient does not have access to the block key, so it has to ask the NameNode for a DEK. The Balancer's KeyManager does have access to the block key, so it can generate the DEK on its own, with no extra overhead for the NameNode.

          jojochuang Wei-Chiu Chuang added a comment -


          Assume a block key is generated at t=0. After one key update interval (t=Tk), it becomes the current key and is used for generating block tokens and data encryption keys. After the token lifetime (t=Tk+Tl), the key retires, and a new key (generated at t=Tk) becomes current. However, the retired key is still kept in BlockTokenSecretManager and can be used for verifying block tokens and decrypting data. The key finally expires at (t=2*Tk+2*Tl).

          After the fix in my patch, the only way a block key can expire before the DEK does is if the balancer's local time drifts by more than one key update interval (that is, 10 hours). If there is such a long drift, a lot of other things would already not work.

          jojochuang Wei-Chiu Chuang added a comment - - edited

          Attached rev 005 patch.
          I removed the change in Dispatcher, because it is hard to unit test and it is unclear if the fix would actually work. If this case does happen (it should only happen with extreme time drift), let's grab stack trace and logs and file a new jira to fix it.

          As for the proposal to change BlockKeyUpdater or add a DEKUpdater, I don't feel it is necessary to update the DEK aggressively. As in my last comment, after this patch, the only way the DEK can outlive its associated block key is if the balancer node's clock drifts by more than one key update interval (10 hours).

          But if you think it's still necessary, I suggest we change

          encryptionKey.expiryDate < timer.now() 
          

          to

          encryptionKey.expiryDate - keyUpdateInterval * 3 / 4 < timer.now() 
          

          This is easier than introducing a new class.
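
The suggestion above amounts to refreshing the key some margin before it actually expires, instead of waiting for the expiry itself. A minimal sketch of that predicate follows; the `RefreshPolicy` class, its method name, and the plain millisecond `long` arguments are hypothetical stand-ins for illustration, not KeyManager's actual fields:

```java
/** Hypothetical illustration only; not part of the Hadoop code base. */
final class RefreshPolicy {
  private RefreshPolicy() {}

  /**
   * Returns true once "now" is within "margin" milliseconds of the expiry.
   * A margin of 0 reproduces the plain "expiryDate < now" check.
   */
  static boolean needsRefresh(long expiryDate, long now, long margin) {
    return expiryDate - margin < now;
  }

  public static void main(String[] args) {
    long expiry = 1_000L;
    // No margin: the key is still valid at t=500.
    System.out.println(needsRefresh(expiry, 500L, 0L));   // false
    // A 600 ms margin makes the same key eligible for refresh at t=500.
    System.out.println(needsRefresh(expiry, 500L, 600L)); // true
  }
}
```

A larger margin trades more frequent key generation for more tolerance to clock drift between the balancer and the other nodes.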

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          +1 mvninstall 14m 27s trunk passed
          +1 compile 1m 0s trunk passed
          +1 checkstyle 0m 40s trunk passed
          +1 mvnsite 1m 5s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 1m 52s trunk passed
          +1 javadoc 0m 47s trunk passed
          +1 mvninstall 1m 0s the patch passed
          +1 compile 0m 53s the patch passed
          +1 javac 0m 53s the patch passed
          -0 checkstyle 0m 35s hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 71 unchanged - 0 fixed = 73 total (was 71)
          +1 mvnsite 1m 3s the patch passed
          +1 mvneclipse 0m 13s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 1s the patch passed
          +1 javadoc 0m 44s the patch passed
          -1 unit 76m 41s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 24s The patch does not generate ASF License warnings.
          105m 31s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
            hadoop.hdfs.server.namenode.TestDecommissioningStatus
            hadoop.hdfs.TestEncryptionZones
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12869967/HDFS-11741.005.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 26aa1cb2844b 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 2b5ad48
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19622/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19622/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19622/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19622/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          xiaochen Xiao Chen added a comment - - edited

          Thanks for revving Wei-Chiu, good analysis!

          As discussed offline, I think generating a new DEK would be sufficient.

          Prefer the encryptionKey.expiryDate - keyUpdateInterval / 4 * 3 < timer.now() route to prevent TOCTOU as Andrew pointed out earlier.

          Nits:

          • LOG.debug("Getting new encryption token from NN"); IIUC the key is now generated locally, so this message is misleading
          • Please remove the stale changes in TestBalancerWithEncryptedTransfer and Dispatcher
          • I think the test case in TestKeyManager needs updating after the 3/4 interval change - new DEK not generated based on expiry, but actually on BK's update interval. Maybe we can choose different updateInterval and tokenLifetime to differentiate it in the test.
          • Let's use a safer test timeout to reduce false positives due to infra.
          xiaochen Xiao Chen added a comment -

          Looked again at this one, and I think encryptionKey.expiryDate < timer.now() should be OK.

          The graph isn't exactly accurate, though. I think the timeline of the block key goes like this:
          generate --(Tk)--> becomes current --(Tk, not Tl)--> retiring --(Tk + Tl)--> expire (removed)

          And the Encryption Key expires Tl after it is generated.

          So with the buffer of (Tk) before block key removal, I think we're safe to only compare Encryption Key's expiry as you said.
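
The arithmetic behind that buffer can be checked with a small model of the timeline. This is only a sketch of the timeline as described in this comment; `BlockKeyTimeline` and its methods are hypothetical names, and the 10-hour token lifetime used in `main` is an assumed value (the thread only states that the key update interval is 10 hours):

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical model of the block key timeline; not Hadoop's actual classes. */
final class BlockKeyTimeline {
  private final long keyUpdateInterval; // Tk, in milliseconds
  private final long tokenLifetime;     // Tl, in milliseconds

  BlockKeyTimeline(long keyUpdateInterval, long tokenLifetime) {
    this.keyUpdateInterval = keyUpdateInterval;
    this.tokenLifetime = tokenLifetime;
  }

  /** A block key generated at t0 becomes current after Tk. */
  long becomesCurrent(long t0) {
    return t0 + keyUpdateInterval;
  }

  /** It stays current for Tk, then retires. */
  long retires(long t0) {
    return becomesCurrent(t0) + keyUpdateInterval;
  }

  /** A retired key is kept for Tk + Tl before it is removed. */
  long removed(long t0) {
    return retires(t0) + keyUpdateInterval + tokenLifetime;
  }

  /** Latest expiry of an encryption key derived from this block key. */
  long latestEncryptionKeyExpiry(long t0) {
    // Worst case: the encryption key is generated just before the block key retires.
    return retires(t0) + tokenLifetime;
  }

  /** Time between the last possible encryption key expiry and block key removal. */
  long safetyMargin(long t0) {
    return removed(t0) - latestEncryptionKeyExpiry(t0);
  }

  public static void main(String[] args) {
    long tk = TimeUnit.HOURS.toMillis(10); // key update interval from the thread
    long tl = TimeUnit.HOURS.toMillis(10); // assumed token lifetime
    BlockKeyTimeline timeline = new BlockKeyTimeline(tk, tl);
    System.out.println("margin in hours: "
        + TimeUnit.MILLISECONDS.toHours(timeline.safetyMargin(0L)));
  }
}
```

Under this model the margin always equals Tk, whatever the token lifetime, which is why comparing only the encryption key's expiry is enough as long as clock drift stays below one key update interval.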

          xiaochen Xiao Chen added a comment -

          Posting a rev based on my review (since Wei-Chiu is on leave) per offline chat with Yongjun Zhang.
          Yongjun, could you please take a look? Thanks much.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 12m 54s trunk passed
          +1 compile 0m 47s trunk passed
          +1 checkstyle 0m 31s trunk passed
          +1 mvnsite 0m 50s trunk passed
          +1 mvneclipse 0m 11s trunk passed
          +1 findbugs 1m 30s trunk passed
          +1 javadoc 0m 37s trunk passed
          +1 mvninstall 0m 43s the patch passed
          +1 compile 0m 40s the patch passed
          +1 javac 0m 40s the patch passed
          -0 checkstyle 0m 29s hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 23 unchanged - 0 fixed = 24 total (was 23)
          +1 mvnsite 0m 49s the patch passed
          +1 mvneclipse 0m 13s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          -1 findbugs 1m 40s hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
          +1 javadoc 0m 35s the patch passed
          -1 unit 65m 58s hadoop-hdfs in the patch failed.
          -1 asflicense 0m 19s The patch generated 1 ASF License warnings.
          90m 15s



          Reason Tests
          FindBugs module:hadoop-hdfs-project/hadoop-hdfs
            Possible null pointer dereference of KeyManager.encryptionKey in org.apache.hadoop.hdfs.server.balancer.KeyManager.newDataEncryptionKey() Dereferenced at KeyManager.java:KeyManager.encryptionKey in org.apache.hadoop.hdfs.server.balancer.KeyManager.newDataEncryptionKey() Dereferenced at KeyManager.java:[line 139]
          Failed junit tests hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
            hadoop.hdfs.server.balancer.TestKeyManager
            hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
            hadoop.hdfs.server.balancer.TestBalancer
            hadoop.hdfs.web.TestWebHdfsTimeouts
            hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12870500/HDFS-11741.06.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 564b5eebee40 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4b4a652
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19685/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19685/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19685/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19685/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/19685/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19685/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          xiaochen Xiao Chen added a comment -

          Patch 7 fixes the checkstyle/findbugs warnings. It uses Whitebox in the tests, so there is no need to change the constructors of KeyManager/BlockTokenSecretManager.

          yzhangal Yongjun Zhang added a comment -

          Thanks Xiao Chen for the updated patch.

          I suggest making the following comment change; +1 after that.

                  if (encryptionKey == null ||
                      encryptionKey.expiryDate < timer.now()) {
                    // Encryption Key (EK) is generated from Block Key (BK), but its
                    // expiryDate is solely based on tokenLifetime.
                    // Once EK is expired, we need to generate a new one using the current
                    // BK. Retired BK is kept for (keyUpdateInterval + tokenLifetime)
                    // before removal.
                    // See BlockTokenSecretManager for details.
                    LOG.debug("Generating new data encryption key because current key"
                            + " expired on {}.", encryptionKey.expiryDate);
                    encryptionKey = blockTokenSecretManager.generateDataEncryptionKey();
                  }
                  return encryptionKey;
          

          to

                  if (encryptionKey == null ||
                      encryptionKey.expiryDate < timer.now()) {
                    // Encryption Key (EK) is generated from Block Key (BK).
                    // Check if EK is expired here, and generate a new one using the current
                    // BK if so, otherwise continue to use the previously generated EK.
                    //
                    // It's important to make sure that when EK is not expired, the BK used to
                    // generate the EK is not expired and removed, because the same BK
                    // will be used to re-generate the EK by BlockTokenSecretManager.
                    //
                    // The current implementation ensures that when an EK is not expired (even
                    // if it's close to expiration), the BK that's used to generate it still
                    // has at least "key update interval" of lifetime before the BK gets
                    // expired and removed.
                    // See BlockTokenSecretManager for details.
                    LOG.debug("Generating new data encryption key because current key"
                            + " expired on {}.", encryptionKey.expiryDate);
                    encryptionKey = blockTokenSecretManager.generateDataEncryptionKey();
                  }
                  return encryptionKey;
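
One detail in the snippet as quoted: when encryptionKey is null, the debug message still reads encryptionKey.expiryDate, which is the kind of dereference the earlier FindBugs report flagged. A minimal, self-contained sketch of a null-safe variant follows. `CachedEncryptionKey`, its nested `Key` type, and the injected clock and generator are hypothetical stand-ins for KeyManager, DataEncryptionKey, the timer, and BlockTokenSecretManager; printing to standard output replaces the real logger:

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical stand-in for the caching logic; not the actual KeyManager. */
final class CachedEncryptionKey {
  /** Minimal key type: only the expiry matters for this sketch. */
  static final class Key {
    final long expiryDate;

    Key(long expiryDate) {
      this.expiryDate = expiryDate;
    }
  }

  private final LongSupplier clock;      // stands in for timer.now()
  private final Supplier<Key> generator; // stands in for generateDataEncryptionKey()
  private Key current;

  CachedEncryptionKey(LongSupplier clock, Supplier<Key> generator) {
    this.clock = clock;
    this.generator = generator;
  }

  /** Returns the cached key, generating a new one if none exists or it has expired. */
  synchronized Key get() {
    if (current == null || current.expiryDate < clock.getAsLong()) {
      // Build the message without touching a null reference.
      String reason = (current == null)
          ? "no key has been generated yet"
          : "current key expired on " + current.expiryDate;
      System.out.println("Generating new data encryption key because " + reason);
      current = generator.get();
    }
    return current;
  }
}
```

Injecting the clock keeps the expiry path testable without waiting for real time to pass, which is the same reason the quoted check goes through timer.now().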
          
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 30s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 14m 44s trunk passed
          +1 compile 0m 50s trunk passed
          +1 checkstyle 0m 37s trunk passed
          +1 mvnsite 0m 56s trunk passed
          +1 mvneclipse 0m 16s trunk passed
          +1 findbugs 1m 46s trunk passed
          +1 javadoc 0m 42s trunk passed
          +1 mvninstall 0m 49s the patch passed
          +1 compile 0m 48s the patch passed
          +1 javac 0m 48s the patch passed
          +1 checkstyle 0m 34s the patch passed
          +1 mvnsite 1m 4s the patch passed
          +1 mvneclipse 0m 15s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 56s the patch passed
          +1 javadoc 0m 39s the patch passed
          -1 unit 91m 14s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 22s The patch does not generate ASF License warnings.
          119m 29s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12870527/HDFS-11741.07.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 4f5408dd9642 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4b4a652
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19689/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19689/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19689/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          xiaochen Xiao Chen added a comment -

          Thanks for the review Yongjun!

          Updated patch 8 to use your suggested comments, with some typos fixed and a minor modification to the 2nd paragraph:
          We want the BK to still exist (not be removed) not because the next EK can be regenerated from it, but because the current EK can be verified against it ('re-compute' in the exception message).
          After a current BK is rolled to retired (current -> retired, which is then kept for keyUpdateInterval + tokenLifetime), the next EK will be generated not from it but from the next BK (next -> current).
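
          As a rough illustration of the rolling described above, here is a minimal Java sketch. It is not the BlockTokenSecretManager code: the class name, the method names, and the integer key IDs are hypothetical, and only the retention rule (a retired BK is kept for keyUpdateInterval + tokenLifetime) comes from this comment.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of block key (BK) rolling; not the Hadoop implementation. */
final class BlockKeyRollSketch {
    private final long keyUpdateInterval; // ms between two rolls
    private final long tokenLifetime;     // ms an issued token/EK stays valid
    // Retired key id -> time (ms) after which the key is dropped.
    private final Map<Integer, Long> retiredExpiry = new HashMap<>();
    private int currentId = 1; // assumed starting ids, for illustration only
    private int nextId = 2;

    BlockKeyRollSketch(long keyUpdateInterval, long tokenLifetime) {
        this.keyUpdateInterval = keyUpdateInterval;
        this.tokenLifetime = tokenLifetime;
    }

    /** Roll the keys: current -> retired, next -> current, fresh id -> next. */
    void roll(long now) {
        // The retired key is kept for keyUpdateInterval + tokenLifetime.
        retiredExpiry.put(currentId, now + keyUpdateInterval + tokenLifetime);
        currentId = nextId;
        nextId = nextId + 1;
        // Drop retired keys whose retention window has passed.
        retiredExpiry.values().removeIf(expiry -> expiry <= now);
    }

    /** New EKs are always derived from the current BK, never a retired one. */
    int idForNewEncryptionKey() {
        return currentId;
    }

    /** An EK can be verified only while its BK still exists. */
    boolean canVerify(int keyId, long now) {
        if (keyId == currentId || keyId == nextId) {
            return true;
        }
        Long expiry = retiredExpiry.get(keyId);
        return expiry != null && expiry > now;
    }

    public static void main(String[] args) {
        BlockKeyRollSketch keys = new BlockKeyRollSketch(10, 5);
        keys.roll(0);
        System.out.println("new EKs use BK " + keys.idForNewEncryptionKey());
        System.out.println("BK 1 verifiable at t=14: " + keys.canVerify(1, 14));
        System.out.println("BK 1 verifiable at t=15: " + keys.canVerify(1, 15));
    }
}
```

          In this model, a client that keeps reusing an EK derived from a retired BK eventually fails verification once the retention window has passed, which matches the symptom in the exception message.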

          xiaochen Xiao Chen added a comment -

          Discussed offline with Yongjun and understood that he meant retrieveDataEncryptionKey, which uses the provided nonce and key and is indeed a re-generation.
          Applying the suggested comments to that paragraph.
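
          To illustrate why the BK must still exist for such a re-generation, here is a small sketch, assuming the EK is an HMAC of the nonce under the block key. The class, the method names, the HmacSHA1 choice, and the IllegalStateException stand-in for InvalidEncryptionKeyException are assumptions for illustration, not the actual retrieveDataEncryptionKey implementation.

```java
import java.security.GeneralSecurityException;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of re-computing an encryption key (EK) from a nonce and a block key (BK). */
final class EncryptionKeyRecomputeSketch {
    private final Map<Integer, byte[]> blockKeys = new HashMap<>();
    private int currentKeyId;

    /** Register a BK and make it the current one. */
    void addBlockKey(int keyId, byte[] secret) {
        blockKeys.put(keyId, secret.clone());
        currentKeyId = keyId;
    }

    /** Simulate removal of an expired BK. */
    void removeBlockKey(int keyId) {
        blockKeys.remove(keyId);
    }

    /** Sender side: derive a new EK from the current BK and a nonce. */
    byte[] generate(byte[] nonce) {
        return hmac(blockKeys.get(currentKeyId), nonce);
    }

    /** Receiver side: re-compute the EK from the key id and nonce sent by the peer. */
    byte[] recompute(int keyId, byte[] nonce) {
        byte[] secret = blockKeys.get(keyId);
        if (secret == null) {
            // Message modeled on the exception text quoted in the issue description.
            throw new IllegalStateException(
                "Can't re-compute encryption key for nonce, since the required block key (keyID="
                    + keyId + ") doesn't exist. Current key: " + currentKeyId);
        }
        return hmac(secret, nonce);
    }

    private static byte[] hmac(byte[] secret, byte[] nonce) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1"); // assumed algorithm
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            return mac.doFinal(nonce);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

          Under these assumptions, recompute succeeds for any BK that is still stored, including retired ones, and fails only once the BK has been removed.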

          yzhangal Yongjun Zhang added a comment -

          Thanks Xiao Chen, +1 pending jenkins.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 23s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 15m 19s trunk passed
          +1 compile 0m 59s trunk passed
          +1 checkstyle 0m 39s trunk passed
          +1 mvnsite 1m 4s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 1m 55s trunk passed
          +1 javadoc 0m 45s trunk passed
          +1 mvninstall 1m 1s the patch passed
          +1 compile 0m 57s the patch passed
          +1 javac 0m 57s the patch passed
          +1 checkstyle 0m 36s the patch passed
          +1 mvnsite 1m 0s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 47s the patch passed
          +1 javadoc 0m 39s the patch passed
          -1 unit 102m 13s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 26s The patch does not generate ASF License warnings.
          131m 32s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
            hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy
            hadoop.hdfs.server.balancer.TestBalancer
            hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12870593/HDFS-11741.08.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux d70ef5e36f96 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 1543d0f
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19700/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19700/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19700/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 43s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 17m 11s trunk passed
          +1 compile 0m 55s trunk passed
          +1 checkstyle 0m 43s trunk passed
          +1 mvnsite 1m 6s trunk passed
          +1 mvneclipse 0m 18s trunk passed
          +1 findbugs 1m 57s trunk passed
          +1 javadoc 0m 49s trunk passed
          +1 mvninstall 1m 0s the patch passed
          +1 compile 0m 49s the patch passed
          +1 javac 0m 49s the patch passed
          +1 checkstyle 0m 36s the patch passed
          +1 mvnsite 1m 2s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 56s the patch passed
          +1 javadoc 0m 42s the patch passed
          -1 unit 101m 11s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          133m 2s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery
            hadoop.hdfs.server.namenode.TestDecommissioningStatus



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12870598/HDFS-11741.08.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 5aa3dff26374 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 1543d0f
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19703/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19703/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19703/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          xiaochen Xiao Chen added a comment -

          Test failures look unrelated. Attaching a branch-2 patch due to conflicts.

          yzhangal Yongjun Zhang added a comment -

          Thanks Xiao Chen.

          I'm +1 on both the trunk and branch-2 versions. Please run the tests that failed on trunk manually to confirm they pass; they do look unrelated to me. Then please go ahead and commit the patch.

          xiaochen Xiao Chen added a comment -

          Was waiting on the branch-2 pre-commit. Just checked Jenkins and there seems to be none. Kicked off https://builds.apache.org/job/PreCommit-HDFS-Build/19714

          xiaochen Xiao Chen added a comment -

          ... and precommits are having problems, filed INFRA-14261

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          -1 docker 8m 10s Docker failed to build yetus/hadoop:8515d35.



          Subsystem Report/Notes
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12870697/HDFS-11741.branch-2.01.patch
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19727/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          -1 docker 0m 18s Docker failed to build yetus/hadoop:8515d35.



          Subsystem Report/Notes
          JIRA Issue HDFS-11741
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12870697/HDFS-11741.branch-2.01.patch
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19730/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          xiaochen Xiao Chen added a comment -

          INFRA-14261 is fixed, but YETUS-515 surfaced....

          xiaochen Xiao Chen added a comment -

          Turns out YETUS-515 is the same as HADOOP-14474. Commented there to see if we can unblock branch-2 soon.
          If it is not resolved by end of today, I will manually compile and run the related tests, barring objections.

          xiaochen Xiao Chen added a comment -

          Compiled and ran TestBlockToken & TestKeyManager locally on branch-2, passed.
          Ran the failed tests reported by pre-commit on trunk, passed.

          Committed this to trunk, branch-2, and branch-2.8. Thanks Wei-Chiu Chuang for reporting and fixing the issue, and Andrew Wang and Yongjun Zhang for the reviews!

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11813 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11813/)
          HDFS-11741. Long running balancer may fail due to expired (xiao: rev 6a3fc685a98718742c351ed6625dc7a4dee55e77)

          • (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestKeyManager.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSecretManager.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/KeyManager.java
          jojochuang Wei-Chiu Chuang added a comment -

          Thanks Xiao Chen and Yongjun Zhang for pushing the patch to the finish line.

          brahmareddy Brahma Reddy Battula added a comment -

          Wei-Chiu Chuang, nice finding. Since HDFS-9804 was committed to branch-2.7, this JIRA should also go to branch-2.7.

          jojochuang Wei-Chiu Chuang added a comment -

          Good point. Thanks for the reminder.

          Pushed the commit to branch-2.7.
          There was a very trivial conflict due to the HDFS-8103 refactoring.


            People

            • Assignee:
              jojochuang Wei-Chiu Chuang
              Reporter:
              jojochuang Wei-Chiu Chuang
            • Votes:
              0
              Watchers:
              10
