Hadoop HDFS
HDFS-10480

Add an admin command to list currently open files

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha4, 2.8.3
    • Component/s: None
    • Labels: None

      Description

      Currently there is no easy way to obtain the list of active leases or files being written. It would be nice to have an admin command that lists open files along with their lease holders.
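      The feature as committed adds a -listOpenFiles subcommand to dfsadmin. A rough usage sketch (based on the patches discussed below; the exact output columns may vary by release, and running it requires HDFS superuser privileges against a live NameNode):

      ```shell
      # List every file currently open for write, together with the
      # client (lease holder) that has it open.
      hdfs dfsadmin -listOpenFiles
      ```

      The HdfsAdmin Java API discussed later in this thread exposes the same listing programmatically as a remote iterator of open-file entries, which is how the patch's unit tests exercise it.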

      1. HDFS-10480-trunk.patch
        36 kB
        Rushabh S Shah
      2. HDFS-10480-trunk-1.patch
        35 kB
        Rushabh S Shah
      3. HDFS-10480.02.patch
        53 kB
        Manoj Govindassamy
      4. HDFS-10480.03.patch
        53 kB
        Manoj Govindassamy
      5. HDFS-10480.04.patch
        56 kB
        Manoj Govindassamy
      6. HDFS-10480.05.patch
        61 kB
        Manoj Govindassamy
      7. HDFS-10480.06.patch
        62 kB
        Manoj Govindassamy
      8. HDFS-10480.07.patch
        62 kB
        Manoj Govindassamy
      9. HDFS-10480-branch-2.01.patch
        59 kB
        Manoj Govindassamy
      10. HDFS-10480-branch-2.8.01.patch
        59 kB
        Manoj Govindassamy

        Issue Links

          Activity

          manojg Manoj Govindassamy added a comment -

          Thanks for the review Andrew Wang.
          Committed to branch-2.8.

          andrew.wang Andrew Wang added a comment -

          LGTM +1 thanks Manoj!

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 20s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          0 mvndep 0m 34s Maven dependency ordering for branch
          +1 mvninstall 8m 57s branch-2.8 passed
          +1 compile 1m 26s branch-2.8 passed with JDK v1.8.0_131
          +1 compile 1m 31s branch-2.8 passed with JDK v1.7.0_131
          +1 checkstyle 0m 41s branch-2.8 passed
          +1 mvnsite 1m 26s branch-2.8 passed
          +1 findbugs 3m 36s branch-2.8 passed
          +1 javadoc 0m 56s branch-2.8 passed with JDK v1.8.0_131
          +1 javadoc 1m 21s branch-2.8 passed with JDK v1.7.0_131
          0 mvndep 0m 9s Maven dependency ordering for patch
          +1 mvninstall 1m 11s the patch passed
          +1 compile 1m 35s the patch passed with JDK v1.8.0_131
          +1 cc 1m 35s the patch passed
          +1 javac 1m 35s the patch passed
          +1 compile 1m 38s the patch passed with JDK v1.7.0_131
          +1 cc 1m 38s the patch passed
          +1 javac 1m 38s the patch passed
          -0 checkstyle 0m 39s hadoop-hdfs-project: The patch generated 2 new + 1134 unchanged - 0 fixed = 1136 total (was 1134)
          +1 mvnsite 1m 28s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 4m 7s the patch passed
          +1 javadoc 0m 53s the patch passed with JDK v1.8.0_131
          +1 javadoc 1m 20s the patch passed with JDK v1.7.0_131
          +1 unit 1m 16s hadoop-hdfs-client in the patch passed with JDK v1.7.0_131.
          -1 unit 64m 11s hadoop-hdfs in the patch failed with JDK v1.7.0_131.
          +1 asflicense 0m 26s The patch does not generate ASF License warnings.
          160m 20s



          Reason Tests
          JDK v1.8.0_131 Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeMetrics
            hadoop.fs.viewfs.TestViewFsAtHdfsRoot
          JDK v1.8.0_131 Timed out junit tests org.apache.hadoop.hdfs.TestLeaseRecovery2
          JDK v1.7.0_131 Failed junit tests hadoop.hdfs.TestEncryptionZones
          JDK v1.7.0_131 Timed out junit tests org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:d946387
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12874762/HDFS-10480-branch-2.8.01.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml
          uname Linux 0461a61445a5 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2.8 / 69480ba
          Default Java 1.7.0_131
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_131 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_131
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20127/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20127/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_131.txt
          JDK v1.7.0_131 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20127/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20127/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Andrew Wang, Rushabh S Shah,
          Attached branch-2.8 v01 patch. Please take a look.

          manojg Manoj Govindassamy added a comment -

          Sure, Rushabh S Shah, Andrew Wang. Will post the branch-2.8 patch once backported and tested.

          andrew.wang Andrew Wang added a comment -

          Sure, looks like we need a new patch though since the backport doesn't apply cleanly.

          shahrs87 Rushabh S Shah added a comment -

          Can we backport to branch-2.8 also ?

          andrew.wang Andrew Wang added a comment -

          Committed to branch-2, thanks for the contribution Manoj!

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 16m 59s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          0 mvndep 0m 29s Maven dependency ordering for branch
          +1 mvninstall 7m 9s branch-2 passed
          +1 compile 1m 24s branch-2 passed with JDK v1.8.0_131
          +1 compile 1m 26s branch-2 passed with JDK v1.7.0_131
          +1 checkstyle 0m 48s branch-2 passed
          +1 mvnsite 1m 26s branch-2 passed
          +1 findbugs 3m 39s branch-2 passed
          +1 javadoc 1m 3s branch-2 passed with JDK v1.8.0_131
          +1 javadoc 1m 27s branch-2 passed with JDK v1.7.0_131
          0 mvndep 0m 10s Maven dependency ordering for patch
          +1 mvninstall 1m 17s the patch passed
          +1 compile 1m 23s the patch passed with JDK v1.8.0_131
          +1 cc 1m 23s the patch passed
          +1 javac 1m 23s the patch passed
          +1 compile 1m 24s the patch passed with JDK v1.7.0_131
          +1 cc 1m 24s the patch passed
          +1 javac 1m 24s the patch passed
          -0 checkstyle 0m 42s hadoop-hdfs-project: The patch generated 2 new + 1153 unchanged - 1 fixed = 1155 total (was 1154)
          +1 mvnsite 1m 24s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 0s The patch has no ill-formed XML file.
          +1 findbugs 4m 4s the patch passed
          +1 javadoc 0m 55s the patch passed with JDK v1.8.0_131
          +1 javadoc 1m 20s the patch passed with JDK v1.7.0_131
          +1 unit 1m 11s hadoop-hdfs-client in the patch passed with JDK v1.7.0_131.
          -1 unit 70m 34s hadoop-hdfs in the patch failed with JDK v1.7.0_131.
          +1 asflicense 0m 22s The patch does not generate ASF License warnings.
          193m 12s



          Reason Tests
          JDK v1.8.0_131 Failed junit tests hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation
          JDK v1.8.0_131 Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
          JDK v1.7.0_131 Failed junit tests hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain
          JDK v1.7.0_131 Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:5e40efe
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12873331/HDFS-10480-branch-2.01.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml
          uname Linux 82bda63f1139 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2 / 43fb793
          Default Java 1.7.0_131
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_131 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_131
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19930/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19930/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_131.txt
          JDK v1.7.0_131 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19930/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19930/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Thanks for the review and commit help Andrew Wang.

          Attached the branch-2 patch. One of the dependent tests in TestLeaseManager is not available in branch-2, so a few additions to that test were removed from the patch. Other than that, the three tests that verify the full feature are included in the attached branch-2 patch.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11872 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11872/)
          HDFS-10480. Add an admin command to list currently open files. (wang: rev fb68980959f95f0d89e86f91909867724ad01791)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSAdmin.java
          • (add) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/OpenFileEntry.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsAdmin.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHdfsAdmin.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
          • (add) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListOpenFiles.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestLeaseManager.java
          • (add) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/OpenFilesIterator.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
          andrew.wang Andrew Wang added a comment -

          Also, if you want to start working on HDFS-11847 concurrently, happy to review that too. It'd be nice to have the functionality for the full admin workflow ready by the time this makes its way into a release.

          andrew.wang Andrew Wang added a comment -

          I had to do a small imports conflict resolution to get this to apply to trunk, committed it there.

          However, I was unable to cherry-pick back to branch-2. Manoj, could you supply a branch-2 patch? Thanks!

          andrew.wang Andrew Wang added a comment -

          LGTM thanks for working on this Manoj! Will commit this shortly.

          manojg Manoj Govindassamy added a comment -

          Above unit test failures are not related to the patch.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 23s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 5 new or modified test files.
          0 mvndep 0m 25s Maven dependency ordering for branch
          +1 mvninstall 15m 59s trunk passed
          +1 compile 1m 52s trunk passed
          +1 checkstyle 1m 6s trunk passed
          +1 mvnsite 1m 52s trunk passed
          +1 findbugs 3m 33s trunk passed
          +1 javadoc 1m 19s trunk passed
          0 mvndep 0m 9s Maven dependency ordering for patch
          +1 mvninstall 1m 48s the patch passed
          +1 compile 1m 56s the patch passed
          +1 cc 1m 56s the patch passed
          +1 javac 1m 56s the patch passed
          -0 checkstyle 1m 4s hadoop-hdfs-project: The patch generated 2 new + 1167 unchanged - 1 fixed = 1169 total (was 1168)
          +1 mvnsite 1m 53s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 3m 48s the patch passed
          +1 javadoc 1m 15s the patch passed
          +1 unit 1m 22s hadoop-hdfs-client in the patch passed.
          -1 unit 95m 11s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 25s The patch does not generate ASF License warnings.
          137m 19s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12872186/HDFS-10480.07.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml
          uname Linux efd6513473f5 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 7c9694c
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19843/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19843/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19843/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19843/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment - - edited

          Thanks for the review comments Andrew Wang. Attached v07 patch to address the following. Please take a look.

          DFSAdmin, I'd prefer that we don't print a special message when there aren't any open files. Just print the header with no entries.

          Done.

          DFSAdmin help text needs an additional linebreak and tab to wrap the long line in the output

          Done.

          Thinking about it a little more, we can remove the HdfsAdmin API and make the CLI the only public API. A number of DFSAdmin commands don't have corresponding HdfsAdmin APIs (e.g. evictWriters, triggerBlockReport). We can always add the Java API later if there's demand.

          Given that this jira is very useful for debugging problems, I am anticipating HdfsAdmin API demand from automation users who are already programming against the interface. Unit tests are also added to verify the API. Inclined to retain it for now. There are also follow-up enhancements planned on top of this jira, and if the API needs to be removed that can be taken care of later. Thanks.

          andrew.wang Andrew Wang added a comment -

          Thanks for working on this Manoj. LGTM overall, a few nits:

          • DFSAdmin, I'd prefer that we don't print a special message when there aren't any open files. Just print the header with no entries.
          • DFSAdmin help text needs an additional linebreak and tab to wrap the long line in the output
          • Thinking about it a little more, we can remove the HdfsAdmin API and make the CLI the only public API. A number of DFSAdmin commands don't have corresponding HdfsAdmin APIs (e.g. evictWriters, triggerBlockReport). We can always add the Java API later if there's demand.
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 22s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 5 new or modified test files.
          0 mvndep 0m 8s Maven dependency ordering for branch
          +1 mvninstall 15m 2s trunk passed
          +1 compile 1m 47s trunk passed
          +1 checkstyle 0m 55s trunk passed
          +1 mvnsite 1m 36s trunk passed
          +1 mvneclipse 0m 27s trunk passed
          +1 findbugs 3m 4s trunk passed
          +1 javadoc 1m 8s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 23s the patch passed
          +1 compile 1m 25s the patch passed
          +1 cc 1m 25s the patch passed
          +1 javac 1m 25s the patch passed
          -0 checkstyle 0m 50s hadoop-hdfs-project: The patch generated 2 new + 1173 unchanged - 1 fixed = 1175 total (was 1174)
          +1 mvnsite 1m 22s the patch passed
          +1 mvneclipse 0m 23s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 3m 14s the patch passed
          +1 javadoc 0m 59s the patch passed
          +1 unit 1m 13s hadoop-hdfs-client in the patch passed.
          -1 unit 109m 50s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 22s The patch does not generate ASF License warnings.
          147m 16s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
            hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport
            hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration
            hadoop.hdfs.server.datanode.TestDirectoryScanner
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150
            hadoop.hdfs.server.namenode.ha.TestHAAppend
            hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped
            hadoop.hdfs.server.namenode.TestDecommissioningStatus
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010
            hadoop.hdfs.server.datanode.TestDataNodeUUID
          Timed out junit tests org.apache.hadoop.hdfs.TestLeaseRecovery2



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12870142/HDFS-10480.06.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml
          uname Linux ee2f0f0f135b 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 2cd612b
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19640/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19640/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19640/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19640/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Andrew Wang, Attached v06 patch to address the following

          testListOpenFilesInHA, it'd be nice to see what happens when there's a failover between batches while iterating.

          Done. Updated testListOpenFilesInHA to perform NN failover in parallel with a client thread listing open files in a loop.
          Also fixed checkstyle issues.

          Rushabh S Shah,
          Thanks for the review comments.

          Instead of creating SortedMap<Long, Lease> we can just create SortedSet<Long> and call getINodeIdWithLeases to populate it.

          We actually need a small batch of inodes, starting from the cursor point (the last inode already returned to the client). So we make use of the tailMap functionality of TreeMap to get this remote-iterator behavior.
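The tailMap-based cursor batching described above can be sketched with plain JDK collections. This is a hypothetical stand-in, not the actual LeaseManager code; names like nextBatch and the map contents are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class TailMapBatching {
  // Return up to batchSize ids strictly greater than prevId, the
  // last id already returned to the client (the "cursor").
  static List<Long> nextBatch(TreeMap<Long, String> leasesById,
                              long prevId, int batchSize) {
    // tailMap(prevId, false) is a live view of the entries with keys
    // strictly greater than prevId; no copy of the map is made.
    SortedMap<Long, String> remaining = leasesById.tailMap(prevId, false);
    List<Long> batch = new ArrayList<>();
    for (Long id : remaining.keySet()) {
      if (batch.size() >= batchSize) {
        break;
      }
      batch.add(id);
    }
    return batch;
  }

  public static void main(String[] args) {
    TreeMap<Long, String> leases = new TreeMap<>();
    for (long id = 1; id <= 10; id++) {
      leases.put(id, "client-" + id);
    }
    System.out.println(nextBatch(leases, 0, 4));  // → [1, 2, 3, 4]
    System.out.println(nextBatch(leases, 4, 4));  // resumes after id 4 → [5, 6, 7, 8]
  }
}
```

Because tailMap returns a view rather than a copy, each batch request costs O(log n) to position plus O(batchSize) to collect, which is what makes the TreeMap-backed remote iterator cheap.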

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 22s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 5 new or modified test files.
          0 mvndep 0m 22s Maven dependency ordering for branch
          +1 mvninstall 12m 34s trunk passed
          +1 compile 1m 20s trunk passed
          +1 checkstyle 0m 54s trunk passed
          +1 mvnsite 1m 20s trunk passed
          +1 mvneclipse 0m 30s trunk passed
          +1 findbugs 2m 44s trunk passed
          +1 javadoc 0m 58s trunk passed
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 13s the patch passed
          +1 compile 1m 16s the patch passed
          +1 cc 1m 16s the patch passed
          +1 javac 1m 16s the patch passed
          -0 checkstyle 0m 49s hadoop-hdfs-project: The patch generated 11 new + 1173 unchanged - 1 fixed = 1184 total (was 1174)
          +1 mvnsite 1m 16s the patch passed
          +1 mvneclipse 0m 24s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 3m 3s the patch passed
          +1 javadoc 0m 53s the patch passed
          +1 unit 1m 10s hadoop-hdfs-client in the patch passed.
          -1 unit 63m 38s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 22s The patch does not generate ASF License warnings.
          96m 43s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
            hadoop.hdfs.web.TestWebHdfsTimeouts
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160
            hadoop.hdfs.server.balancer.TestBalancerRPCDelay



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12870023/HDFS-10480.05.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml
          uname Linux 639197c6ee83 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 47474ff
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19630/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19630/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19630/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19630/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment - - edited

          Thanks for the review comments Andrew Wang. Attached v05 patch to address the following.

          One high-level question first, what do we envision as the usecases for this command? I figured it was for: Debugging lease manager state

          That's right. The prime use of this jira fix is to provide an admin command to debug LeaseManager state and a diagnostics platform for issues around open files. There were several cases in the past where stale files stayed open for a very long time without data being actively written to them. The fsck way of finding open files is very time-consuming and degrades cluster performance. The proposed admin command is very lightweight and lists all open files along with client details. The admin can then decide whether to run lease recovery.

          Finding open files that are blocking decommission

          Yes. The plan is to extend the above admin command to help diagnose decommissioning and maintenance state issues arising from open files. HDFS-11847 will take care of this.

          We probably shouldn't skip erroneous leases:

          True. These files with a valid lease but not in under-construction state might be useful for diagnosing. But the client name/machine details are part of the UnderConstruction feature in the INode. So for the non-UC files with leases, shall we instead show a warning or error message in place of the client name and machine?

          For the second, the admin is wondering why some DN hasn't finished decomming yet, and wants to find the UC blocks and the client and path. It looks like HDFS-11847 will make this easy, without needing to resort to fsck. Nice. But what's the workflow where we need HDFS-11848? This new command is much lighter weight than fsck -openforwrite, so I'd like to encourage users to use the new command instead. Just wondering, before we add some new functionality.

          This is an enhancement to the first use case to make the dfsadmin -listOpenFiles command much more lightweight and easier to use. When the open-file count is huge, listing them all with the dfsadmin command, though lightweight, might take several iterations to report the entire list. If the admin is interested only in specific paths, listing open files under a path might be much faster and yield an easier-to-read response list. Anyway, open for discussion on the need for this enhancement.

          Maybe bump the NUM_RESPONSES limit to 1000, to match DFS_LIST_LIMIT?

          Done.

          Should the precondition check for NUM_RESPONSES check for > 0 rather than >= 0 ? FWIW, 0 is also not a positive integer.

          That's right. 0 response entries doesn't make sense. Changed it to > 0.

          Based on HDFS-9395, we should only generate an audit event when the op is successful or fails due to an ACE. Notably, it should not log for things like an IOE.

          Done. Followed the usual pattern.

          LeaseManager#getUnderConstructionFiles makes a new TreeMap out of leasesById. This is potentially a lot of garbage. Can we make leasesById a TreeMap instead to avoid this? TreeMaps still have pretty good performance.

          Done. I was worried about the performance of the LeaseManager with the HashMap switched to a TreeMap. A HashMap has better put/get performance than a TreeMap. But if that's not significant for the predominant use case of, say, a maximum number of open files on the order of thousands, then we should be OK.

          Can we also add an assert that the FSN read lock is held?

          Done.

          Testing:

          I like the step-up/step-down with the open and closed file sets. Could we take the verification one step further, and do it in a for-loop? This way we test all the way from 0..numOpenFiles rather than just at numOpenFiles and numOpenFiles/2

          Done. Also, moved the utils to DFSTestUtil so as to reduce code duplication.

          testListOpenFilesInHA, it'd be nice to see what happens when there's a failover between batches while iterating. I also suggest perhaps moving this into TestListOpenFiles since it doesn't really relate to append.

          Moved the test to TestListOpenFiles. Will need some kind of delay simulator during listing so as to effectively test the listing and failover in parallel. Will take this up as part of HDFS-11847, if you are ok.

          Do we have any tests for the HdfsAdmin API? It'd be better to test against this than the one in DistributedFileSystem, since our end users will be programming against HdfsAdmin.

          Done. Added a test in TestHdfsAdmin.

          andrew.wang Andrew Wang added a comment -

          Thanks for working on this Manoj. Looks good overall!

          One high-level question first, what do we envision as the usecases for this command? I figured it was for:

          1. Debugging lease manager state
          2. Finding open files that are blocking decommission

          To do the first, we probably shouldn't skip erroneous leases:

                if (!inodeFile.isUnderConstruction()) {
                  LOG.warn("The file " + inodeFile.getFullPathName()
                      + " is not under construction but has lease.");
                  continue;
                }
          

          The admin invoking the command also won't see this WARN since it goes to the NN log, not the client log. The log is still a bit useful, but there should be some non-NN-log way for admins to debug erroneous state here. I guess they can cross-check with fsck information?

          For the second, the admin is wondering why some DN hasn't finished decomming yet, and wants to find the UC blocks and the client and path. It looks like HDFS-11847 will make this easy, without needing to resort to fsck. Nice.

          But what's the workflow where we need HDFS-11848? This new command is much lighter weight than fsck -openforwrite, so I'd like to encourage users to use the new command instead. Just wondering, before we add some new functionality.

          Some review comments:

          • Maybe bump the NUM_RESPONSES limit to 1000, to match DFS_LIST_LIMIT?
          • Should the precondition check for NUM_RESPONSES check for > 0 rather than >= 0 ? FWIW, 0 is also not a positive integer.
          • Based on HDFS-9395, we should only generate an audit event when the op is successful, or fails due to an ACE. Notably, it should not log for things like an IOE.
          • LeaseManager#getUnderConstructionFiles makes a new TreeMap out of leasesById. This is potentially a lot of garbage. Can we make leasesById a TreeMap instead to avoid this? TreeMaps still have pretty good performance.
          • Can we also add an assert that the FSN read lock is held?

          Testing:

          • I like the step-up/step-down with the open and closed file sets. Could we take the verification one step further, and do it in a for-loop? This way we test all the way from 0..numOpenFiles rather than just at numOpenFiles and numOpenFiles/2
          • testListOpenFilesInHA, it'd be nice to see what happens when there's a failover between batches while iterating. I also suggest perhaps moving this into TestListOpenFiles since it doesn't really relate to append.
          • Do we have any tests for the HdfsAdmin API? It'd be better to test against this than the one in DistributedFileSystem, since our end users will be programming against HdfsAdmin.
          Hide
          andrew.wang Andrew Wang added a comment -

          I'll review this shortly.

          I never understand the need to check twice for checkOperation(OperationCategory.READ). It is all over namenode code.

          The issue is that the NN can go from being active to standby between the two checks. While the FSN lock is held, it will not transition HA state. The second check is sufficient for correctness, but the first one helps as an early-exit for performance.
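The check-early, check-again-under-lock pattern Andrew describes can be modeled in a small standalone sketch. The class and method names below are stand-ins for illustration (a real NameNode throws StandbyException and uses FSNamesystem's lock), not the actual HDFS implementation:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model of the double checkOperation pattern discussed above.
public class DoubleCheckSketch {
    private final AtomicBoolean active = new AtomicBoolean(true);
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();

    void checkOperation() {
        // Stand-in for throwing StandbyException when this NN cannot serve reads.
        if (!active.get()) {
            throw new IllegalStateException("StandbyException: not active");
        }
    }

    String listOpenFiles() {
        checkOperation();          // cheap early exit before taking the lock
        fsLock.readLock().lock();
        try {
            // Re-check: state may have flipped between the first check and
            // acquiring the lock; it cannot change while the lock is held.
            checkOperation();
            return "open-files-batch";
        } finally {
            fsLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        DoubleCheckSketch nn = new DoubleCheckSketch();
        System.out.println(nn.listOpenFiles());
        nn.active.set(false);      // simulate a failover to standby
        try {
            nn.listOpenFiles();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The second check alone would be correct; the first one just avoids lock contention on a standby node.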

          shahrs87 Rushabh S Shah added a comment -

          Thanks for carrying forward the work Manoj Govindassamy.
          I just took a cursory glance over the patch. Will provide a detailed review sometime next week.
          I had a couple of comments.
          1.

          LeaseManager.java
          public BatchedListEntries<OpenFileEntry> getUnderConstructionFiles(
                final long prevId) throws IOException {
              SortedMap<Long, Lease> remainingLeases;
              synchronized (this) {
                remainingLeases = new TreeMap<>(leasesById).tailMap(prevId, false);
              }
          ...
          } 
          

          Instead of creating SortedMap<Long, Lease> we can just create SortedSet<Long> and call getINodeIdWithLeases to populate it.

          2.

          FSNamesystem.java
            BatchedListEntries<OpenFileEntry> listOpenFiles(long prevId)
                throws IOException {
              final String operationName = "listOpenFiles";
              boolean success = false;
              checkSuperuserPrivilege();
              checkOperation(OperationCategory.READ);
              readLock();
              try {
                checkOperation(OperationCategory.READ);
          ....
          } 
          

          I never understand the need to check twice for checkOperation(OperationCategory.READ). It is all over namenode code.
          Do we need to check it twice?
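For context on the batching in point 1: getUnderConstructionFiles resumes from the last-returned inode id via tailMap. A minimal standalone sketch of that pagination pattern follows, with toy data and hypothetical names rather than the actual LeaseManager code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class TailMapPagingSketch {
    static final int BATCH_SIZE = 3;

    // Return up to BATCH_SIZE keys strictly greater than prevId.
    static List<Long> nextBatch(TreeMap<Long, String> leasesById, long prevId) {
        // tailMap(prevId, false) is a view: no copy of the map is made here.
        SortedMap<Long, String> remaining = leasesById.tailMap(prevId, false);
        List<Long> batch = new ArrayList<>();
        for (Long inodeId : remaining.keySet()) {
            batch.add(inodeId);
            if (batch.size() >= BATCH_SIZE) {
                break;
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> leases = new TreeMap<>();
        for (long id = 1; id <= 7; id++) {
            leases.put(id, "client-" + id);
        }
        long prevId = 0;
        List<Long> batch;
        while (!(batch = nextBatch(leases, prevId)).isEmpty()) {
            System.out.println(batch);
            prevId = batch.get(batch.size() - 1);  // resume after the last id seen
        }
    }
}
```

Keeping leasesById as a TreeMap (as Andrew suggested) lets tailMap produce this view cheaply, instead of building a fresh TreeMap per RPC.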

          linyiqun Yiqun Lin added a comment -

          The latest patch LGTM. Thanks Manoj Govindassamy for fixing this!
          Hi Andrew Wang, do you have any other comments on the latest patch? I see you gave some review comments earlier.
          I will hold off committing for a couple of days in case there are other comments.
          Thanks

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 23s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          0 mvndep 0m 23s Maven dependency ordering for branch
          +1 mvninstall 14m 15s trunk passed
          +1 compile 1m 44s trunk passed
          +1 checkstyle 0m 52s trunk passed
          +1 mvnsite 1m 36s trunk passed
          +1 mvneclipse 0m 27s trunk passed
          +1 findbugs 3m 1s trunk passed
          +1 javadoc 1m 3s trunk passed
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 23s the patch passed
          +1 compile 1m 26s the patch passed
          +1 cc 1m 26s the patch passed
          +1 javac 1m 26s the patch passed
          -0 checkstyle 0m 51s hadoop-hdfs-project: The patch generated 3 new + 1084 unchanged - 0 fixed = 1087 total (was 1084)
          +1 mvnsite 1m 22s the patch passed
          +1 mvneclipse 0m 24s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 3m 17s the patch passed
          +1 javadoc 0m 54s the patch passed
          +1 unit 1m 10s hadoop-hdfs-client in the patch passed.
          -1 unit 72m 26s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          109m 0s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12869077/HDFS-10480.04.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml
          uname Linux a5549aea18f2 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 9855225
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19522/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19522/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19522/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19522/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Thanks for the review Yiqun Lin. Attached v04 patch to address the following comment. Kindly take a look.

          When the listOpenFiles command is used in HA mode, the Tracer passed to OpenFilesIterator will be null.

          Good catch. Fixed this and added a simple test for HA to make sure the command doesn't crash.

          linyiqun Yiqun Lin added a comment - - edited

          Thanks for the updated patch, Manoj Govindassamy! The latest patch almost looks good to me now. Only one comment from me:
          When the listOpenFiles command is used in HA mode, the Tracer passed to OpenFilesIterator will be null, and this will lead to an NPE. The related code in OpenFilesIterator:

          +  public BatchedEntries<OpenFileEntry> makeRequest(Long prevId)
          +      throws IOException {
          +    try (TraceScope ignored = tracer.newScope("listOpenFiles")) {   <== there is a chance that tracer will be null
          +      return namenode.listOpenFiles(prevId);
          +    }
          +  }
          

          The code in DFSAdmin#listOpenFiles:

            public int listOpenFiles() throws IOException {
              DistributedFileSystem dfs = getDFS();
              Configuration dfsConf = dfs.getConf();
              URI dfsUri = dfs.getUri();
              boolean isHaEnabled = HAUtilClient.isLogicalUri(dfsConf, dfsUri);
          
              RemoteIterator<OpenFileEntry> openFilesRemoteIterator;
              if (isHaEnabled) {
                ProxyAndInfo<ClientProtocol> proxy = NameNodeProxies.createNonHAProxy(
                    dfsConf, HAUtil.getAddressOfActive(getDFS()), ClientProtocol.class,
                    UserGroupInformation.getCurrentUser(), false);
                openFilesRemoteIterator = new OpenFilesIterator(proxy.getProxy(), null);
              } else {
          
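One way to avoid the NPE Yiqun describes is to guard the scope creation when the tracer may be null. The sketch below is standalone and uses stand-in Tracer/TraceScope types rather than the actual HTrace API, and it is only one possible guard, not necessarily the fix that was committed:

```java
public class NullTracerSketch {
    // Stand-ins for the tracing types used in OpenFilesIterator.
    static class TraceScope implements AutoCloseable {
        @Override
        public void close() { }
    }
    static class Tracer {
        TraceScope newScope(String name) { return new TraceScope(); }
    }

    private final Tracer tracer;  // may legitimately be null in the HA dfsadmin path

    NullTracerSketch(Tracer tracer) {
        this.tracer = tracer;
    }

    String makeRequest(long prevId) {
        // try-with-resources skips close() for a null resource, so guarding
        // the newScope() call is enough to avoid the NPE.
        try (TraceScope ignored =
                 (tracer == null) ? null : tracer.newScope("listOpenFiles")) {
            return "batch-after-" + prevId;
        }
    }

    public static void main(String[] args) {
        System.out.println(new NullTracerSketch(null).makeRequest(0));
        System.out.println(new NullTracerSketch(new Tracer()).makeRequest(5));
    }
}
```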
          manojg Manoj Govindassamy added a comment - - edited

          The above test failures are not related to the patch. They all pass in a local run for me.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 16s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
          0 mvndep 0m 33s Maven dependency ordering for branch
          +1 mvninstall 14m 6s trunk passed
          +1 compile 1m 39s trunk passed
          +1 checkstyle 0m 54s trunk passed
          +1 mvnsite 1m 39s trunk passed
          +1 mvneclipse 0m 29s trunk passed
          +1 findbugs 3m 20s trunk passed
          +1 javadoc 1m 9s trunk passed
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 38s the patch passed
          +1 compile 1m 45s the patch passed
          +1 cc 1m 45s the patch passed
          +1 javac 1m 45s the patch passed
          -0 checkstyle 0m 53s hadoop-hdfs-project: The patch generated 2 new + 1092 unchanged - 0 fixed = 1094 total (was 1092)
          +1 mvnsite 1m 41s the patch passed
          +1 mvneclipse 0m 26s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 3m 39s the patch passed
          +1 javadoc 0m 57s the patch passed
          +1 unit 1m 12s hadoop-hdfs-client in the patch passed.
          -1 unit 69m 36s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          107m 51s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.balancer.TestBalancer
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12868806/HDFS-10480.03.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml
          uname Linux f14ae2b23094 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 40e6a85
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19496/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19496/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19496/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19496/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Thanks for the detailed review Yiqun Lin. Attached v03 patch with the following comments addressed. Please take a look.

          In TestDFSAdmin#verifyOpenFilesListing, line 676 LOG.info(out);, can we add one line like LOG.info("Open files: ") before this?

          Added a prefix string 'dfsadmin -listOpenFiles output:' before logging the command output.

          In TestDFSAdmin#verifyOpenFilesListing, line 677, the name openFilePath should be closedFilePath

          Fixed the variable name.

          In TestListOpenFiles#verifyOpenFiles, line 145, "open files not listed!" should be " open files not listed!". There is one space missing.

          Added a space in the above logging.

          Can you update the param name count to numFilesToCreate in method createOpenFiles(Path parentDir, String fileNamePrefix, int count)? The same for the closeFiles method, where we can use numFilesToClose. It will be easier to understand.

          Done.

          The operation count-- is missing in the following code.

          Good catch. Even without the count decrement, the loop still terminates via the iterator traversal, which is why the test was passing. Nevertheless, the intention was to close only count (numFilesToClose) files and not all of them. Fixed now.

          here should do verifyOpenFiles(openFiles) since we have closed some files and should check the open files again.

          Added the verification here.

          What are the following lines intended to do? In addition, the file name prefix is the same.

          In fact, the files are being created with the overwrite flag set to true, so creating the files again overwrites them. Also, the number of files to create is much larger than in the last iteration, so part of the files get overwritten and part get newly created. To avoid confusion in the code, I changed the prefix to "open-3" for the last batch of creates.
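The overwrite semantics Manoj relies on can be illustrated outside HDFS with plain java.nio; an overwrite-style create replaces an existing file and creates a missing one fresh (Hadoop's FileSystem.create(path, true) behaves analogously, though this sketch does not use the Hadoop API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OverwriteSketch {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("open-2_", ".txt");
        Files.writeString(p, "first batch");
        // CREATE + TRUNCATE_EXISTING mirrors an overwrite=true create:
        // an existing file is replaced, a missing one is created fresh.
        Files.writeString(p, "second batch",
            StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
        System.out.println(Files.readString(p));
        Files.delete(p);
    }
}
```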

          PS: A few checkstyle findings are about method length exceeding 150 lines and indentation; they were not introduced by this patch, so I am not fixing them.

          linyiqun Yiqun Lin added a comment -

          Thanks Manoj Govindassamy for helping file the JIRAs.
          I took some time to review the unit tests in the patch. I have some minor comments:

          1. In TestDFSAdmin#verifyOpenFilesListing, line 676 LOG.info(out);, can we add one line like LOG.info("Open files: ") before this? It will help us understand what is printed in the following lines.
          2. In TestDFSAdmin#verifyOpenFilesListing, line 677, the name openFilePath should be closedFilePath.
          3. In TestListOpenFiles#verifyOpenFiles, line 145, "open files not listed!" should be " open files not listed!". There is one space missing.
          4. Can you update the param name count to numFilesToCreate in method createOpenFiles(Path parentDir, String fileNamePrefix, int count)? The same for the closeFiles method, where we can use numFilesToClose. It will be easier to understand.
          5. The operation count-- is missing in the following code.

            private void closeFiles(Map<String, FSDataOutputStream> openFiles,
                int count) throws IOException {
              for (Iterator<Entry<String, FSDataOutputStream>> it =
                   openFiles.entrySet().iterator(); it.hasNext(); ) {
                Entry<String, FSDataOutputStream> entry = it.next();
                entry.getValue().close();
                LOG.info("Closed file: " + entry.getKey());
                it.remove();              <===== One line missing here: count--
                if (count == 0) {
                  break;
                }
              }
            }
          

          6. I am a little confused by the following code:

            public void testListOpenFilesViaNameNodeRPC() throws Exception {
              Map<String, FSDataOutputStream> openFiles = new HashMap<>();
              Path baseDir = new Path("/testListOpenFiles");
              fs.mkdirs(baseDir);
          
              createFiles(baseDir, "closed", 10);
              verifyOpenFiles(openFiles);
              BatchedEntries<OpenFileEntry> openFileEntryBatchedEntries =
                  nnRpc.listOpenFiles(0);
              assertTrue("Open files list should be empty!",
                  openFileEntryBatchedEntries.size() == 0);
          
              openFiles.putAll(createOpenFiles(baseDir, "open-1", 1));
              verifyOpenFiles(openFiles);
          
              openFiles.putAll(createOpenFiles(baseDir, "open-2",
                  (BATCH_SIZE * 2 + BATCH_SIZE / 2)));
              verifyOpenFiles(openFiles);
          
              closeFiles(openFiles, openFiles.size() / 2);
                                                                    <==== here should do verifyOpenFiles(openFiles) since we have closed some files and should check the open files again. What are the following lines intended to do? In addition, the file name prefix is the same.
              openFiles.putAll(createOpenFiles(baseDir, "open-2",
                  (BATCH_SIZE * 5)));                  
              verifyOpenFiles(openFiles);
            }
          
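The count-- omission in point 5 means closeFiles would close every open file instead of stopping after count. A corrected, standalone version of the loop is sketched below, with toy StringBuilder "streams" standing in for FSDataOutputStream:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;

public class CloseFilesSketch {
    // Close at most `count` of the open "streams", removing them from the map.
    static void closeFiles(Map<String, StringBuilder> openFiles, int count) {
        for (Iterator<Entry<String, StringBuilder>> it =
                 openFiles.entrySet().iterator(); it.hasNext(); ) {
            Entry<String, StringBuilder> entry = it.next();
            System.out.println("closed " + entry.getKey());
            it.remove();
            count--;               // the decrement the review comment points out
            if (count == 0) {
                break;
            }
        }
    }

    public static void main(String[] args) {
        Map<String, StringBuilder> open = new HashMap<>();
        for (int i = 0; i < 6; i++) {
            open.put("open-" + i, new StringBuilder());
        }
        closeFiles(open, open.size() / 2);   // close half of the 6 files
        System.out.println("still open: " + open.size());
    }
}
```

Without the decrement, the break is never reached and the map ends up empty, which is exactly the behavior the reviewer flagged.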
          manojg Manoj Govindassamy added a comment -

          Filed HDFS-11847 and HDFS-11848 to track the enhancements as discussed in the above few comments.

          manojg Manoj Govindassamy added a comment -

          Thanks for the suggestion Yiqun Lin. Totally agree, the above enhancements will make the command much more useful. In fact, I have already started working on the first one, which is to list the files blocking decommissioning nodes. Will file a jira for it soon. The other one, to list open files under a given path, is also very useful. I too believe these enhancements are better off going into new jiras. Will file them soon so you can take a look. Thanks for the help.

          linyiqun Yiqun Lin added a comment -

          Thanks Manoj Govindassamy for the great work! Just some thoughts from me.

          One case mentioned above:

          "which open files are blocking server decommissioning"

          I am thinking we can add a new option like -node to filter the open files which are under these decommissioning nodes. This will help us find these files, and then we can use the command to revoke the lease and close the file. Also we can add a new option -path to list only the files under the given path, with no need to list all the open files. I suppose these two additional options will make this command more useful. If we agree on this, I can help make this improvement in another ticket.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
          0 mvndep 0m 33s Maven dependency ordering for branch
          +1 mvninstall 16m 10s trunk passed
          +1 compile 1m 25s trunk passed
          +1 checkstyle 0m 52s trunk passed
          +1 mvnsite 1m 28s trunk passed
          +1 mvneclipse 0m 28s trunk passed
          +1 findbugs 3m 13s trunk passed
          +1 javadoc 1m 2s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 22s the patch passed
          +1 compile 1m 24s the patch passed
          +1 cc 1m 24s the patch passed
          +1 javac 1m 24s the patch passed
          -0 checkstyle 0m 50s hadoop-hdfs-project: The patch generated 11 new + 1089 unchanged - 3 fixed = 1100 total (was 1092)
          +1 mvnsite 1m 27s the patch passed
          +1 mvneclipse 0m 25s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 3m 47s the patch passed
          +1 javadoc 1m 5s the patch passed
          +1 unit 1m 12s hadoop-hdfs-client in the patch passed.
          -1 unit 75m 15s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 40s The patch does not generate ASF License warnings.
          114m 42s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
            hadoop.hdfs.server.datanode.TestDirectoryScanner
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12868608/HDFS-10480.02.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml
          uname Linux 144aeebe38aa 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / eb7791b
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19477/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19477/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19477/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19477/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Thanks for the review Andrew Wang, Kihwal Lee. Attached v02 patch to address the following. Can you please take a look.

          Is there a reason for dumping the info to a file on the NN? This makes it more difficult for admins to get the information, and is more complicated than just printing it out on the command line. Allowing a user-specified name that isn't validated is also a possible security issue. This also means normal users can't use this, since they won't have access to the NN's log directory.

          The design is changed now. The client gets a RemoteIterator for the open files, and the list is retrieved in batches from the NameNode. The fetching batch size is configurable. This lightweight model helps the NameNode serve even a huge list with ease.
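          The batched-iterator model described here can be sketched generically. The class and method names below are illustrative only, not the actual Hadoop RemoteIterator/OpenFilesIterator API, and the in-memory list stands in for the NameNode's lease state:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: the caller sees a plain iterator, while entries are
// fetched lazily from a (simulated) server in fixed-size batches, so the
// server never has to materialize the full list in one response.
public class BatchedOpenFilesIterator implements Iterator<String> {
  private final List<String> serverSideOpenFiles; // stands in for NameNode state
  private final int batchSize;
  private List<String> currentBatch = new ArrayList<>();
  private int nextServerIndex = 0;                // "cookie" sent back per RPC
  private int batchPos = 0;

  public BatchedOpenFilesIterator(List<String> serverSideOpenFiles, int batchSize) {
    this.serverSideOpenFiles = serverSideOpenFiles;
    this.batchSize = batchSize;
    fetchNextBatch();
  }

  // Simulates one RPC: return at most batchSize entries starting at the cookie.
  private void fetchNextBatch() {
    int end = Math.min(nextServerIndex + batchSize, serverSideOpenFiles.size());
    currentBatch = new ArrayList<>(serverSideOpenFiles.subList(nextServerIndex, end));
    nextServerIndex = end;
    batchPos = 0;
  }

  @Override
  public boolean hasNext() {
    if (batchPos < currentBatch.size()) {
      return true;
    }
    if (nextServerIndex < serverSideOpenFiles.size()) {
      fetchNextBatch();
      return batchPos < currentBatch.size();
    }
    return false;
  }

  @Override
  public String next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return currentBatch.get(batchPos++);
  }

  public static void main(String[] args) {
    List<String> files = new ArrayList<>();
    for (int i = 0; i < 7; i++) {
      files.add("/open-" + i);
    }
    // Batch size 3 -> batches of 3, 3, 1; the caller just iterates.
    BatchedOpenFilesIterator it = new BatchedOpenFilesIterator(files, 3);
    List<String> seen = new ArrayList<>();
    while (it.hasNext()) {
      seen.add(it.next());
    }
    System.out.println(seen.size());  // 7
    System.out.println(seen.get(0));  // /open-0
    System.out.println(seen.get(6));  // /open-6
  }
}
```

The cookie-per-batch design is what keeps the call cheap on the server side: each RPC does a bounded amount of work regardless of how many files are open in total.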

          Let's not change the import to a wildcard, it makes backports harder.

          Done.

          Shouldn't this only go to the active NN, since it has up-to-date info about writers? This is in reference to the Operation.UNCHECKED and the HA logic in DFSAdmin.

          Done.

          Nit: "getUnderconstructionFiles" -> "getUnderConstructionFiles"

          Done.

          Could you also add a Java API to HdfsAdmin?

          Done.

          One more thing that would be nice here is to filter the output on a passed path or DN. Use cases: An admin might already know a stale file by path (perhaps from fsck's -openforwrite), and wants to figure out who the lease holder is. A DN might be blocked from decommissioning by an open-for-write file, and the admin wants to figure out what files those are.

          With thousand+ node clusters, where you might be adding and removing machines regularly for maintenance, a huge use case on top of the directory filter would be a "which open files are blocking server decommissioning" filter (identify files with blocks on hosts that are currently in decommissioning state).

          With the attached patch, the infrastructure is now available to get the above enhancements. In the interest of patch size and easy backports, can take up above enhancements in a new jira, if you are ok.

          manojg Manoj Govindassamy added a comment -

          Had an offline chat with Rushabh S Shah. Taking up this bug to complete the pending items as he is currently fully occupied with other higher priority tasks. Thanks for the patch contribution Rushabh.

          shahrs87 Rushabh S Shah added a comment -

          Just curious, what's the difference of this command with hdfs fsck / -openforwrite ?

          The output will be the same. Just the time to reach that output is vastly different.
          hdfs fsck / will crawl the whole filesystem.
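          The cost difference above can be illustrated with a small self-contained sketch (not Hadoop code): an fsck-style crawl derives the open-file list by inspecting every namespace entry, whereas the lease manager already indexes open files directly, so the admin command only touches the open ones.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: contrasts a full-namespace crawl with a direct
// lease-table lookup for the same answer.
public class OpenFilesLookupDemo {
  public static void main(String[] args) {
    Map<String, Boolean> namespace = new LinkedHashMap<>(); // path -> isOpen
    Set<String> leaseTable = new LinkedHashSet<>();         // open files only
    for (int i = 0; i < 1_000_000; i++) {
      String path = "/data/file-" + i;
      boolean open = (i % 100_000 == 0);                    // 10 open files
      namespace.put(path, open);
      if (open) {
        leaseTable.add(path);
      }
    }

    // fsck-style: inspect every namespace entry to find the open ones.
    List<String> viaCrawl = new ArrayList<>();
    int inspected = 0;
    for (Map.Entry<String, Boolean> e : namespace.entrySet()) {
      inspected++;
      if (e.getValue()) {
        viaCrawl.add(e.getKey());
      }
    }

    // listOpenFiles-style: read the lease table directly.
    List<String> viaLeases = new ArrayList<>(leaseTable);

    System.out.println(inspected);                   // entries touched by the crawl
    System.out.println(viaCrawl.size());             // open files found
    System.out.println(viaCrawl.equals(viaLeases));  // same answer either way
  }
}
```

Both approaches return the same list; the crawl touches a million entries to find ten, while the lease-table read is proportional to the number of open files.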

          cheersyang Weiwei Yang added a comment -

          Hi Rushabh S Shah

          Just curious, what's the difference of this command with hdfs fsck / -openforwrite ?

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 22s Maven dependency ordering for branch
          +1 mvninstall 14m 10s trunk passed
          +1 compile 1m 25s trunk passed
          +1 checkstyle 0m 49s trunk passed
          +1 mvnsite 1m 28s trunk passed
          +1 mvneclipse 0m 29s trunk passed
          -1 findbugs 1m 22s hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings.
          -1 findbugs 1m 38s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 1m 3s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 27s the patch passed
          +1 compile 1m 30s the patch passed
          +1 cc 1m 30s the patch passed
          +1 javac 1m 30s the patch passed
          -0 checkstyle 0m 46s hadoop-hdfs-project: The patch generated 4 new + 635 unchanged - 2 fixed = 639 total (was 637)
          +1 mvnsite 1m 24s the patch passed
          +1 mvneclipse 0m 23s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 3m 31s the patch passed
          +1 javadoc 1m 5s the patch passed
          +1 unit 1m 20s hadoop-hdfs-client in the patch passed.
          -1 unit 66m 33s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 23s The patch does not generate ASF License warnings.
          103m 12s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ac17dc
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12824136/HDFS-10480-trunk-1.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc
          uname Linux ec0a05139300 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 667966c
          Default Java 1.8.0_121
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19166/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19166/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19166/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19166/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19166/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19166/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          adamkennedy77 Adam Kennedy added a comment -

          With thousand+ node clusters, where you might be adding and removing machines regularly for maintenance, a huge use case on top of the directory filter would be a "which open files are blocking server decommissioning" filter (identify files with blocks on hosts that are currently in decommissioning state).

          This is very difficult right now, and particularly annoying when by chance you try to decom the node holding the long-open HDFS balancer pid file.

          shahrs87 Rushabh S Shah added a comment -

          Manoj Govindassamy: Thanks for the valuable feedback on the patch and will try to incorporate in the next patch.
          I got sidetracked due to some internal issues but I plan to restart my work within a week or so.

          manojg Manoj Govindassamy added a comment -

          Rushabh S Shah / Kihwal Lee,

          Thanks for working on this. Very useful. On top of review comments from Andrew Wang, I have few more thoughts.

          • There are cases where listing open files are needed only under a specific dir and not the whole system.
          • Java API to HdfsAdmin is also very useful for this feature.
          • It will be very useful to return/print the open files instead of dumping them onto a file.
          • Testing could add more cases around multiple readers, file opened for append/truncates, snapshots etc.,

          If you are not working on this, can I take up this jira, implement the extra features, incorporate all review comments and complete the testing ? Please let me know.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 7s Maven dependency ordering for branch
          +1 mvninstall 13m 14s trunk passed
          +1 compile 1m 32s trunk passed
          +1 checkstyle 0m 37s trunk passed
          +1 mvnsite 1m 26s trunk passed
          +1 mvneclipse 0m 25s trunk passed
          +1 findbugs 3m 10s trunk passed
          +1 javadoc 1m 4s trunk passed
          0 mvndep 0m 6s Maven dependency ordering for patch
          +1 mvninstall 1m 31s the patch passed
          +1 compile 1m 31s the patch passed
          +1 cc 1m 31s the patch passed
          +1 javac 1m 31s the patch passed
          -0 checkstyle 0m 37s hadoop-hdfs-project: The patch generated 4 new + 621 unchanged - 2 fixed = 625 total (was 623)
          +1 mvnsite 1m 32s the patch passed
          +1 mvneclipse 0m 23s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 3m 47s the patch passed
          +1 javadoc 1m 0s the patch passed
          +1 unit 1m 6s hadoop-hdfs-client in the patch passed.
          -1 unit 85m 35s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 29s The patch does not generate ASF License warnings.
          120m 56s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker
          Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-10480
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12824136/HDFS-10480-trunk-1.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc
          uname Linux fd21136c5344 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 2977bc6
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/18054/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/18054/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18054/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18054/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          djp Junping Du added a comment -

          2.8 is in RC stage, move to 2.9

          andrew.wang Andrew Wang added a comment -

          Hi Rushabh, thanks for working on this, I took a brief look at this patch. Some review comments:

          • Is there a reason for dumping the info to a file on the NN? This makes it more difficult for admins to get the information, and is more complicated than just printing it out on the command line. Allowing a user-specified name that isn't validated is also a possible security issue. This also means normal users can't use this, since they won't have access to the NN's log directory.
          • Let's not change the import to a wildcard, it makes backports harder.
          • Shouldn't this only go to the active NN, since it has up-to-date info about writers? This is in reference to the Operation.UNCHECKED and the HA logic in DFSAdmin.
          • Nit: "getUnderconstructionFiles" -> "getUnderConstructionFiles"
          • Could you also add a Java API to HdfsAdmin?

          One more thing that would be nice here is to filter the output on a passed path or DN. Use cases:

          • An admin might already know a stale file by path (perhaps from fsck's -openforwrite), and wants to figure out who the lease holder is.
          • A DN might be blocked from decommissioning by an open-for-write file, and the admin wants to figure out what files those are.
          kihwal Kihwal Lee added a comment -

          The superuser-only admin command is certainly useful, but letting individual users list their open files might be useful too. We can add an option to list all, which is allowed only to the superuser.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 7s Maven dependency ordering for branch
          +1 mvninstall 7m 40s trunk passed
          +1 compile 1m 36s trunk passed
          +1 checkstyle 0m 40s trunk passed
          +1 mvnsite 1m 40s trunk passed
          +1 mvneclipse 0m 28s trunk passed
          +1 findbugs 3m 41s trunk passed
          +1 javadoc 1m 23s trunk passed
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 27s the patch passed
          +1 compile 1m 24s the patch passed
          +1 cc 1m 24s the patch passed
          +1 javac 1m 24s the patch passed
          -0 checkstyle 0m 36s hadoop-hdfs-project: The patch generated 4 new + 623 unchanged - 2 fixed = 627 total (was 625)
          +1 mvnsite 1m 38s the patch passed
          +1 mvneclipse 0m 22s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 3m 19s the patch passed
          +1 javadoc 1m 10s the patch passed
          +1 unit 0m 52s hadoop-hdfs-client in the patch passed.
          -1 unit 60m 49s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 18s The patch does not generate ASF License warnings.
          90m 55s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
            hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer
            hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12824136/HDFS-10480-trunk-1.patch
          JIRA Issue HDFS-10480
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc
          uname Linux 94cd161c3364 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 7f05ff7
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/16449/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/16449/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16449/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16449/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          shahrs87 Rushabh S Shah added a comment -

          Thanks for pointing out that JIRA.

          shahrs87 Rushabh S Shah added a comment -

          Attaching a new patch addressing checkstyle comments.

          liuml07 Mingliang Liu added a comment -

          hadoop.hdfs.server.namenode.ha.TestBootstrapStandby is not related and is tracked by HADOOP-13470.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 8s Maven dependency ordering for branch
          +1 mvninstall 7m 20s trunk passed
          +1 compile 1m 26s trunk passed
          +1 checkstyle 0m 38s trunk passed
          +1 mvnsite 1m 57s trunk passed
          +1 mvneclipse 0m 30s trunk passed
          +1 findbugs 3m 47s trunk passed
          +1 javadoc 1m 18s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 36s the patch passed
          +1 compile 1m 48s the patch passed
          +1 cc 1m 48s the patch passed
          +1 javac 1m 48s the patch passed
          -0 checkstyle 0m 42s hadoop-hdfs-project: The patch generated 24 new + 624 unchanged - 2 fixed = 648 total (was 626)
          +1 mvnsite 1m 51s the patch passed
          +1 mvneclipse 0m 24s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 4m 6s the patch passed
          +1 javadoc 1m 12s the patch passed
          +1 unit 0m 54s hadoop-hdfs-client in the patch passed.
          -1 unit 77m 18s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 23s The patch does not generate ASF License warnings.
          109m 10s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.namenode.ha.TestBootstrapStandby



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12824003/HDFS-10480-trunk.patch
          JIRA Issue HDFS-10480
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc
          uname Linux a0a697dd94ee 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4b689e7
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/16445/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/16445/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16445/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16445/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          shahrs87 Rushabh S Shah added a comment -

          Attaching trunk version of the patch.
          Once this is reviewed, I will attach branch-2 version of patch.

          kihwal Kihwal Lee added a comment - - edited

          While debugging issues, I had to dump a huge fsimage to get the list of open files. I was looking for files that had been open for a long time, so it was okay, but it took a long time to get them. The list may surprise you if there are runaway clients that keep renewing leases. I've seen something open for many months, surviving multiple rolling upgrades. These files also pose a risk of data loss, since even finalized blocks don't get re-replicated while the file is under construction. If a file is confirmed to be a "forgotten" file that was left open, the admin can use the hdfs debug recoverLease command to revoke the lease and close the file.

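
          The final recovery step Kihwal mentions can be sketched as follows (the path and retry count below are illustrative):

          ```shell
          # For a file confirmed to be "forgotten" but still held open,
          # force lease recovery so the NameNode can close the file:
          hdfs debug recoverLease -path /user/etl/stale.log -retries 3
          ```

          Note that recoverLease forcibly revokes the lease, so — as the comment above stresses — confirm that no live client is still writing to the file before running it.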

            People

            • Assignee:
              manojg Manoj Govindassamy
              Reporter:
              kihwal Kihwal Lee
            • Votes:
              1 Vote for this issue
              Watchers:
              23 Start watching this issue
