Hadoop HDFS
HDFS-11402

HDFS Snapshots should capture point-in-time copies of OPEN files

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: hdfs
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      When the config param "dfs.namenode.snapshot.capture.openfiles" is enabled, HDFS snapshots taken will additionally capture point-in-time copies of the open files that have valid leases. Even when the current-version open files grow or shrink in size, the snapshot will always retain the immutable versions of these open files, just as for all other closed files. Note: the file length captured for open files in the snapshot is the one recorded in the NameNode at the time of the snapshot, and it may be shorter than what the client has written till then. In order to capture the latest length, the client can call hflush/hsync with the flag SyncFlag.UPDATE_LENGTH on the open file handles.

      Description

      Problem:

      1. When files are being written while HDFS snapshots are taken in parallel, the snapshots do capture all of these files, but the open files in the snapshots do not have their point-in-time file length captured. That is, open files are not frozen in HDFS snapshots: they grow or shrink in length, just like the original file, even after the snapshot time.

      2. At the time of file close, or of any other metadata modification operation on these files, HDFS reconciles the file length and records the modification in the last-taken snapshot. All the previously taken snapshots continue to have those open files with no modification recorded, so they all end up using the final modification record in the last snapshot. Thus, after the file close, the file lengths in all those snapshots end up the same.

      Assume File1 is opened for write and a total of 1MB written to it. While the writes are happening, snapshots are taken in parallel.

      |---Time---T1-----------T2-------------T3----------------T4------>
      |-----------------------Snap1----------Snap2-------------Snap3--->
      |---File1.open---write---------write-----------close------------->
      

      Then at time,
      T2:
      Snap1.File1.length = 0

      T3:
      Snap1.File1.length = 0
      Snap2.File1.length = 0

      <File1 write completed and closed>

      T4:
      Snap1.File1.length = 1MB
      Snap2.File1.length = 1MB
      Snap3.File1.length = 1MB
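The behavior above can be sketched with a toy model (plain Java, no HDFS involved; the class, field, and method names here are illustrative, not HDFS APIs): a snapshot that records no diff for an open file falls through to the live inode, so after the close every earlier snapshot reports the final 1MB length.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OpenFileSnapshotProblem {
    // Toy model of current behavior: no FileDiff is recorded for open files,
    // so a snapshot "falls through" to the live inode's length.
    static long liveLength = 0;
    static Map<String, Long> recordedLength = new LinkedHashMap<>();

    static void takeSnapshot(String name) {
        recordedLength.put(name, null); // open file: no length diff captured
    }

    static long lengthIn(String snap) {
        Long len = recordedLength.get(snap);
        return len != null ? len : liveLength; // fall through to live file
    }

    public static void main(String[] args) {
        takeSnapshot("Snap1");        // T2: 0 bytes recorded so far
        liveLength = 512 * 1024;      // more writes arrive
        takeSnapshot("Snap2");        // T3
        liveLength = 1024 * 1024;     // T4: file closed at 1MB
        takeSnapshot("Snap3");
        for (String s : recordedLength.keySet()) {
            System.out.println(s + ".File1.length = " + lengthIn(s));
        }
    }
}
```

Running this prints 1048576 (1MB) for all three snapshots, matching the T4 outcome above.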

      Proposal

      1. At the time of taking Snapshot, SnapshotManager#createSnapshot can optionally request DirectorySnapshottableFeature#addSnapshot to freeze open files.

      2. DirectorySnapshottableFeature#addSnapshot can consult with LeaseManager and get a list INodesInPath for all open files under the snapshot dir.

      3. DirectorySnapshottableFeature#addSnapshot after the Snapshot creation, Diff creation and updating modification time, can invoke INodeFile#recordModification for each of the open files. This way, the Snapshot just taken will have a FileDiff with fileSize captured for each of the open files.

      4. The above model follows the current Snapshot and Diff protocols and doesn't introduce any new on-disk formats. So, I don't think we will need any new FSImage Loader/Saver changes for Snapshots.

      5. One of the design goals of HDFS Snapshots was the ability to take any number of snapshots in O(1) time. Though LeaseManager keeps all the open files with leases in an in-memory map, an iteration is still needed to prune the needed open files and then run recordModification on each of them. So, it will not be strictly O(1) with the above proposal. But it is only a marginal increase, as the new order will be O(open_files_under_snap_dir). In order to avoid changing HDFS snapshot behavior for open files and the change in time complexity, this improvement can be made under a new config "dfs.namenode.snapshot.freeze.openfiles", which by default can be false.
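Proposal steps 1–3 can be sketched in miniature (plain Java; takeSnapshot and lengthInSnapshot are hypothetical stand-ins, not HDFS methods): recording a length diff at snapshot time freezes each snapshot's view of the open file, independent of later writes or the close.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FreezeOpenFilesSketch {
    static long liveLength = 0;                      // live inode length
    static Map<String, Long> diffs = new LinkedHashMap<>();

    // Proposal step 3: at snapshot time, record a diff carrying the
    // current length of each open file under the snapshot root.
    static void takeSnapshot(String name, boolean captureOpenFiles) {
        diffs.put(name, captureOpenFiles ? Long.valueOf(liveLength) : null);
    }

    static long lengthInSnapshot(String name) {
        Long recorded = diffs.get(name);
        return recorded != null ? recorded : liveLength; // no diff: fall through
    }

    public static void main(String[] args) {
        takeSnapshot("Snap1", true);   // T2: open file frozen at 0
        liveLength = 512 * 1024;
        takeSnapshot("Snap2", true);   // T3: frozen at 512KB
        liveLength = 1024 * 1024;      // T4: file closed at 1MB
        System.out.println(lengthInSnapshot("Snap1"));
        System.out.println(lengthInSnapshot("Snap2"));
    }
}
```

With the capture flag on, Snap1 keeps length 0 and Snap2 keeps 524288 even after the file closes at 1MB, unlike the current behavior.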

      1. HDFS-11402.01.patch
        44 kB
        Manoj Govindassamy
      2. HDFS-11402.02.patch
        52 kB
        Manoj Govindassamy
      3. HDFS-11402.03.patch
        62 kB
        Manoj Govindassamy
      4. HDFS-11402.04.patch
        67 kB
        Manoj Govindassamy
      5. HDFS-11402.05.patch
        67 kB
        Manoj Govindassamy
      6. HDFS-11402.06.patch
        68 kB
        Manoj Govindassamy
      7. HDFS-11402.07.patch
        68 kB
        Manoj Govindassamy
      8. HDFS-11402.08.patch
        68 kB
        Manoj Govindassamy
      9. HDFS-11402-branch-2.01.patch
        67 kB
        Manoj Govindassamy

        Issue Links

          Activity

          manojg Manoj Govindassamy added a comment - - edited

          Hi Aaron T. Myers, Andrew Wang, Yongjun Zhang, Jing Zhao,
          Can you please take a look at the problem statement and the proposal above? Any comments/suggestions are welcome.

          yzhangal Yongjun Zhang added a comment -

          Hi Manoj Govindassamy,

          Thanks for working on this issue.

          I share the same concern of run time, especially when handling many open files. It may hold the FS lock for too long thus impacting NN service. If we can distribute the burden to relevant file operation, then it would have less runtime impact.

          Wonder if it's possible to do the following: when the NN updates the size of a file, it calls INodeFile#recordModification to either create a new snapshotCopy in FileDiff, or modify an existing snapshotCopy in FileDiff. If we can do this, then we can still maintain O(1) time. The tricky thing is modifying an existing snapshotCopy of the file in FileDiff. The current code may not support it, but it'd be nice to see whether some changes can make it work.

          manojg Manoj Govindassamy added a comment -

          Yongjun Zhang,

          LeaseManager maintains INodeFileIDs of all leased open files in-memory. Moreover, with the proposed design LeaseManager now exposes a functionality to get all files with leases under the given directory. So at the time of Snapshot, the SnapshotManager will only run extra operations for the open files under the snapshot dir and not on all the open files in the system. For all the open files, creation of diff record is very light weight as it only copies the length field from the file and not blocks.

          Your proposal of maintaining a diff record for each open file at the time of the file length update, well before the snapshot, is not going to make the overall time complexity O(1). The diff record computation is not what contributes the extra time at snapshot time. It is the enumeration of open files under the snapshot directory, and the diff record saving for each of those open files, that makes the overall complexity of snapshot creation O(open files count under snap root). And, the dfs.namenode.snapshot.freeze.openfiles config is false by default. Users who are interested in this feature can enable it at a marginal cost; otherwise there is no performance dip in the normal snapshot operation.

          manojg Manoj Govindassamy added a comment -

          Attached v01 patch which addresses the following:

          1. LeaseManager#getINodeWithLeases() returns set of open files in INodesInPath format under any given INodeDirectory
          2. INodesInPath#isAncestor() checks whether the given INodeDirectory is an ancestor of the path. Used by LeaseManager to gather open files under a snap root.
          3. New config param dfs.namenode.snapshot.freeze.openfiles in DFSConfigKeys to turn on/off this feature
          4. When dfs.namenode.snapshot.freeze.openfiles is enabled, DirectorySnapshottableFeature#addSnapshot gathers open files under the snap root from LeaseManager and records modification for each of them to freeze the file length in the snapshot.

          Tests:
          1. TestLeaseManager updated to test LeaseManager#getINodeWithLeases() functionality
          2. TestOpenFilesWithSnapshot, TestSnapshotManager updated to test snapshots when there are files being written – verify file lengths in snapshots stay frozen even after more writes go to the live file, verify open files in a non-snap dir are not affected, verify file lengths stay frozen across NN restart, etc.

          Yongjun Zhang, Jing Zhao, Andrew Wang, can you please review the patch and share your suggestions/comments?

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 35s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
          +1 mvninstall 16m 56s trunk passed
          +1 compile 0m 58s trunk passed
          +1 checkstyle 0m 43s trunk passed
          +1 mvnsite 1m 13s trunk passed
          +1 mvneclipse 0m 18s trunk passed
          +1 findbugs 1m 59s trunk passed
          +1 javadoc 0m 49s trunk passed
          +1 mvninstall 1m 11s the patch passed
          +1 compile 1m 0s the patch passed
          +1 javac 1m 0s the patch passed
          +1 checkstyle 0m 40s the patch passed
          +1 mvnsite 1m 5s the patch passed
          +1 mvneclipse 0m 14s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 59s the patch passed
          +1 javadoc 0m 39s the patch passed
          -1 unit 92m 42s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 27s The patch does not generate ASF License warnings.
          124m 49s



          Reason Tests
          Failed junit tests hadoop.tools.TestHdfsConfigFields
            hadoop.hdfs.TestRollingUpgrade
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11402
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12852975/HDFS-11402.01.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 043c43761295 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / a136936
          Default Java 1.8.0_121
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/18386/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18386/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18386/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          linyiqun Yiqun Lin added a comment -

          Hi, Manoj Govindassamy, thanks for the work on this.
          I just took a quick look at this; some comments from me. I am also concerned about the run-time problem here. Yes, you have mentioned that LeaseManager#getINodeWithLeases() will return the set of open files in INodesInPath format under a given INodeDirectory, and that it will not deal with all the open files. But looking into this method, the operation still seems expensive. The getInode and INodesInPath#fromINode calls are executed whether or not the file is under the given directory. So if there are enough open files, this method still looks like an expensive operation, right? What do you think?

          public Set<INodesInPath> getINodeWithLeases(final INodeDirectory
              restrictFilesFromDir) {
            Set<INodesInPath> iNodeSet = new HashSet<>();
            for (final long iNodeId : leasesById.keySet()) {
              INode inode = fsnamesystem.getFSDirectory().getInode(iNodeId);
              assert inode.isFile();
              INodesInPath iNodesInPath = INodesInPath.fromINode(
                  fsnamesystem.getFSDirectory().getRoot(), inode.asFile());
              if (restrictFilesFromDir != null &&
                  !iNodesInPath.isAncestor(restrictFilesFromDir)) {
                continue;
              }
              iNodeSet.add(iNodesInPath);
            }
            return iNodeSet;
          }
          manojg Manoj Govindassamy added a comment -

          Thanks for the review Yiqun Lin.

          I can split the LeaseManager#leasesById map entries into a few parts and spin off the LeaseManager#getINodeWithLeases() work on a fixed thread pool Executor, running the parts in parallel. Each task will filter the open files from its list, and we can benefit from many such tasks running in parallel. Your thoughts on this, please?
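The partitioning idea can be sketched with plain Java (no HDFS types; the map, the path-prefix check, and the worker count are illustrative stand-ins for leasesById, isAncestor(), and the proposed fixed thread pool):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLeaseFilterSketch {
    // Stand-in for LeaseManager#leasesById: inode id -> file path.
    static final Map<Long, String> leasesById = new HashMap<>();

    static Set<String> openFilesUnder(String snapRoot, int workers)
            throws Exception {
        List<Long> ids = new ArrayList<>(leasesById.keySet());
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<Future<Set<String>>> parts = new ArrayList<>();
        int chunk = (ids.size() + workers - 1) / workers;
        // Split the lease map's keys into chunks; each task filters one chunk.
        for (int i = 0; i < ids.size(); i += chunk) {
            List<Long> slice = ids.subList(i, Math.min(i + chunk, ids.size()));
            parts.add(pool.submit(() -> {
                Set<String> found = new HashSet<>();
                for (long id : slice) {
                    String path = leasesById.get(id);
                    if (path.startsWith(snapRoot)) { // stands in for isAncestor()
                        found.add(path);
                    }
                }
                return found;
            }));
        }
        Set<String> result = new HashSet<>();
        for (Future<Set<String>> f : parts) {
            result.addAll(f.get()); // merge each task's filtered set
        }
        pool.shutdown();
        return result;
    }

    public static void main(String[] args) throws Exception {
        // 9,000 open files, every third one under the snapshottable dir.
        for (long id = 0; id < 9_000; id++) {
            leasesById.put(id, (id % 3 == 0 ? "/snapdir/f" : "/other/f") + id);
        }
        System.out.println(openFilesUnder("/snapdir/", 4).size());
    }
}
```

Each worker only touches its own slice of the key set, so the filtering cost is spread across the pool while the merged result is identical to a sequential scan.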

          manojg Manoj Govindassamy added a comment -

          Attaching v02 patch to address the following:
          1. LeaseManager#getINodeWithLeases() now runs the open files INode filtering task in a thread pool of 4 workers. So, on clusters having lots of open files, this should improve the performance.
          2. TestLeaseManager unit tests updated to verify various corner cases around open file counts and thread pool execution.
          3. hdfs-default.xml updated with the new config param.

          Yongjun Zhang, Yiqun Lin, Jing Zhao and others, can you please take a look at the patch ?

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 11s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
          0 mvndep 0m 7s Maven dependency ordering for branch
          +1 mvninstall 12m 27s trunk passed
          +1 compile 1m 21s trunk passed
          +1 checkstyle 0m 39s trunk passed
          +1 mvnsite 1m 22s trunk passed
          +1 mvneclipse 0m 25s trunk passed
          +1 findbugs 3m 7s trunk passed
          +1 javadoc 1m 0s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 18s the patch passed
          +1 compile 1m 18s the patch passed
          +1 javac 1m 18s the patch passed
          +1 checkstyle 0m 37s the patch passed
          +1 mvnsite 1m 18s the patch passed
          +1 mvneclipse 0m 21s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 2s The patch has no ill-formed XML file.
          +1 findbugs 3m 15s the patch passed
          +1 javadoc 0m 54s the patch passed
          +1 unit 0m 54s hadoop-hdfs-client in the patch passed.
          +1 unit 63m 41s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          96m 7s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11402
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12853200/HDFS-11402.02.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux c91ae938b25f 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 02c5494
          Default Java 1.8.0_121
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18393/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18393/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jingzhao Jing Zhao added a comment -

          Thanks for working on this, Manoj Govindassamy. Yes the length of open files is a known issue for the current snapshot implementation. As you mentioned in the description, the current semantic is to capture a length in snapshot which is >= the real length when the snapshot was created. This behavior kind of breaks the read-only semantic.

          I think the key challenge here is how to let NN know the lengths of open files. Currently the length of an open file is updated on the NN only when 1) the first time hflush is called, or 2) hflush/hsync is called along with the UPDATE_LENGTH flag. Thus if a file is being written, the file length on the NN side (let's call it l_n) is usually a lot less than the length seen by the DN/client.

          If we choose to record l_n in the snapshot, then later we may risk losing data (from the client's point of view). E.g., a user wrote 100MB of data and took a snapshot. The l_n at that time might be only 1MB or even 0. Later, if the user deletes the file, she will expect ~100MB to still be kept in the snapshotted file, instead of 1MB or an empty file. So, from the safety point of view, maybe the semantic of the current snapshot implementation is better.

          So before we update the NN side logic about capturing the file length in snapshots, I think we first need to solve the problem about how to report the length of open files to NN (e.g., maybe utilizing the DN heartbeats or some other ways).

          manojg Manoj Govindassamy added a comment -

          Thanks for the design review Jing Zhao. Much appreciated.

          I think the key challenge here is how to let NN know the lengths of open files...If we choose to record l_n in the snapshot, then later we may have risk to lose data (from client's point of view).

          We have the same problem with parallel readers when there is an ongoing write: readers of the file have no clue about the latest length or data. But there is always SyncFlag.UPDATE_LENGTH, and applications wanting to update the NN about their latest writes can call hsync() with that flag. The same existing mechanism helps snapshots too: in the above case, the open file length captured in the snapshot would be the last hsync() length.

          Otherwise, applications attempting to ensure consistent lengths of files in HDFS snapshots will not be able to do so reliably. IMHO, HDFS Snapshots violate the important design goal of read-only behavior, as files in a snapshot grow in size after the snapshot time. This mutable snapshot behavior makes HDFS Snapshots far less attractive, and applications cannot make use of the feature in a useful way.

          I think we first need to solve the problem about how to report the length of open files to NN (e.g., maybe utilizing the DN heartbeats or some other ways).

          I can dig deeper on this model to make HDFS Snapshots much more reliable. Given that applications are already aware of NN length issues for open files, that hsync(SyncFlag.UPDATE_LENGTH) is available to close the same gap, and that the proposed design is under a config which is turned off by default, do we still need to make this jira dependent on fixing open file lengths via heartbeat improvements? Would love to hear your thoughts on this.

          jingzhao Jing Zhao added a comment -

          We have the same problem with parallel readers when there is an ongoing write.

          This is not true. The current DFSInputStream has the capability to talk to datanodes and learn the current length.

          manojg Manoj Govindassamy added a comment - - edited

          Thanks for correcting me on readers being able to learn the latest length. SyncFlag.UPDATE_LENGTH has also saved us in many such cases. I still see (1) making snapshots truly read-only and (2) file lengths for open files as separate issues, with the former (1) being the more alarming. Yes, fixing (2) will make the solution more reliable, and I can take it up in a new jira if everyone agrees.

          manojg Manoj Govindassamy added a comment -

          Filed HDFS-11435 - NameNode should track open for write files lengths more frequent than on newer block allocations.

          raviprak Ravi Prakash added a comment -

          I'm sorry, I don't fully understand why we care about HDFS-11435. In fact it makes heartbeat handling a lot slower and more complicated. And for what? Slightly more up-to-date length values in snapshots (a relatively rare operation). If the DNs send back the lengths of RBW replicas every heartbeat, you'll still have to choose the minimum length returned by at least dfs.namenode.replication.min nodes. IMHO we should store in the snapshot the last minimum length up to which the NN knows data has been written. If that's the last block boundary, so be it.

          manojg Manoj Govindassamy added a comment -

          Thanks for sharing your views Ravi Prakash. I will wait for others to review the patch and share the comments. And, I would also prefer this jira not dependent on HDFS-11435.

          manojg Manoj Govindassamy added a comment -

          Yongjun Zhang, any review comments on the patch? Please let me know your thoughts.

          yzhangal Yongjun Zhang added a comment -

          Hi Manoj Govindassamy,

          Thanks a lot for your work here and very sorry for my delayed review.

          The patch looks largely good to me. Below are some comments, mostly cosmetic.

          1. We can put the parameters leaseManager and freezeOpenFiles together at the API, since they are used together for an optional feature. For example in INodeDirectory

          public Snapshot addSnapshot(final LeaseManager leaseManager,
                int id, String name, boolean freezeOpenFiles)
          

          we can change it to

          public Snapshot addSnapshot(int id, String name,
                final LeaseManager leaseManager,
                final boolean freezeOpenFiles)
          

          2. share common code in the two INodesInPath$fromINode methods
          3. change method name isAncestor to isDescendent in INodesInPath
          4. In LeaseManager,

          • INODE_FILTER_WORKER_COUNT is only used in a single method, it's better not to define it as public, and maybe we can just move it to the single method.
          • change getINodeWithLeases(final INodeDirectory restrictFilesFromDir)
            to getINodesWithLease(final INodeDirectory ancestorDir)
            and javadoc the behavior when ancestorDir is null or not-null
          • optionally, possibly just use the above COUNT as a cap, and have a way to dynamically decide how big the thread pool is, especially when the number of files open for write is small. This can be considered in the future when needed.
          • add a private method (like getINodesInLease) to wrap
               synchronized (this) {
                  inodes = new INode[leasesById.size()];
                  for (long inodeId : leasesById.keySet()) {
                    inodes[inodeCount] = fsnamesystem.getFSDirectory().getInode(inodeId);
                    inodeCount++;
                  }
                }
            

          5. In hdfs-default.xml, add a note to describe that the file length captured in a snapshot for a file is what's recorded in the NameNode, and it may be shorter than what the client has written. In order to capture the length the client has written, the client needs to call hflush/hsync on the file.
          6. Suggest to add a test about snapshot diff.
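          For item 5, the note could look something like the following hdfs-default.xml entry (a sketch based on the property name and semantics in the release note; the exact wording in the shipped file may differ):

          ```xml
          <property>
            <name>dfs.namenode.snapshot.capture.openfiles</name>
            <value>false</value>
            <description>
              If true, snapshots taken will additionally capture point-in-time
              copies of the open files that have valid leases. The file length
              recorded for an open file is the one known to the NameNode at
              snapshot time, which may be shorter than what the client has
              written; clients can call hflush/hsync with the flag
              SyncFlag.UPDATE_LENGTH to update the NameNode's view first.
            </description>
          </property>
          ```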

          Hi Jing Zhao, I wonder if you could help with a review too. Much appreciated.

          Thanks.

          manojg Manoj Govindassamy added a comment -

          Thanks for the review Yongjun Zhang. Attached v03 patch with the below review comments addressed. Have added a new unit test for Snapshot Diff Report to verify snapshot diffs for snapshots which have open files captured. Kindly take a look at the new patch revision.

          1. We can put the parameters leaseManager and freezeOpenFiles together at the API, since they are used together for an optional feature.

          Done.

          2. share common code in the two INodesInPath$fromINode methods

          Done.

          3. change method name isAncestor to isDescendent in INodesInPath

          Done.

          4. INODE_FILTER_WORKER_COUNT is only used in a single method, it's better not to define it as public, and maybe we can just move it to the single method.

          This count is used by unit tests as well. Have changed the scope of the variable to package private instead of public.

          change getINodeWithLeases(final INodeDirectory restrictFilesFromDir)

          to getINodesWithLease(final INodeDirectory ancestorDir)
          and javadoc the behavior when ancestorDir is null or not-null
          Done.

          optionally, possibly just use the above COUNT as a cap, and have a way to dynamically decide how big the thread pool is, especially when the number of files open for write is small. This can be considered in the future when needed.

          Will take this for future.

          add a private method (like getINodesInLease) to wrap

          Done.

          5. In hdfs-default.xml, add a note to describe that the file length captured in a snapshot for a file is what's recorded in the NameNode, and it may be shorter than what the client has written. In order to capture the length the client has written, the client needs to call hflush/hsync on the file.

          Done.
          6. Suggest to add a test about snapshot diff.
          Added a new unit test for Snapshot Diff Report

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          0 mvndep 0m 31s Maven dependency ordering for branch
          +1 mvninstall 13m 7s trunk passed
          +1 compile 1m 23s trunk passed
          +1 checkstyle 0m 51s trunk passed
          +1 mvnsite 1m 25s trunk passed
          +1 mvneclipse 0m 26s trunk passed
          +1 findbugs 3m 7s trunk passed
          +1 javadoc 1m 1s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 17s the patch passed
          +1 compile 1m 19s the patch passed
          +1 javac 1m 19s the patch passed
          -0 checkstyle 0m 47s hadoop-hdfs-project: The patch generated 7 new + 788 unchanged - 4 fixed = 795 total (was 792)
          +1 mvnsite 1m 20s the patch passed
          +1 mvneclipse 0m 20s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 3m 21s the patch passed
          +1 javadoc 0m 54s the patch passed
          +1 unit 1m 7s hadoop-hdfs-client in the patch passed.
          -1 unit 63m 59s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          98m 24s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11402
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12862553/HDFS-11402.03.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux a1d2e9186ff6 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 2aa8967
          Default Java 1.8.0_121
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19020/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19020/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19020/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19020/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Above unit test failures are not related to the patch.

          manojg Manoj Govindassamy added a comment -

          Attaching v04 patch with checkstyle issues (other than method length issues) fixed. Yongjun Zhang, Wei-Chiu Chuang please take a look.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 14m 4s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          0 mvndep 0m 28s Maven dependency ordering for branch
          +1 mvninstall 15m 21s trunk passed
          +1 compile 1m 47s trunk passed
          +1 checkstyle 0m 51s trunk passed
          +1 mvnsite 1m 44s trunk passed
          +1 mvneclipse 0m 31s trunk passed
          +1 findbugs 3m 41s trunk passed
          +1 javadoc 1m 2s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 22s the patch passed
          +1 compile 1m 27s the patch passed
          +1 javac 1m 27s the patch passed
          -0 checkstyle 0m 49s hadoop-hdfs-project: The patch generated 2 new + 788 unchanged - 4 fixed = 790 total (was 792)
          +1 mvnsite 1m 26s the patch passed
          +1 mvneclipse 0m 24s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 2s The patch has no ill-formed XML file.
          +1 findbugs 3m 25s the patch passed
          +1 javadoc 0m 58s the patch passed
          +1 unit 1m 11s hadoop-hdfs-client in the patch passed.
          -1 unit 96m 15s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 34s The patch does not generate ASF License warnings.
          149m 4s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestMissingBlocksAlert
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:612578f
          JIRA Issue HDFS-11402
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12863103/HDFS-11402.04.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux 597321cc4deb 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / d4c01dd
          Default Java 1.8.0_121
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19063/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19063/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19063/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19063/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          Above unit test failure is not related to the patch. Checkstyle issue is regarding the method definition being longer than 150 lines.

          jojochuang Wei-Chiu Chuang added a comment -

          Hello Manoj Govindassamy, thanks for the patch.
          I am still reviewing the patch, but I have a few quick comments if you don't mind:

          • Have you considered the case where an inode is an INodeReference? I haven't thought about this hard enough, but would like to make sure we cover that case well.
          • How about we add some debug messages in LeaseManager#getINodeWithLeases, e.g. the inode id or file name added in each thread.
          andrew.wang Andrew Wang added a comment -

          Hi Manoj, thanks for working on this. I gave it a nit pass, still thinking about the high-level idea, but wanted to post this up before the weekend:

          • typo: FRREZE -> FREEZE
          • Maybe name it "dfs.namenode.snapshot.capture-file-length" to be more explanatory?
          • hdfs-default.xml: hflush/hsync also do not update the NN's view of the file length unless the client calls hsync with a special flag.

          INodesInPath

          • Not a fan of Pair since it doesn't document its entries, could we instead add a helper method that given an INode[] returns a byte[][]?
          • We made an effort to avoid resolving paths -> IIPs and back during RPC handling for performance reasons. This is a special case for the lease manager, which only has an INode ID. Can you add a comment saying as much for the newly added methods?
          • isDescendant javadoc, could you reverse it to say "if this InodesInPath is a descendant of the specified INodeDirectory"?
          • the private isDescendant is only called by the public one, combine into one? This way we don't lose the typesafety of the INodeDirectory also.
          • isDescendant, if we're an IIP, can we simply look in our array of INodes for the specified ancestor? This method looks expensive right now.
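          The last point (scanning the already-resolved inode array instead of walking parent pointers) can be sketched like this. The class and names below are illustrative stand-ins, not the actual HDFS types:

          ```java
          // Hypothetical simplified model: an INodesInPath is just the array of
          // inodes from the root down to the target.
          final class Iip {
              private final Object[] inodes; // root ... leaf

              Iip(Object... inodes) {
                  this.inodes = inodes;
              }

              // Cheap descendant check: scan the resolved array for the candidate
              // ancestor. The leaf itself is excluded, since a node is not
              // considered its own descendant.
              boolean isDescendant(Object ancestorDir) {
                  for (int i = 0; i < inodes.length - 1; i++) {
                      if (inodes[i] == ancestorDir) {
                          return true;
                      }
                  }
                  return false;
              }
          }
          ```

          This is O(depth) over an array that was already built during path resolution, which is the cheaper alternative the comment is asking for.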

          LeaseManager

          • What are the locking requirements for the new methods? add asserts?
          • Is the threading important for performance? Was this benchmarked? I suspect it's slower for small numbers of open files.
          • Can call shutdown immediately after adding all the futures to the executor.
          • Currently we swallow when a future.get throws an exception. If we return a partial set, won't the snapshot be inaccurate?
          • ancestorDir isn't mutated, so can we move the ancestorDir null check out of the Callable?
          • testCheckLease, why is there a new sleep(1)? Thread sleeps in a unit test are a code smell.
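          The locking, threading, and Future-handling questions above can be illustrated with a minimal self-contained sketch of the pattern under discussion: snapshot the leased ids under the lock, filter them with a fixed-size pool, call shutdown() before collecting results, and check every Future so a worker failure is not silently swallowed. All names and the id-based "under ancestor" check are illustrative, not the actual LeaseManager code:

          ```java
          import java.util.ArrayList;
          import java.util.HashSet;
          import java.util.List;
          import java.util.Set;
          import java.util.concurrent.ExecutorService;
          import java.util.concurrent.Executors;
          import java.util.concurrent.Future;

          final class LeaseFilterSketch {
              static Set<Long> filterUnder(Set<Long> leasedIds,
                                           long ancestorPrefix,
                                           int workers) throws Exception {
                  // Copy taken while the caller holds the lock; workers read the copy.
                  List<Long> ids = new ArrayList<>(leasedIds);
                  ExecutorService pool = Executors.newFixedThreadPool(workers);
                  List<Future<List<Long>>> futures = new ArrayList<>();
                  int chunk = Math.max(1, ids.size() / workers);
                  for (int start = 0; start < ids.size(); start += chunk) {
                      final List<Long> slice =
                          ids.subList(start, Math.min(ids.size(), start + chunk));
                      futures.add(pool.submit(() -> {
                          List<Long> hits = new ArrayList<>();
                          for (long id : slice) {
                              // Stand-in for the real "is this inode under
                              // ancestorDir" check: ids sharing the high bits
                              // count as descendants.
                              if (id >>> 32 == ancestorPrefix) {
                                  hits.add(id);
                              }
                          }
                          return hits;
                      }));
                  }
                  // shutdown() before get() is safe: it stops accepting new tasks
                  // but does not interrupt the ones already submitted.
                  pool.shutdown();
                  Set<Long> result = new HashSet<>();
                  for (Future<List<Long>> f : futures) {
                      // get() rethrows any worker failure, so a partial result
                      // cannot be returned silently.
                      result.addAll(f.get());
                  }
                  return result;
              }
          }
          ```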
          manojg Manoj Govindassamy added a comment - - edited

          Thanks for the review Wei-Chiu Chuang and Andrew Wang. Attached v05 patch after addressing following comments, please take a look.

          • Replaced the config word 'freeze' with 'capture'.
          • It's not just the file length that gets captured but the whole metadata, like permissions, ACLs, etc. Also, this is only for open files, so I would prefer to have the word "openfiles" in the config too. It is now named dfs.namenode.snapshot.capture.openfiles.
          • Updated the comment in hdfs-default.xml about the special flag needed for hsync/hflush.
          • The byte[][] is already available via getINodesAndPaths() along with the INode[]. Wouldn't it be a lot slower to have one more helper method construct the same byte[][]?
          • Updated the comment for the new method fromINode()
          • Updated isDescendant() javadoc comments.
          • Combined the isDescendant() methods into one.
          • It's the caller's duty to hold the FSNamesystem read or write lock so that the lease file list doesn't change. Added the assert.
          • The v01 patch didn't have the thread pool. Based on the comments from Yongjun and Yiqun that a lot of open files could lead to performance problems, I changed the model to use a thread pool. I did check with 10K+ open files and the thread pool helps; for smaller open file counts it is about the same or slightly slower.
          • Moved the executorService shutdown before the future tasks' get. But I don't see a difference in shutdown() before or after the future get; it is non-blocking anyway and should not interrupt the currently running tasks.
          • future.get() can throw an InterruptedException or an ExecutionException. Since the task routine doesn't throw any exception, the only other possibility is InterruptedException, and since we are not calling shutdownNow(), there shouldn't be an InterruptedException either. We log it anyway. I am not sure we want to fail the snapshot create operation if an exception does happen here.
          • Even if ancestorDir is null, we still want to convert the INode to an INodesInPath. In the null-ancestor case we could probably do it without spinning up a thread pool, but there is no production code that calls this with a null ancestorDir.
          • The 1 ms sleep is to wait for the expiration of the lease soft and hard limits. Added more comments in the code.
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          0 mvndep 0m 10s Maven dependency ordering for branch
          +1 mvninstall 14m 44s trunk passed
          +1 compile 1m 39s trunk passed
          +1 checkstyle 0m 53s trunk passed
          +1 mvnsite 1m 41s trunk passed
          +1 mvneclipse 0m 31s trunk passed
          -1 findbugs 1m 35s hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings.
          -1 findbugs 2m 6s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 1m 14s trunk passed
          0 mvndep 0m 9s Maven dependency ordering for patch
          +1 mvninstall 1m 55s the patch passed
          +1 compile 2m 2s the patch passed
          +1 javac 2m 2s the patch passed
          -0 checkstyle 0m 56s hadoop-hdfs-project: The patch generated 4 new + 788 unchanged - 4 fixed = 792 total (was 792)
          +1 mvnsite 1m 55s the patch passed
          +1 mvneclipse 0m 27s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 2s The patch has no ill-formed XML file.
          +1 findbugs 3m 57s the patch passed
          +1 javadoc 1m 14s the patch passed
          +1 unit 1m 26s hadoop-hdfs-client in the patch passed.
          -1 unit 81m 1s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          122m 12s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ac17dc
          JIRA Issue HDFS-11402
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12864169/HDFS-11402.05.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux 5e0352be5ef0 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / c154935
          Default Java 1.8.0_121
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19151/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19151/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19151/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19151/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19151/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19151/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          jojochuang Wei-Chiu Chuang added a comment - edited

          One nit:

          LOG.warn("INode filter task encountered exception: "+ e);
          

          Please use LOG.warn(Object message, Throwable t) to preserve stacktrace.

          In LeaseManager#getINodeWithLeases, it is a little counter-intuitive to me that, assuming 4 worker threads, if there are 4 or fewer files with leases only one worker thread runs, but if there are 5 files with leases, all 4 worker threads run.

          LeaseManager#getINodeWithLeases
          if (!inode.isFile()) {
                        continue;
                      }
          

          What if the inode is a reference? Say an open file happens to be a reference, then it won't be included, correct? I am not sure how LeaseManager handles reference though.

          manojg Manoj Govindassamy added a comment -

          Thanks for the review, Wei-Chiu Chuang. Attached the v06 patch addressing the following comments. Please take a look at the latest patch.

          LOG.warn("INode filter task encountered exception: "+ e); Please use LOG.warn(Object message, Throwable t) to preserve stacktrace.

          Done.

          In LeaseManager#getINodeWithLeases, It is a little counter-intuitive to me that, assuming 4 worker threads, if there are 4 or less files with leases only one worker thread is run, but if there are 5 files with leases, 4 worker threads all run.

          Modified the worker model so that workers are spun up based on the open file count. There are now at most 4 workers, and each worker handles a minimum of 512 inodes. So for up to 512 inodes there is 1 worker; from 513 up to 1024 inodes there are 2 workers; and so on.
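          For illustration only, the sizing rule above works out to a ceiling division capped at the pool size. This is a hedged sketch with made-up names (WorkerSizing, workerCount, the constants), not the actual LeaseManager code:

```java
// Illustrative sketch of the worker sizing described above: at most 4
// workers, each handling a minimum of 512 inodes. Names and constants
// here are assumptions for illustration, not the patch's identifiers.
public class WorkerSizing {
    static final int MAX_WORKERS = 4;
    static final int MIN_INODES_PER_WORKER = 512;

    static int workerCount(int openFileCount) {
        if (openFileCount <= 0) {
            return 0;
        }
        // Ceiling division: one worker per 512 inodes, rounded up,
        // then capped at the maximum pool size.
        int workers =
            (openFileCount + MIN_INODES_PER_WORKER - 1) / MIN_INODES_PER_WORKER;
        return Math.min(MAX_WORKERS, workers);
    }

    public static void main(String[] args) {
        System.out.println(workerCount(512));   // 1
        System.out.println(workerCount(513));   // 2
        System.out.println(workerCount(1024));  // 2
        System.out.println(workerCount(50000)); // 4
    }
}
```

          Under this rule, 512 open files map to 1 worker, 513 to 2, and any count of 1537 or more saturates the pool at 4 workers.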

          What if the inode is a reference? Say an open file happens to be a reference, then it won't be included, correct?

          isFile() returns true for an INodeReference whose target is indeed a file, so this works.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 35s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          0 mvndep 0m 25s Maven dependency ordering for branch
          +1 mvninstall 15m 40s trunk passed
          +1 compile 1m 27s trunk passed
          +1 checkstyle 0m 48s trunk passed
          +1 mvnsite 1m 26s trunk passed
          +1 mvneclipse 0m 28s trunk passed
          -1 findbugs 1m 22s hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings.
          -1 findbugs 1m 40s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 1m 4s trunk passed
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 33s the patch passed
          +1 compile 1m 26s the patch passed
          +1 javac 1m 26s the patch passed
          -0 checkstyle 0m 50s hadoop-hdfs-project: The patch generated 2 new + 788 unchanged - 4 fixed = 790 total (was 792)
          +1 mvnsite 1m 27s the patch passed
          +1 mvneclipse 0m 27s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 2s The patch has no ill-formed XML file.
          +1 findbugs 3m 39s the patch passed
          +1 javadoc 1m 8s the patch passed
          +1 unit 1m 20s hadoop-hdfs-client in the patch passed.
          +1 unit 94m 4s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          133m 6s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ac17dc
          JIRA Issue HDFS-11402
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12864389/HDFS-11402.06.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux d142ce37debf 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / f356f0f
          Default Java 1.8.0_121
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19165/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19165/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19165/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19165/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19165/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          andrew.wang Andrew Wang added a comment -

          Hi Manoj, took another look; mostly just code cleanup-type comments. I would also be interested to know how you chose 512 inodes/thread, as it seems like a magic number. Was this based on benchmarking?

          the byte[][] is already available getINodesAndPaths() along with INode[]. Wouldn't it be a lot slower to have one more helper method to construct the same byte[][] ?

          It seems equivalent if we have two functions as follows:

            private static INode[] getINodes(final INode inode) {
              int depth = 0, index;
              INode tmp = inode;
              while (tmp != null) {
                depth++;
                tmp = tmp.getParent();
              }
              INode[] inodes = new INode[depth];
              tmp = inode;
              index = depth;
              while (tmp != null) {
                index--;
                inodes[index] = tmp;
                tmp = tmp.getParent();
              }
              return inodes;
            }
          
            private static byte[][] getPaths(final INode[] inodes) {
              byte paths[][] = new byte[inodes.length][];
              for (int i=0; i<inodes.length; i++) {
                paths[i] = inodes[i].getKey();
              }
              return paths;
            }
          

          Was this earlier comment addressed?

          isDescendant, if we're an IIP, can we simply look in our array of INodes for the specified ancestor? This method looks expensive right now.

          hasReadLock checks hasWriteLock, so the additional hasWriteLock check is unnecessary:

              assert fsnamesystem.hasReadLock() || fsnamesystem.hasWriteLock();
          
          • We could use a stride increment to simplify the work partitioning logic (and make work distribution more even):
          for (int idx = workerIdx; idx < inodeCount; idx += workerCount) {
            // do work
          }
          

          We aren't using workerIdx when getting the future results, so can simplify with foreach:

              for (Future<List<INodesInPath>> f: futureList) {
                try {
                  iipSet.addAll(f.get());
                } catch (Exception e) {
                  LOG.warn("INode filter task encountered exception: ", e);
                }
              }
          

          Mind filing a JIRA to migrate LeaseManager over to SLF4J as well? We're adding a pretty gnarly new log that would be more readable with SLF4J.

          manojg Manoj Govindassamy added a comment -

          Thanks for the review comments, Andrew Wang. Attached the v07 patch to address the following comments. Please take a look.

          Would also be interested to know how you chose 512 inodes/thread as it seems like a magic number, was this based on benchmarking?

          This approach addresses the concern about an unnecessary thread pool for a small number of open files. Yes, I tested with 1K, 5K, 10K, 15K, and 50K open files and chose this number at the point where the thread pool started to show an improvement.

          the byte[][] is already available getINodesAndPaths() along with INode[]. Wouldn't it be a lot slower to have one more helper method to construct the same byte[][] ?

          I was concerned about having to do another while loop when the former approach coalesced both into one. Anyway, the added complexity isn't much. Changed the code as suggested.

          isDescendant, if we're an IIP, can we simply look in our array of INodes for the specified ancestor? This method looks expensive right now.

          Done.

          hasReadLock checks hasWriteLock, so the additional hasWriteLock check is unnecessary:

          Done.

          We could use a stride increment to simplify the work partitioning logic (and make work distribution more even):

          The suggested approach doesn't collect the partition's inodes for a worker all at once and needs multiple iterations. If something can be done better here, I will take it up in another JIRA.
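          To make the trade-off concrete, here is a small hedged sketch of the two partitioning schemes under discussion, using plain index lists; the class and method names (Partitioning, stride, chunk) are illustrative, not from the patch. The stride scheme distributes work evenly in one loop per worker, but when the items live in a collection rather than an indexed array, each worker must still walk the whole collection to reach its stride positions, which is the concern raised above:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative comparison of the two work-partitioning schemes discussed:
// a stride scheme (worker k takes items k, k+W, k+2W, ...) versus
// contiguous chunks. Both cover every index exactly once.
public class Partitioning {
    // Stride partitioning: even distribution, but assumes random access.
    static List<Integer> stride(int workerIdx, int workerCount, int itemCount) {
        List<Integer> mine = new ArrayList<>();
        for (int idx = workerIdx; idx < itemCount; idx += workerCount) {
            mine.add(idx);
        }
        return mine;
    }

    // Contiguous chunks: each worker gets one consecutive slice.
    static List<Integer> chunk(int workerIdx, int workerCount, int itemCount) {
        int chunkSize = (itemCount + workerCount - 1) / workerCount; // ceiling
        List<Integer> mine = new ArrayList<>();
        int start = workerIdx * chunkSize;
        int end = Math.min(itemCount, start + chunkSize);
        for (int idx = start; idx < end; idx++) {
            mine.add(idx);
        }
        return mine;
    }

    public static void main(String[] args) {
        System.out.println(stride(1, 4, 10)); // [1, 5, 9]
        System.out.println(chunk(1, 4, 10));  // [3, 4, 5]
    }
}
```

          With 10 items and 4 workers, striding gives each worker 2 or 3 items, while chunking can leave the last worker with a single item.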

          We aren't using workerIdx when getting the future results, so can simplify with foreach:

          Done.

          Mind filing a JIRA to migrate LeaseManager over to SLF4J as well? We're adding a pretty gnarly new log that would be more readable with SLF4j.

          Sure, I will update after filing the JIRA.

          andrew.wang Andrew Wang added a comment -

          I discussed offline with Manoj on the stride, he's going to make a rev. Otherwise looks good to me!

          manojg Manoj Govindassamy added a comment -

          Thanks for the quick review, Andrew Wang. Attached the v08 patch to address the following. Please take a look.

          We could use a stride increment to simplify the work partitioning logic (and make work distribution more even):

          Done.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 26s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          0 mvndep 0m 9s Maven dependency ordering for branch
          +1 mvninstall 14m 31s trunk passed
          +1 compile 1m 36s trunk passed
          +1 checkstyle 0m 50s trunk passed
          +1 mvnsite 1m 30s trunk passed
          +1 mvneclipse 0m 30s trunk passed
          -1 findbugs 1m 25s hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings.
          -1 findbugs 1m 39s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 1m 3s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 24s the patch passed
          +1 compile 1m 24s the patch passed
          +1 javac 1m 24s the patch passed
          -0 checkstyle 0m 47s hadoop-hdfs-project: The patch generated 2 new + 788 unchanged - 4 fixed = 790 total (was 792)
          +1 mvnsite 1m 25s the patch passed
          +1 mvneclipse 0m 26s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 3m 18s the patch passed
          +1 javadoc 1m 1s the patch passed
          +1 unit 1m 12s hadoop-hdfs-client in the patch passed.
          -1 unit 95m 31s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          132m 9s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.server.namenode.ha.TestHAAppend
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:0ac17dc
          JIRA Issue HDFS-11402
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12864572/HDFS-11402.07.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux ed52ac537083 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 5078df7
          Default Java 1.8.0_121
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19175/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/19175/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/19175/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/19175/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/19175/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/19175/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          manojg Manoj Govindassamy added a comment -

          The above unit test failures are not related to the patch.

          yzhangal Yongjun Zhang added a comment -

          Thanks Andrew Wang for the review, I did another round of review, looks good to me too. +1.

          I'm going to commit soon.

          yzhangal Yongjun Zhang added a comment - edited

          Committed to trunk. Thanks Manoj Govindassamy for the very good contribution!

          And thanks Jing Zhao, Yiqun Lin, and Andrew Wang for the review!

          yzhangal Yongjun Zhang added a comment -

          We might consider backporting to branch-2. The jira is marked incompatible, but the new behavior is disabled by default, so it should be safe to put into branch-2; people can enable it when they need it.

          BTW, noticed a small thing

          If true, snapshots taken will have an immutable shared copy of

          The word "shared" in hdfs-default.xml looks a bit confusing. We do create a copy, and it is not really shared; maybe we can remove this word later.

          Thanks.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11623 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11623/)
          HDFS-11402. HDFS Snapshots should capture point-in-time copies of OPEN (yzhang: rev 20e3ae260b40cd6ef657b2a629a02219d68f162f)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDiffReport.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestLeaseManager.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodesInPath.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotManager.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestOpenFilesWithSnapshot.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectorySnapshottableFeature.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
          manojg Manoj Govindassamy added a comment -

          Thanks for the review and commit help Yongjun Zhang.
          Attached the branch-2 patch. Please take a look and help get it committed.

          brahmareddy Brahma Reddy Battula added a comment -

          This addresses a long-standing snapshot inconsistency. Nice work. It looks like the release notes are not updated; it would be better to update them.
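
          As the release note describes, the length captured for an open file is whatever the NameNode has recorded at snapshot time, so a client that wants its latest writes reflected in the snapshot can hsync with SyncFlag.UPDATE_LENGTH first. A minimal sketch of that flow (the path, snapshot name, and cluster setup are assumptions for illustration; it requires a running HDFS cluster with dfs.namenode.snapshot.capture.openfiles=true and /snapdir made snapshottable):

```java
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream;
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream.SyncFlag;

public class OpenFileSnapshotSketch {
  public static void main(String[] args) throws Exception {
    // Assumes an HDFS cluster with dfs.namenode.snapshot.capture.openfiles=true
    // and /snapdir already snapshottable (hdfs dfsadmin -allowSnapshot /snapdir).
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    HdfsDataOutputStream out =
        (HdfsDataOutputStream) fs.create(new Path("/snapdir/open.log"));
    out.write("partial data".getBytes("UTF-8"));

    // Persist the current length on the NameNode so the snapshot captures it;
    // without UPDATE_LENGTH the snapshot may record a shorter length.
    out.hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH));

    // The snapshot freezes the still-open file at the hsync'ed length.
    fs.createSnapshot(new Path("/snapdir"), "s0");

    out.close();
    fs.close();
  }
}
```

          Note this is a usage sketch, not a test from the patch; it cannot run without a live cluster, which is why the jira's own tests use MiniDFSCluster instead.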

          manojg Manoj Govindassamy added a comment -

          Thanks Brahma Reddy Battula. Updated the release notes.

          manojg Manoj Govindassamy added a comment -

          Changes committed to branch-2. Thanks for the review Yongjun Zhang.


            People

            • Assignee: manojg Manoj Govindassamy
            • Reporter: manojg Manoj Govindassamy
            • Votes: 0
            • Watchers: 17