Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Snapshot (HDFS-2802)
    • Fix Version/s: Snapshot (HDFS-2802)
    • Component/s: datanode, namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      This jira adds the internal data structures and the computation needed to calculate and represent the diff between two snapshots, or between a snapshot and the current tree.

      Specifically, a new method getSnapshotDiffReport(Path, String, String) is added to FSNamesystem to compute the snapshot diff. The snapshot diff is represented as a SnapshotDiffReport internally. In later jiras we will add support to present the SnapshotDiffReport to end users.

      1. HDFS-4131.006.patch
        29 kB
        Jing Zhao
      2. HDFS-4131.005.patch
        30 kB
        Jing Zhao
      3. HDFS-4131.004.patch
        28 kB
        Jing Zhao
      4. HDFS-4131.003.patch
        27 kB
        Jing Zhao
      5. HDFS-4131.002.patch
        27 kB
        Jing Zhao
      6. HDFS-4131.001.patch
        25 kB
        Jing Zhao

          Activity

          Jing Zhao added a comment -

          Initial patch that only covers the diff computation in NN. A simple testcase is also included. Further work for reporting the diff to users will be done in separate jiras.

          Jing Zhao added a comment -

          Updated the patch based on "HDFS-4414+4131.002.patch" in HDFS-4414, fixing the code that checks whether a directory's metadata has changed between snapshots.

          Jing Zhao added a comment -

          Rebase the patch.

          Jing Zhao added a comment -

          The diff printout from the new testcase TestSnapshotDiffReport looks like:

          Diffence between snapshot s0 and snapshot s2 under directory /TestSnapshot/sub1:
          M	/TestSnapshot/sub1
          +	/TestSnapshot/sub1/file15
          -	/TestSnapshot/sub1/file12
          M	/TestSnapshot/sub1/file11
          M	/TestSnapshot/sub1/file13
          
          Diffence between snapshot s0 and snapshot s5 under directory /TestSnapshot/sub1:
          M	/TestSnapshot/sub1
          +	/TestSnapshot/sub1/file15
          +	/TestSnapshot/sub1/subsub1
          -	/TestSnapshot/sub1/file12
          M	/TestSnapshot/sub1/file10
          M	/TestSnapshot/sub1/file11
          M	/TestSnapshot/sub1/file13
          
          Diffence between snapshot s0 and current directory under directory /TestSnapshot/sub1:
          M	/TestSnapshot/sub1
          +	/TestSnapshot/sub1/file15
          +	/TestSnapshot/sub1/subsub1
          -	/TestSnapshot/sub1/file12
          M	/TestSnapshot/sub1/file10
          M	/TestSnapshot/sub1/file11
          M	/TestSnapshot/sub1/file13
          

          where M/+/-/R denote modified/created/deleted/renamed, respectively (rename is not currently supported in the diff computation).
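
          As an illustration of the label scheme described above, the following minimal sketch models one report line as a change kind plus a path and renders it as "label, tab, path". The names DiffPrintoutSketch, ChangeType, Entry and format are hypothetical and are not taken from the patch; the actual classes in HDFS-4131 may look different.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; these names are not taken from the HDFS-4131 patch. */
class DiffPrintoutSketch {
    /** The four change kinds named in the comment, with their one-character labels. */
    enum ChangeType {
        MODIFIED("M"), CREATED("+"), DELETED("-"), RENAMED("R");

        final String label;

        ChangeType(String label) {
            this.label = label;
        }
    }

    /** One line of the report: a change kind plus the affected path. */
    static final class Entry {
        final ChangeType type;
        final String path;

        Entry(ChangeType type, String path) {
            this.type = type;
            this.path = path;
        }

        @Override
        public String toString() {
            return type.label + "\t" + path;
        }
    }

    /** Renders entries one per line, with label and path separated by a tab. */
    static String format(List<Entry> entries) {
        StringBuilder sb = new StringBuilder();
        for (Entry e : entries) {
            sb.append(e).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry(ChangeType.MODIFIED, "/TestSnapshot/sub1"));
        entries.add(new Entry(ChangeType.CREATED, "/TestSnapshot/sub1/file15"));
        entries.add(new Entry(ChangeType.DELETED, "/TestSnapshot/sub1/file12"));
        System.out.print(format(entries));
    }
}
```

          Keeping the entries as objects and formatting them in a separate step leaves the choice of output format to the caller.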

          Daryn Sharp added a comment -

          As a high-level comment, shouldn't this be a more programmatic API, i.e. a list of objects containing the path and the snapshot state? It should be up to a tool to decide how to format the diff.

          Jing Zhao added a comment -

          Thanks for the comment, Daryn! The above printout is only for testing. This jira implements only the internal data structures and the computation process. HDFS-4414 develops an API for users to get the diff report between two snapshots, where a DiffReport is provided containing "a list of objects containing the path and the snapshot state". A separate jira (to be created later) will provide CLI and JMX support to present and format the diff.

          Suresh Srinivas added a comment -

          This patch does not apply on top of the latest branch. Does it need a rebase?

          Jing Zhao added a comment -

          Rebase the patch.

          Suresh Srinivas added a comment -

          Can you please add details of what this patch does?

          Jing Zhao added a comment -

          Updated the patch. FSNamesystem#getSnapshotDiffReport no longer distinguishes earlier from later snapshots. Instead, it only computes the diff between a source and a target snapshot, where the source may have been taken after the target.
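
          To illustrate what a direction-agnostic diff implies, here is a toy sketch in which a "snapshot" is simply a map from path to modification time. This is an assumption made for illustration only; the namenode computes the diff from its own snapshot data structures, and DirectionalDiffSketch and diff are hypothetical names. Swapping source and target turns created entries into deleted ones and vice versa, while modified entries stay modified.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: a "snapshot" here is a map from path to modification time. */
class DirectionalDiffSketch {
    /** Returns path -> label ("+", "-" or "M") describing the change from 'from' to 'to'. */
    static Map<String, String> diff(Map<String, Long> from, Map<String, Long> to) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, Long> e : from.entrySet()) {
            Long other = to.get(e.getKey());
            if (other == null) {
                result.put(e.getKey(), "-");   // present in 'from' only
            } else if (!other.equals(e.getValue())) {
                result.put(e.getKey(), "M");   // present in both, but changed
            }
        }
        for (String path : to.keySet()) {
            if (!from.containsKey(path)) {
                result.put(path, "+");         // present in 'to' only
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> s0 = new TreeMap<>();
        s0.put("/dir/file11", 1L);
        s0.put("/dir/file12", 1L);

        Map<String, Long> s2 = new TreeMap<>();
        s2.put("/dir/file11", 2L);
        s2.put("/dir/file15", 2L);

        // Source taken before target.
        System.out.println(diff(s0, s2)); // {/dir/file11=M, /dir/file12=-, /dir/file15=+}
        // Source taken after target: "+" and "-" swap, "M" is unchanged.
        System.out.println(diff(s2, s0)); // {/dir/file11=M, /dir/file12=+, /dir/file15=-}
    }
}
```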

          Suresh Srinivas added a comment -

          Comments:

          1. Consider renaming?
            • sourceSnapshotName -> fromSnapshot
            • targetSnapshotName -> toSnapshot
            • DiffBetweenSnapshot -> SnapshotDiffReport

          +1 with those changes.

          Jing Zhao added a comment -

          Update the patch based on Suresh's comments.

          Suresh Srinivas added a comment -

          I committed the change to the branch.

          Thank you Jing!

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-Snapshots-Branch-build #86 (See https://builds.apache.org/job/Hadoop-Hdfs-Snapshots-Branch-build/86/)
          HDFS-4131. Add capability to namenode to get snapshot diff. Contributed by Jing Zhao. (Revision 1440152)

          Result = FAILURE
          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1440152
          Files :

          • /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt
          • /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
          • /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
          • /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
          • /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectoryWithSnapshot.java
          • /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
          • /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/diff/Diff.java
          • /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDiffReport.java

            People

            • Assignee:
              Jing Zhao
              Reporter:
              Suresh Srinivas
            • Votes:
              0
              Watchers:
              7

              Dates

              • Created:
                Updated:
                Resolved:
