  1. Hadoop HDFS
  2. HDFS-9505

HDFS Architecture documentation needs to be refreshed.

    Details

    • Hadoop Flags:
      Reviewed

      Description

      The HDFS Architecture document is out of date with respect to the current design of the system.

      http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html

      There are multiple false statements and omissions of recent features.

      1. HDFS-9505.001.patch
        17 kB
        Masatake Iwasaki
      2. HDFS-9505.002.patch
        23 kB
        Masatake Iwasaki

        Issue Links

          Activity

          cnauroth Chris Nauroth added a comment -

          I haven't done a full read, but here are a few of the problems:

          "There is a plan to support appending-writes to files in the future."

          "HDFS does not yet implement user quotas or access permissions"

          "Currently, automatic restart and failover of the NameNode software to
          another machine is not supported."

          ajisakaa Akira Ajisaka added a comment -

          "There is a plan to support appending-writes to files in the future."

          This sentence was fixed in HDFS-8852. The rest should be fixed.

          iwasakims Masatake Iwasaki added a comment -

          I would like to start working on this. If someone is already on this, ping me please.

          iwasakims Masatake Iwasaki added a comment -

          "Currently, automatic restart and failover of the NameNode software to another machine is not supported."

          This was fixed by HDFS-8914.

          iwasakims Masatake Iwasaki added a comment -

          The statement that "all blocks in a file except the last block are the same size" is not always true after HDFS-3689.

          iwasakims Masatake Iwasaki added a comment -

          I attached 001.

          • fixed description about quota and permission.
          • fixed description about variable block size (HDFS-3689).
          • added description about stale state of datanodes (HDFS-3703).
          • fixed description about snapshot support.
          • fixed description about client-side buffering.
          • added hyperlinks.
          • fixed formatting nits.
          iwasakims Masatake Iwasaki added a comment -

          I reattached the patch to fix a formatting issue.

          iwasakims Masatake Iwasaki added a comment -

          Long lines in the markdown documentation are a side effect of the conversion from APT format. I added line breaks to the lines relevant to this jira for ease of future editing and diff tracking.

          Please use git diff --word-diff to see the content changes.
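
          A minimal sketch of why --word-diff helps when lines have been re-wrapped. Everything here is an assumption for illustration: the throwaway repository stands in for the Hadoop source tree, HdfsDesign.md is a one-sentence stand-in for the real file, and the replacement sentence is invented sample text rather than the wording of the actual patch.

```shell
#!/bin/sh
# Hypothetical demo in a temporary repository; not the Hadoop source tree.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# One long line, like those left behind by the APT-to-markdown conversion.
printf 'HDFS does not yet implement user quotas or access permissions.\n' > HdfsDesign.md
git add HdfsDesign.md
git commit -q -m "original long line"

# Re-wrap the sentence and change a few words (invented sample wording).
printf 'HDFS supports user quotas\nand access permissions.\n' > HdfsDesign.md

# A plain line-based diff reports the whole line as removed and two lines as added.
git --no-pager diff HdfsDesign.md

# --word-diff marks only the changed words: removed words appear
# inside [-...-] and added words inside {+...+}.
git --no-pager diff --word-diff HdfsDesign.md
```

          Because the comparison is word-based, words that merely moved to a different line are not flagged, so only the real wording changes stand out.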

          ajisakaa Akira Ajisaka added a comment -

          Thank you, Masatake Iwasaki, for refreshing the document! It mostly looks good to me. Two comments:
          1. hadoop fs -expunge does not actually delete all the files in trash. Please see HADOOP-12374 for details.

          2. "Work is in progress to expose HDFS through the WebDAV protocol."
          There is a jira for WebDAV (HDFS-225), but it has had no updates for more than 6 years. Instead, we should document that HDFS now supports NFSv3.

          iwasakims Masatake Iwasaki added a comment -

          Thanks for the review comments, Akira Ajisaka.

          "hadoop fs -expunge actually does not delete all the files in trash."

          I'm going to move the relevant part to the "FileSystem Shell" guide and fix the description, because trash is (basically) a feature of hadoop-common and could be relevant to other file systems.

          "There is a jira for WebDAV (HDFS-225) but there have been no updates for more than 6 years. Instead, we should document that HDFS now supports NFSv3."

          Yeah. I agree.

          iwasakims Masatake Iwasaki added a comment -

          I attached an updated patch as 002.

          • made it clearer that moving files to trash is a feature of FS shell.
          • moved the contents of the "HDFS Trash Management" section under the "File Deletes and Undeletes" section.
          • moved part of the contents about the trash feature to the FileSystem Shell guide.
          • removed the description of WebDAV and added the NFS gateway instead.
          hadoopqa Hadoop QA added a comment -
          -1 overall

          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 0s    Docker mode activated.
          +1    @author     0m 0s    The patch does not contain any @author tags.
          +1    mvnsite     1m 55s   trunk passed
          +1    mvnsite     1m 57s   the patch passed
          -1    whitespace  0m 0s    The patch has 141 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1    asflicense  0m 18s   Patch does not generate ASF License warnings.
                            4m 25s

          Subsystem        Report/Notes
          Docker           Image:yetus/hadoop:0ca8df7
          JIRA Patch URL   https://issues.apache.org/jira/secure/attachment/12778742/HDFS-9505.002.patch
          JIRA Issue       HDFS-9505
          Optional Tests   asflicense mvnsite
          uname            Linux a5757710af6e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool       maven
          Personality      /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision     trunk / 52ad912
          whitespace       https://builds.apache.org/job/PreCommit-HDFS-Build/13952/artifact/patchprocess/whitespace-eol.txt
          modules          C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: .
          Max memory used  32MB
          Powered by       Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org
          Console output   https://builds.apache.org/job/PreCommit-HDFS-Build/13952/console

          This message was automatically generated.

          ajisakaa Akira Ajisaka added a comment -

          +1, thanks Masatake.

          ajisakaa Akira Ajisaka added a comment -

          Committed v2 patch to trunk, branch-2, branch-2.8, and branch-2.7. Thanks Masatake Iwasaki for the contribution!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #9007 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9007/)
          HDFS-9505. HDFS Architecture documentation needs to be refreshed. (aajisaka: rev fa544020f6f71ee993f047c9b986c047a25ed84c)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
          • hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
          iwasakims Masatake Iwasaki added a comment -

          Thanks, Akira Ajisaka!

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Closing the JIRA as part of 2.7.3 release.

          cnauroth Chris Nauroth added a comment -

          FYI, I have filed HDFS-11995 for another inaccuracy in the HDFS Architecture documentation that remains even after this patch was committed.


            People

            • Assignee:
              iwasakims Masatake Iwasaki
              Reporter:
              cnauroth Chris Nauroth
            • Votes:
              0
              Watchers:
              9