Hadoop HDFS / HDFS-1612

HDFS Design Documentation is outdated

Details

    • Reviewed

    Description

      I was trying to discover details about the Secondary NameNode, and came across the description below in the HDFS design doc.

      The NameNode keeps an image of the entire file system namespace and file Blockmap in memory. This key metadata item is designed to be compact, such that a NameNode with 4 GB of RAM is plenty to support a huge number of files and directories. When the NameNode starts up, it reads the FsImage and EditLog from disk, applies all the transactions from the EditLog to the in-memory representation of the FsImage, and flushes out this new version into a new FsImage on disk. It can then truncate the old EditLog because its transactions have been applied to the persistent FsImage. This process is called a checkpoint. In the current implementation, a checkpoint only occurs when the NameNode starts up. Work is in progress to support periodic checkpointing in the near future.

      (emphasis mine).

      Note that this directly conflicts with the HDFS User Guide, which documents periodic checkpointing in its Secondary NameNode section (http://hadoop.apache.org/common/docs/r0.20.2/hdfs_user_guide.html#Secondary+NameNode)
      and its Checkpoint Node section (http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Checkpoint+Node).

      I haven't done a thorough audit of that doc; I only noticed the above inaccuracy.
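
      For readers unfamiliar with the checkpoint step that the quoted passage describes, here is a rough sketch of the idea: replay the EditLog onto the FsImage, persist the result, then truncate the log. This is a toy model, not Hadoop code; the class ToyNameNode, its fields, and the "create"/"delete" operation names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Toy stand-in for NameNode metadata (hypothetical, not a Hadoop API)."""

    # Last persisted namespace: path -> list of block IDs.
    fsimage: dict = field(default_factory=dict)
    # Transactions recorded since the last checkpoint: (op, path, blocks).
    edit_log: list = field(default_factory=list)

    def log(self, op, path, blocks=None):
        """Record a namespace change without touching the FsImage."""
        self.edit_log.append((op, path, blocks))

    def checkpoint(self):
        """Replay the EditLog onto the FsImage, then truncate the log."""
        namespace = dict(self.fsimage)  # in-memory copy of the image
        for op, path, blocks in self.edit_log:
            if op == "create":
                namespace[path] = list(blocks or [])
            elif op == "delete":
                namespace.pop(path, None)
            else:
                raise ValueError(f"unknown operation: {op}")
        self.fsimage = namespace  # "flush" the merged image
        self.edit_log = []        # safe to truncate: edits are now in the image
        return namespace


nn = ToyNameNode(fsimage={"/a": [1]})
nn.log("create", "/b", [2, 3])
nn.log("delete", "/a")
nn.checkpoint()
```

      In this sketch, nothing stops checkpoint() from being called on a timer rather than only at startup, which is roughly the role the user guide gives to the Secondary NameNode and the Checkpoint Node.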

      Attachments

        1. HDFS-1612.patch
          7 kB
          Joe Crobak

            People

              joecrobak Joe Crobak
              joecrobak Joe Crobak