I was trying to discover details about the Secondary NameNode, and came across the description below in the HDFS design doc.
The NameNode keeps an image of the entire file system namespace and file Blockmap in memory. This key metadata item is designed to be compact, such that a NameNode with 4 GB of RAM is plenty to support a huge number of files and directories. When the NameNode starts up, it reads the FsImage and EditLog from disk, applies all the transactions from the EditLog to the in-memory representation of the FsImage, and flushes out this new version into a new FsImage on disk. It can then truncate the old EditLog because its transactions have been applied to the persistent FsImage. This process is called a checkpoint. In the current implementation, a checkpoint only occurs when the NameNode starts up. Work is in progress to support periodic checkpointing in the near future.
Note that this directly conflicts with information in the hdfs user guide, http://hadoop.apache.org/common/docs/r0.20.2/hdfs_user_guide.html#Secondary+NameNode
I haven't done a thorough audit of that doc-- I only noticed the above inaccuracy.
- relates to
HDFS-1388 HDFS forrest doc out-dated