Currently, any time the NameNode is not running, an HDFS filesystem will accept the 'format' command and duly format itself. Those of us who run multi-PB HDFS filesystems are really quite uncomfortable with this behavior. The format command does ask for a "Y/N" confirmation, but if the person running it genuinely believes they are doing the right thing, the filesystem will be formatted.
This patch adds a configuration parameter to the namenode, dfs.namenode.support.allowformat, which defaults to "true" and so preserves the current behavior: formatting is always allowed if the NameNode is down or no other process is holding the namenode lock. If dfs.namenode.support.allowformat is set to "false", the NameNode will refuse to be formatted until the parameter is changed back to "true".
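The guard described above can be sketched roughly as follows. This is only an illustration, not the code in the attached patch: it uses java.util.Properties as a stand-in for the Hadoop configuration object, and the class name FormatGuard and its methods are hypothetical. Only the key name dfs.namenode.support.allowformat and its default of "true" come from the description.

```java
import java.util.Properties;

// Hypothetical sketch of the format guard; not the actual patch.
public class FormatGuard {
    // Key name taken from the description above.
    public static final String ALLOW_FORMAT_KEY = "dfs.namenode.support.allowformat";

    // Formatting is allowed unless the key is explicitly set to "false".
    // An unset key falls back to "true", i.e. the current behavior.
    public static boolean isFormatAllowed(Properties conf) {
        String value = conf.getProperty(ALLOW_FORMAT_KEY, "true");
        return !"false".equalsIgnoreCase(value.trim());
    }

    // Would be called before any format work starts; refuses outright
    // when formatting has been disabled in the configuration.
    public static void checkFormatAllowed(Properties conf) {
        if (!isFormatAllowed(conf)) {
            throw new IllegalStateException(
                "Format refused: " + ALLOW_FORMAT_KEY + " is set to false");
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(isFormatAllowed(conf)); // key unset: default applies

        conf.setProperty(ALLOW_FORMAT_KEY, "false");
        System.out.println(isFormatAllowed(conf)); // key explicitly disabled
    }
}
```

In this sketch the check sits in front of the "Y/N" prompt, so a disabled filesystem never reaches the confirmation step at all.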
The general idea is that for production HDFS filesystems, the user would format the HDFS once, then set dfs.namenode.support.allowformat to "false" for all time.
The attached patch was generated against trunk and +1's on my test machine. We have a 0.20 version that we are using in our cluster as well.