When I work on
HDFS-1070, I come across a very strange bug. Occasionally when loading a compressed image, NameNode throws an exception complaining that the image file is corrupt. However, the result returned by the md5sum utility matches the checksum value stored in the VERSION file.
It turns out the image loader leaves 4 bytes unread after loading all the real data of an image. Those unread bytes may be some compression-related meta-info. The image loader should make sure to read to the end of file in order for an correct checksum.