Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
Description
FileSystem#getFileChecksum can take a length parameter, and 0 is a valid value for it. Currently the method returns null when length is 0, from the following code block:
    //compute file MD5
    final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
    switch (crcType) {
    case CRC32:
      return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC, crcPerBlock, fileMD5);
    case CRC32C:
      return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC, crcPerBlock, fileMD5);
    default:
      // If there is no block allocated for the file,
      // return one with the magic entry that matches what previous
      // hdfs versions return.
      if (locatedblocks.size() == 0) {
        return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
      }
      // we should never get here since the validity was checked
      // when getCrcType() was called above.
      return null;
    }
The comment says "we should never get here since the validity was checked", but this path is in fact reached when length is 0. Since we use the MD5-MD5-X approach, empty content is a valid case, for which the MD5 value is d41d8cd98f00b204e9800998ecf8427e. I therefore suggest returning a reasonable value instead of null, so that the result at least carries some useful information, such as values from the block checksum header.
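As a quick check of the value quoted above, the following standalone sketch computes the MD5 digest of empty input with the JDK's java.security.MessageDigest. It does not use any Hadoop classes; the class name EmptyMd5 and the helper md5Hex are hypothetical names chosen only for this illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of Hadoop: shows that the MD5
// digest of zero bytes is a well-defined, non-null value.
public class EmptyMd5 {

    // Returns the lowercase hex form of the MD5 digest of the given bytes.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to treat the byte as unsigned before formatting.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Empty content: prints d41d8cd98f00b204e9800998ecf8427e
        System.out.println(md5Hex(new byte[0]));
    }
}
```

Because the digest of empty content is well defined, a checksum object built from it would still let callers compare two zero-length ranges, which a null return does not allow.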
Attachments
Issue Links
- relates to HDFS-8430 Erasure coding: compute file checksum for striped files (stripe by stripe) (In Progress)