Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: 2.4.1
None
None
Description
On some occasions, when reading MapFiles that were generated by MapFileOutputFormat with BZIP2 BLOCK compression, calling getClosest(key, value, true) on the MapFile reader throws an IOException with the message "File is corrupt!". Running "hdfs fsck" reports that everything is OK, and the underlying data and index files can also be read correctly with a SequenceFile.Reader.
The exception happens in the readBlock() method of the SequenceFile.Reader class.
My guess is that, since MapFile.Reader's seekInternal() method calls "seek()" instead of "sync()", nothing verifies that the cursor is actually positioned at a valid location before the next block is read.
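To make the suspected failure mode concrete, here is a minimal, self-contained sketch of a toy block format. It is not Hadoop code: the class SyncMarkerDemo, the 4-byte SYNC marker, and the block layout are all hypothetical and exist only for illustration. It assumes that a block reader rejects any position that does not start with the sync marker, so a plain seek to an offset inside a block fails, while scanning forward to the next marker first succeeds.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Toy model only: NOT the SequenceFile format or the Hadoop API.
final class SyncMarkerDemo {
    // Hypothetical marker written in front of every block.
    static final byte[] SYNC = {(byte) 0xDE, (byte) 0xAD, (byte) 0xBE, (byte) 0xEF};

    // Serialize each payload as: SYNC marker, 4-byte length, payload bytes.
    static byte[] writeBlocks(String... payloads) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String p : payloads) {
            byte[] body = p.getBytes(StandardCharsets.UTF_8);
            out.write(SYNC, 0, SYNC.length);
            out.write(ByteBuffer.allocate(4).putInt(body.length).array(), 0, 4);
            out.write(body, 0, body.length);
        }
        return out.toByteArray();
    }

    // True if the marker starts exactly at pos.
    static boolean markerAt(byte[] file, int pos) {
        return pos >= 0 && pos + SYNC.length <= file.length
                && Arrays.equals(Arrays.copyOfRange(file, pos, pos + SYNC.length), SYNC);
    }

    // Read the block at pos; refuse any position that is not a block boundary.
    static String readBlock(byte[] file, int pos) throws IOException {
        int start = pos + SYNC.length + 4;
        if (!markerAt(file, pos) || start > file.length) {
            throw new IOException("File is corrupt!");
        }
        int len = ByteBuffer.wrap(file, pos + SYNC.length, 4).getInt();
        if (len < 0 || start + len > file.length) {
            throw new IOException("File is corrupt!");
        }
        return new String(file, start, len, StandardCharsets.UTF_8);
    }

    // Scan forward from pos to the next marker; returns file.length if none is found.
    static int sync(byte[] file, int pos) {
        for (int i = Math.max(pos, 0); i + SYNC.length <= file.length; i++) {
            if (markerAt(file, i)) {
                return i;
            }
        }
        return file.length;
    }
}
```

In this toy, the file itself is intact (every block can be read sequentially from offset 0), yet readBlock at an offset that falls inside a block reports corruption. Calling sync first moves the position to the next block boundary, where the read succeeds. Whether the real issue follows this pattern is still only a guess.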