Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Reviewed
Description
Currently, 'hadoop fs -text myfile' looks at the first few magic bytes of a file to determine whether it is gzip compressed or a sequence file. This means 'fs -text' cannot properly decode .deflate or .bz2 files (or other codecs specified via configuration).
It should be fairly straightforward to add support for other codecs by checking the file extension against the CompressionCodecFactory to retrieve an appropriate Codec.
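A minimal sketch of the proposed lookup order, in plain Java with no Hadoop dependency. The class name `CodecPicker`, the method `pick`, and the hard-coded suffix table are hypothetical and exist only for illustration. In Hadoop itself the suffix lookup would presumably go through `CompressionCodecFactory.getCodec(Path)`, which returns null when no configured codec matches the file name. Checking the magic bytes before the extension is an assumption here, not something the description specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone illustration; this is not the FsShell implementation.
final class CodecPicker {
    // Assumed suffix table, standing in for the codecs that a
    // CompressionCodecFactory would load from configuration.
    private static final Map<String, String> BY_SUFFIX = new LinkedHashMap<>();
    static {
        BY_SUFFIX.put(".gz", "gzip");
        BY_SUFFIX.put(".deflate", "deflate");
        BY_SUFFIX.put(".bz2", "bzip2");
    }

    // fileName: name of the file being read; head: its first few bytes.
    static String pick(String fileName, byte[] head) {
        // Existing behaviour: recognise gzip (0x1f 0x8b) by magic bytes.
        if (head.length >= 2
                && (head[0] & 0xff) == 0x1f
                && (head[1] & 0xff) == 0x8b) {
            return "gzip";
        }
        // Existing behaviour: sequence files start with the bytes "SEQ".
        if (head.length >= 3
                && head[0] == 'S' && head[1] == 'E' && head[2] == 'Q') {
            return "sequencefile";
        }
        // Proposed addition: fall back to the file extension.
        for (Map.Entry<String, String> entry : BY_SUFFIX.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        // No magic bytes and no known suffix: treat as uncompressed text.
        return "plain";
    }
}
```

With this ordering, a file named `part-00000.deflate` resolves to the deflate codec even though its leading bytes match neither check, while a gzip file without a `.gz` suffix is still detected as before.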