Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Won't Fix
-
0.12.1
-
None
-
None
Description
I'm not completely sure this is the right place for this issue, since Pig is involved. However, Pig leaves decompression of tar.gz files to hadoop-common.
How to reproduce the issue:
- put a simple file (file1) containing arbitrary text lines into in1 in HDFS
- put the same file (file1), compressed with tar -cvzf file1.tar.gz file, into in2 in HDFS
- run these simple commands in Pig:
raw = load 'in1/' USING TextLoader AS (line: bytearray);
dump raw;
- run them for both inputs (the compressed and the uncompressed file)
- with the compressed version, the first line of the output is garbage:
a0000644000570000001440000000002512534073736011260 0ustar loadhadoopusersa
...
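The strange first line looks like a tar header (note the "ustar" marker next to the file name, mode, owner and group fields). A likely explanation, assuming the gzip layer is stripped on read while the tar container inside it is left untouched, can be sketched outside Hadoop with plain Python. The helper name first_line_after_gunzip and the sample payload are made up for illustration; nothing here calls Pig or hadoop-common.

```python
import gzip
import io
import tarfile


def first_line_after_gunzip(name: str, payload: bytes) -> bytes:
    """Build a .tar.gz in memory, undo only the gzip layer, return the first text line."""
    buf = io.BytesIO()
    # Equivalent in spirit to: tar -cvzf file1.tar.gz <name>
    with tarfile.open(fileobj=buf, mode="w:gz", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    # Strip gzip only; what remains is a raw tar stream, not the original text.
    raw_tar = gzip.decompress(buf.getvalue())
    # A line-oriented reader splits on newlines, so the 512-byte tar header
    # ends up glued to the front of the first real line.
    return raw_tar.split(b"\n", 1)[0]


line = first_line_after_gunzip("a", b"hello\nworld\n")
# Drop the NUL padding to get something close to what a text dump would show.
print(line.replace(b"\x00", b"").decode("ascii"))
```

If that assumption holds, compressing the file with plain gzip instead of tar -z should avoid the header line, since there is then no tar container left after decompression.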