Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Won't Fix
-
0.4.0
-
None
-
None
-
None
Description
Problem 1) use of .bz2 file extension does not store results bzip2 compressed in Local mode (-exectype local)
If I use the .bz2 filename extension in a STORE statement on HDFS, the results are stored with bzip2 compression.
If I use the .bz2 filename extension in a STORE statement on local file system, the results are NOT stored with bzip2 compression.
compact.bz2.pig:
A = load 'events.test' using PigStorage(); store A into 'events.test.bz2' using PigStorage(); C = load 'events.test.bz2' using PigStorage(); C = limit C 10; dump C;
-bash-3.00$ pig -exectype local compact.bz2.pig -bash-3.00$ file events.test events.test: ASCII English text, with very long lines -bash-3.00$ file events.test.bz2 events.test.bz2: ASCII English text, with very long lines -bash-3.00$ cat events.test | bzip2 > events.test.bz2 -bash-3.00$ file events.test.bz2 events.test.bz2: bzip2 compressed data, block size = 900k
The output format in local mode is definitely not bzip2, but it should be.
Problem 2) pig in local mode does not decompress bzip2 compressed files, but should, to be consistent with HDFS read.bz2.pig:
A = load 'events.test.bz2' using PigStorage();
A = limit A 10;
dump A;
The output should be human readable but is instead garbage, indicating no decompression took place during the load:
-bash-3.00$ pig -exectype local read.bz2.pig
USING: /grid/0/gs/pig/current
2009-04-03 18:26:30,455 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
2009-04-03 18:26:30,456 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
(BZh91AY&SYoz?u??@{????????????????x_?d?|u??-mK?;????????????4?C)
((R? 6?*m?&?g, ?6?Zj?k,?0???QT?d?hY?#m??J?>????[j?z?m?t?u?K)K5+)?m?E7j?X?8??????a
U?p@@MT?$?B?P??N=?(??z<}GK?E
U??w??wg,?u?$?T???((Q!D?=`?}h????P_|=?(??2?m=???xG?(?rC?B?(33:4?N?????t??|T???k??NT?x?=?fyv?w>f????4z?4t?)
(?oou?t?Kwl?3?nCM?WS?;l?P?s?x
a?e??)B??9? ?44
((??@4?)
(f????)
(?@+?d?0@>?U)
(Q?SR)
-bash-3.00$