Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Reviewed
Description
Hive produces two types of data files: flat files and SequenceFiles. The syntax should reflect this. Currently the 'compressed' keyword selects the SequenceFile format but does not actually compress the files, which is misleading. In addition, flat files can also be compressed.
The proposal is to replace 'compressed' with 'sequencefile'. Compression should instead be controlled the standard Hadoop way of specifying whether output is compressed ('mapred.output.compress'), i.e. through session options, which will also define the codec etc. The default file format and compression options can be specified in the conf file.
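
A rough sketch of how the proposal might look in a session, to show file format and compression as separate choices. The table and column names are hypothetical, and the placement of the 'sequencefile' keyword in a STORED AS clause is an assumption. 'mapred.output.compression.codec' and GzipCodec are standard Hadoop names used here only as an illustrative codec choice.

```sql
-- Sketch only: page_views and raw_page_views are hypothetical tables.
-- The file format is chosen in the DDL and says nothing about compression.
CREATE TABLE page_views (user_id STRING, url STRING)
STORED AS SEQUENCEFILE;

-- Compression is a session option, set the standard Hadoop way.
SET mapred.output.compress=true;
-- Codec choice is illustrative; any installed Hadoop codec could be named here.
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

-- Files written by this statement use the table's format and the session's compression settings.
INSERT OVERWRITE TABLE page_views
SELECT user_id, url FROM raw_page_views;
```

Under this split, the same SET lines would apply equally to a flat-file table, which is the case the 'compressed' keyword could not express.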