The MergeTask is responsible for merging small output files into bigger files for the final output table.
The compression settings to be used should be COMPRESSRESULT instead of COMPRESSINTERMEDIATE.
GenMRFileSink1.java:172:
FileSinkOperator newOutput =
(FileSinkOperator)OperatorFactory.getAndMakeChild(
new fileSinkDesc(finalName, ts,
parseCtx.getConf().getBoolVar(HiveConf.ConfVars.COMPRESSINTERMEDIATE)),
fsRS, extract);
Associated mailing list discussion: http://mail-archives.apache.org/mod_mbox/hadoop-hive-user/200908.mbox/%3C794f042d0908122114o7ddb8d18h4f444c1dfa16fa87@mail.gmail.com%3E