Description
org.apache.hadoop.hbase.mapreduce.Export should set compression codec
In createSubmittableJob(), the following should be added:
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, org.apache.hadoop.io.compress.GzipCodec.class);
In my experiments, this reduced the size of the Export output by 10% to 50%.
SequenceFileInputFormat, which the Import tool uses, detects GzipCodec on its own (the codec is recorded in the SequenceFile header), so the Import class needs no change.
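How much the output shrinks depends on how repetitive the exported cells are. The following is a minimal standalone sketch for measuring that ratio on a sample. It uses only java.util.zip rather than Hadoop's GzipCodec (both emit gzip streams), and the class name GzipRatioSketch and the row-like payload are hypothetical, made up for illustration. Real Export output will compress differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of HBase or Hadoop.
class GzipRatioSketch {

    /** Builds a made-up payload of repeated, row-like text records. */
    static byte[] repetitivePayload(int rows) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < rows; i++) {
            sb.append("row-").append(i)
              .append("\tcf:qualifier\tvalue-").append(i % 10)
              .append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Returns the number of bytes produced by gzip-compressing the input. */
    static int gzipSize(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the gzip stream flushes the trailer into the buffer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        byte[] raw = repetitivePayload(1000);
        int compressed = gzipSize(raw);
        double reduction = 100.0 * (raw.length - compressed) / raw.length;
        System.out.printf("raw=%d bytes, gzip=%d bytes, reduction=%.1f%%%n",
                raw.length, compressed, reduction);
    }
}
```

Running main prints the raw size, the compressed size, and the percentage reduction for the sample payload. Swapping in bytes taken from a real table gives a rough estimate before enabling compression on the job.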
Attachments
Issue Links
- is depended upon by: HBASE-2434 Add scanner caching option to Export and write buffer option for Import (Closed)
- is duplicated by: HBASE-3166 HBase exporter should compress output files by default (or at least allow this as an option) (Closed)