Details
Type: Improvement
Status: Open
Priority: Trivial
Resolution: Unresolved
Affects Version/s: 2.8.1, 3.0.0-alpha3
None
None
Description
It is confusing to have two different ways to set the Sequence File compression type.
In a basic configuration, I can set either mapreduce.output.fileoutputformat.compress.type or io.seqfile.compression.type. To set a cluster-wide default, I should set mapreduce.output.fileoutputformat.compress.type in the cluster environment's mapred-site.xml file.
Please remove references to the magic string io.seqfile.compression.type, remove the setDefaultCompressionType method, and have getDefaultCompressionType return a hard-coded CompressionType.RECORD. This will make administration easier, as I will only have to interrogate one configuration property.
/**
 * Get the compression type for the reduce outputs
 * @param job the job config to look in
 * @return the kind of compression to use
 */
static public CompressionType getDefaultCompressionType(Configuration job) {
  String name = job.get("io.seqfile.compression.type");
  return name == null ? CompressionType.RECORD :
    CompressionType.valueOf(name);
}

/**
 * Set the default compression type for sequence files.
 * @param job the configuration to modify
 * @param val the new compression type (none, block, record)
 */
static public void setDefaultCompressionType(Configuration job,
                                             CompressionType val) {
  job.set("io.seqfile.compression.type", val.toString());
}
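The requested behavior could look roughly like the following self-contained sketch. It is not a patch against Hadoop: a plain Map stands in for Configuration, the nested enum stands in for SequenceFile.CompressionType, and the class name CompressionTypeSketch and the helper getOutputCompressionType are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class CompressionTypeSketch {
    // Stand-in for org.apache.hadoop.io.SequenceFile.CompressionType.
    enum CompressionType { NONE, RECORD, BLOCK }

    // The single property that would remain after the proposed cleanup.
    static final String COMPRESS_TYPE_KEY =
        "mapreduce.output.fileoutputformat.compress.type";

    // Proposed behavior: ignore the configuration and always return RECORD.
    static CompressionType getDefaultCompressionType(Map<String, String> conf) {
        return CompressionType.RECORD;
    }

    // Hypothetical helper: resolve the output compression type from the one
    // remaining key, falling back to the hard-coded default when it is unset.
    static CompressionType getOutputCompressionType(Map<String, String> conf) {
        String name = conf.get(COMPRESS_TYPE_KEY);
        return name == null
            ? getDefaultCompressionType(conf)
            : CompressionType.valueOf(name);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // The legacy key no longer has any effect.
        conf.put("io.seqfile.compression.type", "BLOCK");
        System.out.println(getOutputCompressionType(conf)); // prints RECORD

        // Only the mapreduce.* key is consulted.
        conf.put(COMPRESS_TYPE_KEY, "BLOCK");
        System.out.println(getOutputCompressionType(conf)); // prints BLOCK
    }
}
```

With this shape, an administrator checking the effective compression type only needs to look at mapreduce.output.fileoutputformat.compress.type.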