Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: 0.18.3
Fix Version/s: None
Component/s: None
Description
The current behavior of FileOutputFormat.checkOutputSpecs is to fail if the path specified by mapred.output.dir already exists at the start of the job. This protects against accidentally overwriting existing data. There seems to be no harm, then, in relaxing this check slightly so that the output path may exist as long as it is an empty directory.
At a minimum this would allow outputting to the root of S3N buckets, which is currently impossible (https://issues.apache.org/jira/browse/HADOOP-5805).
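The proposed relaxation could look roughly like the sketch below. It is not the Hadoop implementation: to stay runnable on its own it uses java.nio.file as a stand-in for Hadoop's FileSystem API, and the class name RelaxedOutputCheck and method name checkOutputDir are hypothetical. It only illustrates the rule "a missing path or an empty directory is acceptable, anything else is rejected".

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the relaxed output check.
// Assumption: java.nio.file stands in for Hadoop's FileSystem API.
final class RelaxedOutputCheck {

    // Accepts a missing path or an empty directory; rejects everything else.
    static void checkOutputDir(Path outDir) throws IOException {
        if (!Files.exists(outDir)) {
            return; // the case that is already allowed today
        }
        if (!Files.isDirectory(outDir)) {
            throw new FileAlreadyExistsException(
                outDir.toString(), null, "output path exists and is not a directory");
        }
        // Relaxed case: an existing directory is fine only if it has no entries.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(outDir)) {
            if (entries.iterator().hasNext()) {
                throw new FileAlreadyExistsException(
                    outDir.toString(), null, "output directory exists and is not empty");
            }
        }
    }
}
```

Because only the first directory entry is examined, the check stays cheap even for a large directory. An actual patch would need the equivalent listing call on the job's output FileSystem, and its behavior on S3N bucket roots would have to be verified separately.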