While recently evaluating the code in HadoopMapReduceCommitProtocol#commitJob, I found a problematic code path in the dynamicPartitionOverwrite == true scenario:
Assuming dynamicPartitionOverwrite == true, we have the following sequence of events (sketched in simplified form after the list):
- Block 1 deletes all parent directories of filesToMove.values
- Block 2 attempts to rename all filesToMove.keys to filesToMove.values
- Block 3 does directory-level renames to place files into their final locations
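For reference, here is a simplified sketch of that structure. This is a paraphrase rather than the exact Spark source; the commitJobSketch signature and parameter names are only for illustration:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Simplified sketch of the three blocks in commitJob (not the exact Spark source).
// filesToMove maps each staged file path to its final absolute destination path;
// partitionPaths are the dynamic partitions staged under stagingDir, relative to
// the job output directory `path`.
def commitJobSketch(
    fs: FileSystem,
    path: Path,
    stagingDir: Path,
    filesToMove: Map[String, String],
    partitionPaths: Set[String],
    dynamicPartitionOverwrite: Boolean): Unit = {

  if (dynamicPartitionOverwrite) {
    // Block 1: delete the parent directory of every destination in filesToMove.values,
    // clearing out the partitions that are about to be overwritten.
    val absPartitionPaths = filesToMove.values.map(p => new Path(p).getParent).toSet
    absPartitionPaths.foreach(fs.delete(_, true))
  }

  // Block 2: move each staged file to its final absolute location.
  // Note that the boolean returned by rename is ignored.
  for ((src, dst) <- filesToMove) {
    fs.rename(new Path(src), new Path(dst))
  }

  if (dynamicPartitionOverwrite) {
    // Block 3: directory-level renames that move each staged partition directory
    // into its final location under the job output path.
    for (part <- partitionPaths) {
      val finalPartPath = new Path(path, part)
      fs.delete(finalPartPath, true)
      fs.rename(new Path(stagingDir, part), finalPartPath)
    }
  }
}
```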
All renames in Block 2 will fail, since all parent directories of filesToMove.values were just deleted in Block 1. Under a normal HDFS scenario, the contract of fs.rename is to return false in such a failure scenario, as opposed to throwing an exception. There is a separate issue here in that Block 2 should probably be checking for those false return values, but that silent-failure behavior is what allows dynamicPartitionOverwrite to "work", albeit with a bunch of failed renames in the middle. Really, we should only run Block 2 in the dynamicPartitionOverwrite == false case, and consolidate Blocks 1 and 3 to run in the true case.
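To illustrate the return-value point, here is a small hypothetical helper (renameOrThrow is not part of Spark or Hadoop) that surfaces a false return from FileSystem.rename instead of silently ignoring it:

```scala
import java.io.IOException
import org.apache.hadoop.fs.{FileSystem, Path}

object RenameUtil {
  // Hypothetical helper: fail loudly when FileSystem.rename reports failure by
  // returning false, instead of silently dropping the move.
  def renameOrThrow(fs: FileSystem, src: Path, dst: Path): Unit = {
    if (!fs.rename(src, dst)) {
      throw new IOException(s"Failed to rename $src to $dst")
    }
  }
}
```

Used in place of the bare fs.rename calls in Block 2, a check like this would turn the deleted-parent problem into an immediate commit failure rather than a batch of silently ignored renames.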
We discovered this issue while testing against a FileSystem implementation that throws an exception in this failed rename scenario instead of returning false, escalating the silent/ignored rename failures into actual failures.