Unlike HadoopMapReduceCommitProtocol, ManifestFileCommitProtocol does not clean up incomplete output files in either case: when a task is aborted or when the job is aborted.
HadoopMapReduceCommitProtocol writes intermediate files to a staging directory, so when a job is aborted it can simply delete the staging directory to clean up everything. It even makes an extra effort to clean up intermediate files on the task side when a task is aborted.
ManifestFileCommitProtocol does no cleanup at all; it only maintains the metadata listing which complete output files were written. It would be better if ManifestFileCommitProtocol made a best effort to clean up. It is unclear whether it can do job-level cleanup, since it does not use a staging directory, but it can clearly still make a best effort at task-level cleanup.
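
To illustrate what task-level best-effort cleanup could look like, here is a minimal sketch. It is not Spark code: `BestEffortTaskCleanup`, `newTaskFile`, and `abortTask` are hypothetical names, and it uses `java.nio.file` on the local filesystem instead of the Hadoop `FileSystem` API to stay self-contained. The idea is that a task remembers every file it starts writing and, on abort, tries to delete each one without letting a failed delete stop the rest.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Spark): tracks the files a task has
 * started writing so they can be deleted if the task is aborted.
 */
final class BestEffortTaskCleanup {
    // Every path handed out to the task, whether or not the write finished.
    private final List<Path> addedFiles = new ArrayList<>();

    /** Registers and returns the path the task is about to write to. */
    Path newTaskFile(Path dir, String fileName) {
        Path file = dir.resolve(fileName);
        addedFiles.add(file);
        return file;
    }

    /**
     * Best-effort cleanup: tries to delete every tracked file and keeps
     * going when a delete fails. Returns the paths that could not be deleted.
     */
    List<Path> abortTask() {
        List<Path> failed = new ArrayList<>();
        for (Path file : addedFiles) {
            try {
                // No error if the file was never actually created.
                Files.deleteIfExists(file);
            } catch (IOException e) {
                // Best effort only: record the failure and continue.
                failed.add(file);
            }
        }
        addedFiles.clear();
        return failed;
    }
}
```

A task would call `newTaskFile` for each output file and `abortTask` from its abort path. Because nothing here relies on a staging directory, the same approach would not give job-level cleanup: the job side would have to collect the file lists from all tasks, including those that failed before reporting.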