When an MRv2 job container runs in the context of a non-default file system, JobHistoryUtils.java obtains mapreduce.jobhistory.done-dir and
mapreduce.jobhistory.intermediate-done-dir as non-qualified paths (e.g. /mapred/history). Such a path is considered to belong to the current container's context. As a result, the application history is written to another file system and the job history server is unable to pick it up, because it expects to find it on the default file system. Providing fully qualified paths for those parameters is currently not supported either, because of a bug in JobHistoryEventHandler.
After this fix two scenarios will be supported:
- mapreduce.jobhistory.done-dir and mapreduce.jobhistory.intermediate-done-dir (and the staging directory BTW) will support a fully qualified path
- If a non-qualified path is configured, it will always be resolved against the default file system (as configured in core-site.xml). That is how consistency of the history location will be achieved
- FileSystem#makeQualified throws an exception if the specified path belongs to another file system. However, FileContext#makeQualified works properly in this case, and that is the essence of the fix in JobHistoryEventHandler. I did not change the behavior of FileSystem#makeQualified because that requires much more thought. I am afraid that many users rely on the current behavior, and changing it would break their code.
- The fix in JobHistoryUtils detects a non-default namenode configuration only if it comes from some "real" configuration: core-default.xml is ignored. This is done primarily as a kind of test hook, because otherwise any fs.defaultFS value set during test execution would always be recognized by JobHistoryUtils as a non-default namenode when compared against the 'file:///' specified in core-default.xml.
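The two supported scenarios can be sketched with plain java.net.URI. This is only an illustration of the intended rule, not the code of the patch: the class HistoryDirQualifier, the method qualify, and the host names are hypothetical, and no Hadoop classes are used.

```java
import java.net.URI;
import java.net.URISyntaxException;

class HistoryDirQualifier {
    // Hypothetical helper (not part of Hadoop): applies the rule described above.
    // configuredDir stands for the value of mapreduce.jobhistory.done-dir or
    // mapreduce.jobhistory.intermediate-done-dir; defaultFs stands for fs.defaultFS.
    static URI qualify(String configuredDir, URI defaultFs) {
        URI dir = URI.create(configuredDir);
        if (dir.getScheme() != null) {
            // Scenario 1: the path is already fully qualified, keep it unchanged.
            return dir;
        }
        try {
            // Scenario 2: a non-qualified path always goes to the default file system.
            return new URI(defaultFs.getScheme(), defaultFs.getAuthority(),
                           dir.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode:8020"); // assumed example value
        System.out.println(qualify("/mapred/history", defaultFs));
        // prints hdfs://namenode:8020/mapred/history
        System.out.println(qualify("hdfs://other-nn:8020/mapred/history", defaultFs));
        // prints hdfs://other-nn:8020/mapred/history
    }
}
```

With this rule the container and the job history server compute the same location regardless of which file system the container itself runs against.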
(Remark. Note that makeQualified doesn't behave properly with the file:/// file system, for example:
new Path("file:///dir/subdir").makeQualified(new URI("hdfs://server:8020"), new Path("/dir"))
returns "file://server:8020/dir/subdir", which doesn't make sense.
However, I don't believe it is worth fixing, since nobody really cares about the local file system besides tests. My fix just ensures that all tests run smoothly by ignoring the core-default.xml file system in the logic.)
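One way to see how such a result can arise is the sketch below. It assumes, without having been checked against the Hadoop source, that qualification fills in a missing scheme and a missing authority independently from the default URI. The class QualifyRemarkSketch and the method fillFromDefault are hypothetical stand-ins built on java.net.URI, not Hadoop's Path code.

```java
import java.net.URI;
import java.net.URISyntaxException;

class QualifyRemarkSketch {
    // Hypothetical stand-in for the qualification step; NOT Hadoop's Path code.
    // Assumed rule: a missing scheme and a missing authority are each taken
    // from the default URI, independently of one another.
    static URI fillFromDefault(URI path, URI defaultUri) {
        String scheme = path.getScheme() != null
                ? path.getScheme() : defaultUri.getScheme();
        String authority = path.getAuthority() != null
                ? path.getAuthority() : defaultUri.getAuthority();
        try {
            return new URI(scheme, authority, path.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "file:///dir/subdir" has a scheme but no authority, so only the
        // authority is borrowed from the hdfs default URI.
        URI result = fillFromDefault(URI.create("file:///dir/subdir"),
                                     URI.create("hdfs://server:8020"));
        System.out.println(result); // prints file://server:8020/dir/subdir
    }
}
```

Under that assumption the "file" scheme is kept while the namenode's host and port are attached to it, which matches the odd value shown in the remark.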