The Oozie launcher job (for MR/Pig/Hive/Sqoop actions) reads the location of the job token file from the HADOOP_TOKEN_FILE_LOCATION environment variable and sets it as the mapreduce.job.credentials.binary property in the jobconf that is used to launch the real (MR/Pig/Hive/Sqoop) job.
The MR/Pig/Hive/Sqoop submission code (via Hadoop job submission) correctly uses the injected mapreduce.job.credentials.binary property to load the credentials and submit its MR jobs.
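The seeding step described above can be sketched roughly as follows. This is only an illustration, not the Oozie launcher code: the class and method names are hypothetical, and a plain java.util.Map stands in for the Hadoop jobconf so the example runs without Hadoop on the classpath. Only the environment variable name and the property name come from this report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a Map stands in for the Hadoop job configuration.
class LauncherCredentialsSketch {
    static final String TOKEN_ENV = "HADOOP_TOKEN_FILE_LOCATION";
    static final String CREDENTIALS_PROP = "mapreduce.job.credentials.binary";

    // Copies the token file location from the environment into a copy of
    // the configuration that would be used to submit the real job.
    static Map<String, String> seedCredentials(Map<String, String> env,
                                               Map<String, String> jobConf) {
        Map<String, String> seeded = new HashMap<>(jobConf);
        String tokenFile = env.get(TOKEN_ENV);
        if (tokenFile != null) {
            seeded.put(CREDENTIALS_PROP, tokenFile);
        }
        return seeded;
    }

    public static void main(String[] args) {
        Map<String, String> conf = seedCredentials(System.getenv(), new HashMap<>());
        // Prints null for the value when the environment variable is not set.
        System.out.println(CREDENTIALS_PROP + " = " + conf.get(CREDENTIALS_PROP));
    }
}
```

Because the property is written into the configuration of the real job, it travels with that configuration wherever it is later read.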
The problem is that the mapreduce.job.credentials.binary property is also propagated to the tasks of the MR jobs started by the MR/Pig/Hive/Sqoop action.
If the MR/Pig/Hive/Sqoop task code runs any logic that triggers credential loading, the loading fails: because the property is set, it tries to read the job token file of the launcher job, which does not exist in the context of the MR/Pig/Hive/Sqoop jobs.
More specifically, we are seeing this with certain Hive queries that trigger a conditional code path in RowContainer. That path calls FileInputFormat.getSplits(), and the TokenCache then tries to load credentials from a token file that belongs to the wrong job (the launcher).
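The failure mode can be modeled with a small self-contained sketch. It is not the Hadoop TokenCache implementation: the class and method names are hypothetical, a Map again stands in for the task configuration, and the file path is made up. The removal step only models what an unset-style operation on the configuration (as requested in the linked HADOOP-8023) would make possible; it is not a claim about the fix that was adopted.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of credential loading triggered inside a task.
class TaskSideCredentialsSketch {
    static final String CREDENTIALS_PROP = "mapreduce.job.credentials.binary";

    // If the property is set, the referenced file must be readable;
    // if it is not set, nothing is loaded and nothing fails.
    static void loadCredentialsIfConfigured(Map<String, String> conf) throws IOException {
        String tokenFile = conf.get(CREDENTIALS_PROP);
        if (tokenFile == null) {
            return;
        }
        if (!Files.exists(Paths.get(tokenFile))) {
            throw new IOException("Cannot read credentials file: " + tokenFile);
        }
    }

    // Models unsetting the property: returns a copy without it.
    static Map<String, String> withoutCredentialsProperty(Map<String, String> conf) {
        Map<String, String> copy = new HashMap<>(conf);
        copy.remove(CREDENTIALS_PROP);
        return copy;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> taskConf = new HashMap<>();
        // Made-up path standing in for the launcher's token file, absent on the task node.
        taskConf.put(CREDENTIALS_PROP, "/nonexistent/launcher/jobToken");
        try {
            loadCredentialsIfConfigured(taskConf);
        } catch (IOException e) {
            System.out.println("Task-side load failed: " + e.getMessage());
        }
        loadCredentialsIfConfigured(withoutCredentialsProperty(taskConf));
        System.out.println("No failure once the property is removed");
    }
}
```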
- is blocked by
HADOOP-8023 Add unset() method to Configuration
- is broken by
MAPREDUCE-4324 JobClient can perhaps set mapreduce.job.credentials.binary rather than expect its presence?