Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
FileInputFormat.listStatus fetches delegation tokens: https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L213
AFAICT, this is unnecessary: listStatus doesn't delegate those tokens to another process. It is causing the issues described in the linked Spark Kerberos ticket, because TokenCache.obtainTokensForNameNodes, which fetches the delegation tokens, assumes that certain MapReduce configuration variables are set, which isn't true in the Spark calling code. That assumption is a separate problem, but it wouldn't have surfaced if listStatus weren't fetching delegation tokens.
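To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: `ListStatusSketch`, `obtainTokens`, and the `sketch.renewer.principal` key are hypothetical stand-ins, and a plain `Map` plays the role of the job configuration. It only illustrates why a listing that fetches tokens first can fail for a caller that never set the MapReduce-specific configuration, while a listing that skips the fetch would not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListStatusSketch {
    // Hypothetical key standing in for a MapReduce-only setting
    // that the token-fetching step expects to find.
    static final String RENEWER_KEY = "sketch.renewer.principal";

    // Stand-in for the token fetch: fails when the setting is absent.
    static void obtainTokens(Map<String, String> conf, List<String> dirs) {
        if (conf.get(RENEWER_KEY) == null) {
            throw new IllegalStateException("no renewer principal configured");
        }
        // A real implementation would contact each file system here.
    }

    // Behaviour as described in this ticket: listing fetches tokens first.
    static List<String> listStatusWithTokens(Map<String, String> conf, List<String> dirs) {
        obtainTokens(conf, dirs);
        return list(dirs);
    }

    // Proposed behaviour: listing never touches tokens.
    static List<String> listStatusWithoutTokens(Map<String, String> conf, List<String> dirs) {
        return list(dirs);
    }

    // Pretend each input directory holds exactly one file.
    private static List<String> list(List<String> dirs) {
        List<String> files = new ArrayList<>();
        for (String dir : dirs) {
            files.add(dir + "/part-00000");
        }
        return files;
    }

    public static void main(String[] args) {
        Map<String, String> sparkLikeConf = new HashMap<>(); // no MapReduce settings
        List<String> dirs = Arrays.asList("/data/a", "/data/b");

        System.out.println(listStatusWithoutTokens(sparkLikeConf, dirs));
        try {
            listStatusWithTokens(sparkLikeConf, dirs);
        } catch (IllegalStateException e) {
            System.out.println("listing failed: " + e.getMessage());
        }
    }
}
```

In the sketch, the caller with an empty configuration can list files only through the variant that skips the token fetch, which mirrors the situation described for the Spark calling code.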
Issue Links
- relates to: SPARK-20328 HadoopRDDs create a MapReduce JobConf, but are not MapReduce jobs (Resolved)