Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Duplicate
-
None
-
None
-
None
-
None
Description
When distcp is running with webhdfs://, there is no delegation token issued to mapreduce task because mapreduce task does not have kerberos tgt ticket.
This stack trace was thrown when mapreduce task contacts webhdfs:
Error: org.apache.hadoop.security.AccessControlException: Authentication required at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:492) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:136) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:760) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:835) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:663) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:701) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:697) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:1095) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:1106) at org.apache.hadoop.tools.mapred.CopyMapper.setup(CopyMapper.java:124) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
There are two proposals:
1. Have a API to issue delegation token to pass along to webhdfs to maintain backward compatibility.
2. Have mapreduce task login to kerberos then perform webhdfs fetching.
Attachments
Issue Links
- is caused by
-
HADOOP-16314 Make sure all end point URL is covered by the same AuthenticationFilter
- Resolved
-
HADOOP-16354 Enable AuthFilter as default for WebHdfs
- Resolved