  1. Hadoop Common
  2. HADOOP-16095 Support impersonation for AuthenticationFilter
  3. HADOOP-16356

Distcp with webhdfs is not working with ProxyUserAuthenticationFilter or AuthenticationFilter


Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Duplicate

    Description

      When distcp runs against webhdfs://, no delegation token is issued to the MapReduce task, because the task does not have a Kerberos TGT.

      The following exception is thrown when the MapReduce task contacts webhdfs:

      Error: org.apache.hadoop.security.AccessControlException: Authentication required
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:492)
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:136)
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:760)
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:835)
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:663)
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:701)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:697)
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:1095)
      	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:1106)
      	at org.apache.hadoop.tools.mapred.CopyMapper.setup(CopyMapper.java:124)
      	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
      	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
      	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
      

      There are two proposals:

      1. Add an API that issues a delegation token and passes it along to webhdfs, which maintains backward compatibility.
      2. Have the MapReduce task log in to Kerberos and then perform the webhdfs fetch.
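
      For illustration, the first proposal maps onto two WebHDFS REST calls: a Kerberos-authenticated client requests a token with op=GETDELEGATIONTOKEN, and a task without a TGT then authenticates by appending delegation=<token> to its requests. The sketch below only builds those two URLs with the JDK. The class name, host, port 9870, renewer, and path are hypothetical, the path is assumed to be URL-safe already, and this is not the change made for this issue.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: builds the WebHDFS REST URLs involved in proposal 1.
// The class name and all hosts, ports, users, and paths are hypothetical.
final class WebHdfsTokenUrls {

    // URL that a Kerberos-authenticated client (for example at job
    // submission time) would call to obtain a delegation token.
    static String tokenRequestUrl(String host, int port, String renewer) {
        return "http://" + host + ":" + port
            + "/webhdfs/v1/?op=GETDELEGATIONTOKEN&renewer=" + encode(renewer);
    }

    // URL that a task without a TGT could call, authenticating with the
    // delegation token instead of SPNEGO. The path is assumed URL-safe.
    static String fileStatusUrl(String host, int port, String path, String token) {
        return "http://" + host + ":" + port
            + "/webhdfs/v1" + path + "?op=GETFILESTATUS&delegation=" + encode(token);
    }

    // Percent-encodes a query parameter value (spaces become '+').
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenRequestUrl("nn.example.com", 9870, "yarn"));
        System.out.println(fileStatusUrl("nn.example.com", 9870, "/tmp/src", "TOKEN"));
    }
}
```

      GETFILESTATUS is used here because the failing call in the stack trace is WebHdfsFileSystem.getFileStatus; any other WebHDFS operation would carry the delegation parameter in the same way.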

            People

              prabhujoseph Prabhu Joseph
              eyang Eric Yang
              Votes: 0
              Watchers: 4
