Hadoop YARN / YARN-3021

YARN's delegation-token handling prevents certain trust setups from working properly with DistCp

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: security
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed
    • Release Note:
      The ResourceManager renews delegation tokens for applications. This behavior has been changed so that a token is renewed only if its renewer is a non-empty string. MapReduce jobs can instruct the ResourceManager to skip renewal of tokens obtained from certain hosts by listing those hosts in the configuration property mapreduce.job.hdfs-servers.token-renewal.exclude=<host1>,<host2>,...,<hostN>.
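      The renewal rule in the release note can be illustrated with a small, dependency-free sketch. It is not the ResourceManager or MapReduce code: the class name, method names, and host names are hypothetical, and it models only the net effect described above (no renewal when the renewer is empty or the token's host is in the exclude list). How the real implementation combines the two conditions is an assumption here.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical model of the renewal decision; not the actual Hadoop code. */
final class RenewalDecisionSketch {

    /** Parses a comma-separated host list, as given to the exclude property. */
    static Set<String> parseExcludedHosts(String value) {
        Set<String> hosts = new LinkedHashSet<>();
        if (value == null) {
            return hosts;
        }
        for (String part : value.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    /**
     * Returns true only if the token names a non-empty renewer and was not
     * obtained from an excluded host (assumed combination of the two rules).
     */
    static boolean shouldRenew(String renewer, String tokenHost, Set<String> excludedHosts) {
        if (renewer == null || renewer.isEmpty()) {
            return false;
        }
        return !excludedHosts.contains(tokenHost);
    }
}
```

      For example, with an exclude list of "nn1.b.example.com" (a made-up host), a token from that host is skipped, while a token from any other host with a non-empty renewer is still renewed.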

      Description

      Consider a scenario with three realms: A, B and COMMON, where A trusts COMMON and B trusts COMMON (both are one-way trusts), and both A and B run HDFS + YARN clusters.

      Now if one logs in with a COMMON credential and runs a job on A's YARN that needs to access B's HDFS (such as a DistCp), the operation fails in the RM, because the RM attempts a renewDelegationToken(…) call synchronously during application submission (to validate the managed token before adding it to a scheduler for automatic renewal). The call fails because realm B does not trust A's credentials (here, the RM's principal is the renewer).

      The 1.x JobTracker makes the same call, but asynchronously: once a renewal attempt failed, it simply stopped scheduling further renewal attempts rather than failing the job immediately.

      We should change the logic so that the renewal is still attempted but a failure is tolerated: only the scheduling of further renewals is skipped, instead of propagating an error back to the client and failing the app submission. This way the old behaviour is retained.
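      The proposed behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch itself: the class, the Renewer interface, and the token strings are all hypothetical stand-ins, and real tokens and renewal RPCs are replaced by plain strings and a callback.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of lenient renewal at submission; not the actual RM code. */
final class LenientRenewalSketch {

    /** Stand-in for a renewal call; returns the new expiry time or throws. */
    interface Renewer {
        long renew(String token) throws IOException;
    }

    /** Tokens whose first renewal succeeded and that would be renewed automatically. */
    final List<String> scheduledForRenewal = new ArrayList<>();

    /** Tokens whose first renewal failed; they are kept but never scheduled. */
    final List<String> skipped = new ArrayList<>();

    /**
     * Attempts one renewal per token. A failure does not propagate to the
     * caller, so submission proceeds; the token is just left unscheduled.
     */
    void addApplicationTokens(List<String> tokens, Renewer renewer) {
        for (String token : tokens) {
            try {
                renewer.renew(token);
                scheduledForRenewal.add(token);
            } catch (IOException e) {
                // Assumed policy: record the failure and carry on.
                skipped.add(token);
            }
        }
    }
}
```

      In the cross-realm case from the description, the token for B's HDFS would end up in the skipped list while the submission itself still goes through; the job can then use that token until it expires.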

        Attachments

        1. YARN-3021.007.patch
          11 kB
          Yongjun Zhang
        2. YARN-3021.007.patch
          11 kB
          Yongjun Zhang
        3. YARN-3021.007.patch
          11 kB
          Yongjun Zhang
        4. YARN-3021.006.patch
          11 kB
          Yongjun Zhang
        5. YARN-3021.005.patch
          13 kB
          Yongjun Zhang
        6. YARN-3021.004.patch
          9 kB
          Yongjun Zhang
        7. YARN-3021.003.patch
          20 kB
          Yongjun Zhang
        8. YARN-3021.002.patch
          17 kB
          Yongjun Zhang
        9. YARN-3021.001.patch
          13 kB
          Yongjun Zhang
        10. YARN-3021.patch
          4 kB
          Harsh J

              People

              • Assignee: Yongjun Zhang (yzhangal)
              • Reporter: Harsh J (qwertymaniac)
              • Votes: 0
              • Watchers: 16
