  1. Hadoop YARN
  2. YARN-10754

RM delegation token renewer thread should time out, and the retry should also cover newly submitted apps.


Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved

    Description

      As described in YARN-9768:

      Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews HDFS tokens received to check for validity and expiration time.

      This call is made to an underlying HDFS NN or Router node (which exposes the same APIs as the HDFS NN). If one of the nodes is bad and the renew call gets stuck, the thread remains stuck indefinitely. Ideally, the thread should time out the renewToken call and retry from the client's perspective.

      However, the timeout and retry only cover app recovery, not newly submitted apps.

      As a result, a newly submitted app is not retried when the token renewal call (to the HDFS NameNode or Router) times out.
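
The behavior requested above can be illustrated with a minimal, self-contained sketch. This is not the attached patch and not the code in DelegationTokenRenewer.java; the class name `RenewWithTimeoutSketch`, the method `renewWithRetry`, and its parameters are hypothetical names chosen for illustration. It assumes the renew call can be wrapped in a `Callable<Long>` that returns the new expiration time, runs it on a worker thread, gives up waiting after a timeout, and retries a bounded number of times. Only timeouts are retried here; other failures are reported immediately.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: bound a possibly stuck renew call with a timeout and retry it.
public class RenewWithTimeoutSketch {

  // renewCall stands in for the RPC to the NameNode/Router (an assumption of this sketch).
  public static long renewWithRetry(Callable<Long> renewCall, long timeoutMs, int maxAttempts) {
    // Daemon threads, so a worker that never returns cannot keep the JVM alive.
    ExecutorService pool = Executors.newCachedThreadPool(r -> {
      Thread t = new Thread(r, "renew-sketch");
      t.setDaemon(true);
      return t;
    });
    try {
      TimeoutException last = null;
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        Future<Long> future = pool.submit(renewCall);
        try {
          // Wait at most timeoutMs instead of blocking indefinitely.
          return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
          // Best effort: interrupt the worker, then try again.
          future.cancel(true);
          last = e;
        } catch (ExecutionException e) {
          // Non-timeout failures are not retried in this sketch.
          throw new IllegalStateException("renewal failed", e.getCause());
        } catch (InterruptedException e) {
          future.cancel(true);
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while renewing", e);
        }
      }
      throw new IllegalStateException(
          "renewal timed out after " + maxAttempts + " attempts", last);
    } finally {
      pool.shutdownNow();
    }
  }
}
```

In terms of this issue, the point would be that both the recovery path and the new-submission path go through the same timeout-and-retry wrapper, rather than only the recovery path. Note that `cancel(true)` only interrupts the worker; an RPC that ignores interruption may still occupy its thread until it returns.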

      Attachments

        1. YARN-10754.001.patch
          3 kB
          Qi Zhu
        2. image-2021-04-27-11-38-29-162.png
          178 kB
          Qi Zhu


            People

              zhuqi Qi Zhu