  1. Hadoop YARN
  2. YARN-10754

RM delegation token renewal thread should time out, and retry should also cover newly submitted apps.


    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      As described in YARN-9768:

      Delegation token renewer thread in RM (DelegationTokenRenewer.java) renews HDFS tokens received to check for validity and expiration time.

      This call is made to an underlying HDFS NN or Router node (which exposes the same APIs as the HDFS NN). If one of the nodes is bad and the renew call gets stuck, the thread remains stuck indefinitely. Ideally, the thread should time out the renewToken call and retry from the client's perspective.

      However, that change only covers app recovery; it does not cover newly submitted apps.

      As a result, a newly submitted app is not retried when the token renewal call (to the HDFS NameNode or Router) times out.
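
      The timeout-and-retry behavior requested above can be illustrated with a minimal sketch. This is not the code from YARN-10754.001.patch or from DelegationTokenRenewer.java; the class name `RenewWithTimeout`, the method `renewWithRetry`, and the timeout and attempt values are all hypothetical, and the renew call is stood in for by a plain `Callable<Long>` that returns a new expiration time. It assumes the blocking call responds to thread interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

public class RenewWithTimeout {

  /**
   * Hypothetical helper: runs a blocking renew call with a per-attempt
   * timeout and retries when an attempt times out. The same helper would be
   * used for both recovered apps and newly submitted apps.
   */
  static long renewWithRetry(Callable<Long> renewCall, long timeoutMs, int maxAttempts)
      throws Exception {
    // Daemon threads so a stuck call cannot keep the JVM alive.
    ExecutorService pool = Executors.newCachedThreadPool(r -> {
      Thread t = new Thread(r, "renew-sketch");
      t.setDaemon(true);
      return t;
    });
    try {
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        Future<Long> future = pool.submit(renewCall);
        try {
          // Wait at most timeoutMs for this attempt.
          return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
          // Interrupt the stuck attempt, then loop to retry.
          future.cancel(true);
        } catch (ExecutionException e) {
          // The renew call itself failed; surface the cause without retrying.
          throw new Exception("renew failed", e.getCause());
        }
      }
      throw new TimeoutException("renew timed out after " + maxAttempts + " attempts");
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    AtomicInteger calls = new AtomicInteger();
    // Simulated renew call: the first attempt hangs, later attempts succeed.
    Callable<Long> flakyRenew = () -> {
      if (calls.incrementAndGet() == 1) {
        Thread.sleep(5_000);
      }
      return 1_700_000_000_000L; // made-up expiration timestamp
    };
    long expiry = renewWithRetry(flakyRenew, 200, 3);
    System.out.println("attempts=" + calls.get() + " expiry=" + expiry);
  }
}
```

      In this sketch the caller decides how many attempts to allow, so the submission path and the recovery path could share one retry policy rather than only one of them retrying.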

        Attachments

        1. image-2021-04-27-11-38-29-162.png
          178 kB
          Qi Zhu
        2. YARN-10754.001.patch
          3 kB
          Qi Zhu

              People

              • Assignee: Qi Zhu (zhuqi)
              • Reporter: Qi Zhu (zhuqi)
              • Votes: 0
              • Watchers: 1
