Hadoop YARN / YARN-2874

Deadlock in "DelegationTokenRenewer" which blocks the RM from executing any further apps

    Details

    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      This deadlock can occur when token renewal fails and the application finishes.
      jstack dump:

      Found one Java-level deadlock:
      =============================
      "DelegationTokenRenewer #181865":
      waiting to lock monitor 0x0000000000900918 (object 0x00000000c18a9998, a java.util.Collections$SynchronizedSet),
      which is held by "DelayedTokenCanceller"
      "DelayedTokenCanceller":
      waiting to lock monitor 0x0000000004141718 (object 0x00000000c7eae720, a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask),
      which is held by "Timer-4"
      "Timer-4":
      waiting to lock monitor 0x0000000000900918 (object 0x00000000c18a9998, a java.util.Collections$SynchronizedSet),
      which is held by "DelayedTokenCanceller"

      Java stack information for the threads listed above:
      ===================================================
      "DelegationTokenRenewer #181865":
        at java.util.Collections$SynchronizedCollection.add(Collections.java:1636)
        - waiting to lock <0x00000000c18a9998> (a java.util.Collections$SynchronizedSet)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.addTokenToList(DelegationTokenRenewer.java:322)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:398)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$500(DelegationTokenRenewer.java:70)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:657)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:638)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
      "DelayedTokenCanceller":
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.cancel(DelegationTokenRenewer.java:443)
        - waiting to lock <0x00000000c7eae720> (a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeApplicationFromRenewal(DelegationTokenRenewer.java:558)
        - locked <0x00000000c18a9998> (a java.util.Collections$SynchronizedSet)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$300(DelegationTokenRenewer.java:70)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelayedTokenRemovalRunnable.run(DelegationTokenRenewer.java:599)
        at java.lang.Thread.run(Thread.java:745)
      "Timer-4":
        at java.util.Collections$SynchronizedCollection.remove(Collections.java:1639)
        - waiting to lock <0x00000000c18a9998> (a java.util.Collections$SynchronizedSet)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeFailedDelegationToken(DelegationTokenRenewer.java:503)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$100(DelegationTokenRenewer.java:70)
        at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.run(DelegationTokenRenewer.java:437)
        - locked <0x00000000c7eae720> (a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)

      Found 1 deadlock.
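
      The dump above shows a lock-order inversion: "DelayedTokenCanceller" holds the SynchronizedSet monitor and waits for the RenewalTimerTask monitor, while "Timer-4" holds the RenewalTimerTask monitor and waits for the set. The sketch below reproduces that shape with stand-in classes and shows one possible way to break the cycle: snapshot the set under its lock, then cancel the tasks outside it. All names here (LockOrderSketch, Task, removeAppWithSnapshot, and so on) are hypothetical, and this is not the attached patch or the real DelegationTokenRenewer code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {
    // Stand-in for the Collections$SynchronizedSet seen in the dump.
    static final Set<Task> tokens = Collections.synchronizedSet(new HashSet<>());

    // Stand-in for RenewalTimerTask: run() and cancel() share the task's monitor.
    static class Task {
        private boolean cancelled;

        synchronized void run() {
            // Simulated renewal failure: drop this token from the shared set.
            // The set monitor is taken while the task monitor is held.
            if (!cancelled) {
                tokens.remove(this);
            }
        }

        synchronized void cancel() {
            cancelled = true;
        }
    }

    // Deadlock-prone shape from the dump (shown for contrast, never called):
    // the set monitor is held while cancel() waits for the task monitor.
    static void removeAppHoldingSetLock() {
        synchronized (tokens) {
            for (Task t : tokens) {
                t.cancel();
            }
            tokens.clear();
        }
    }

    // One possible alternative: copy and clear under the set lock, cancel outside it.
    static void removeAppWithSnapshot() {
        List<Task> snapshot;
        synchronized (tokens) {
            snapshot = new ArrayList<>(tokens);
            tokens.clear();
        }
        for (Task t : snapshot) {
            t.cancel(); // only the task monitor is taken here
        }
    }

    // Races a "timer" thread against a "canceller" thread.
    // Returns true if both threads finish within the timeout.
    static boolean runScenario() {
        tokens.clear();
        List<Task> tasks = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            Task t = new Task();
            tasks.add(t);
            tokens.add(t);
        }
        CountDownLatch done = new CountDownLatch(2);
        Thread timer = new Thread(() -> {
            for (Task t : tasks) {
                t.run();
            }
            done.countDown();
        });
        Thread canceller = new Thread(() -> {
            removeAppWithSnapshot();
            done.countDown();
        });
        timer.setDaemon(true);
        canceller.setDaemon(true);
        timer.start();
        canceller.start();
        try {
            return done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + runScenario());
    }
}
```

      With the snapshot approach, a task may still run once between the snapshot and its cancel() call. In this sketch that is harmless, because removing from an already-cleared set is a no-op. Whether the same holds for the real renewer would need to be checked against its code.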

        Attachments

        1. YARN-2874.20141118-1.patch
          2 kB
          Naganarasimha G R
        2. YARN-2874.20141118-2.patch
          2 kB
          Naganarasimha G R

              People

              • Assignee: Naganarasimha G R
              • Reporter: Naganarasimha G R
              • Votes: 0
              • Watchers: 14
