  1. Hadoop Common
  2. HADOOP-8770

NN should not RPC to self to find trash defaults (causes deadlock)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0.2-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: trash
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When transitioning a standby NameNode (SBN) to active, I ran into the following situation:

      • The TrashPolicy is first loaded by an IPC Server handler thread. Its initialize function then tries to make an RPC to the same node to fetch the trash defaults.
      • This happens while the NN write lock is held (since it is part of active initialization), so all of the other handler threads are already blocked waiting to get the NN lock.
      • Since no handler threads are free, the RPC blocks forever and the NN never enters the active state.

      We need a general policy that the NN never makes RPCs to itself for any reason, because of the potential for deadlocks like this.
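
      The situation described above can be reproduced outside Hadoop with a small sketch. This is not the Hadoop code and not the attached patch: the class name, method name, pool size, and timeout are all hypothetical. A fixed thread pool stands in for the IPC handler threads, a ReentrantReadWriteLock stands in for the NN lock, and a timeout replaces the real indefinite hang so that the program terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; none of these names exist in Hadoop.
public class SelfRpcDeadlockSketch {

    // Returns true if the "trash defaults" lookup finished within timeoutMs.
    // callSelf=true models an RPC to the same server (needs a free handler);
    // callSelf=false models reading the defaults locally in the current thread.
    static boolean completes(int handlerCount, boolean callSelf, long timeoutMs)
            throws Exception {
        ExecutorService handlers = Executors.newFixedThreadPool(handlerCount);
        ReentrantReadWriteLock nnLock = new ReentrantReadWriteLock();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch othersSubmitted = new CountDownLatch(1);
        try {
            // Handler 1: the transition to active, run under the write lock.
            Future<Boolean> transition = handlers.submit(() -> {
                nnLock.writeLock().lock();
                try {
                    lockHeld.countDown();
                    othersSubmitted.await();
                    if (!callSelf) {
                        return true; // local lookup: no handler thread needed
                    }
                    // "RPC to self": the request is queued on the same pool.
                    Future<String> defaults = handlers.submit(() -> "trash defaults");
                    try {
                        defaults.get(timeoutMs, TimeUnit.MILLISECONDS);
                        return true;
                    } catch (TimeoutException e) {
                        defaults.cancel(true);
                        return false; // no free handler ever picked it up
                    }
                } finally {
                    nnLock.writeLock().unlock();
                }
            });

            // Remaining handlers: ordinary requests that block on the lock.
            lockHeld.await();
            for (int i = 0; i < handlerCount - 1; i++) {
                handlers.submit(() -> {
                    nnLock.readLock().lock();
                    nnLock.readLock().unlock();
                });
            }
            othersSubmitted.countDown();
            return transition.get();
        } finally {
            handlers.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("self-RPC completed: " + completes(4, true, 500));
        System.out.println("local lookup completed: " + completes(4, false, 500));
    }
}
```

      In this sketch the self-call never runs because every other pool thread is parked on the lock held by the caller, while the local lookup finishes immediately. The real NN has no such timeout, which is why the transition hangs.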

        Attachments

        1. hdfs-3876.txt
          11 kB
          Eli Collins
        2. hdfs-3876.txt
          11 kB
          Eli Collins
        3. hdfs-3876.txt
          10 kB
          Eli Collins
        4. hdfs-3876.txt
          3 kB
          Eli Collins

              People

              • Assignee: Eli Collins (eli)
              • Reporter: Todd Lipcon (tlipcon)
              • Votes: 0
              • Watchers: 7
