Hadoop Common / HADOOP-8770

NN should not RPC to self to find trash defaults (causes deadlock)

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0.2-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: trash
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      While transitioning a standby NameNode (SBN) to active, I ran into the following deadlock:

      • The TrashPolicy is first loaded by an IPC Server handler thread. Its initialize function then tries to make an RPC to the same node to fetch the defaults.
      • This happens while the NN write lock is held (it is part of active initialization), so all of the other handler threads are already blocked waiting to get the NN lock.
      • Since no handler threads are free, the RPC blocks forever and the NN never enters active state.
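
      The sequence above can be reproduced in miniature. The following is only a sketch under stated assumptions, not NameNode code: a fixed thread pool stands in for the IPC handler threads, a ReentrantReadWriteLock stands in for the NN lock, and the names SelfRpcDeadlockDemo, selfCallCompletes, handlerCount and blockedHandlers are hypothetical. The "RPC to self" is modeled as a task submitted to the same pool while the write lock is held, with a timeout so the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SelfRpcDeadlockDemo {

    /**
     * Simulates a handler that holds the write lock and then calls back into
     * its own handler pool. Returns true if the self-call completes within the
     * timeout, false if it stalls because no handler thread is free.
     */
    static boolean selfCallCompletes(int handlerCount, int blockedHandlers, long timeoutMillis)
            throws Exception {
        ExecutorService handlers = Executors.newFixedThreadPool(handlerCount);
        ReentrantReadWriteLock nnLock = new ReentrantReadWriteLock();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch othersStarted = new CountDownLatch(blockedHandlers);
        try {
            // Handler 1: the "transition to active" path, run under the write lock.
            Future<Boolean> transition = handlers.submit(() -> {
                nnLock.writeLock().lock();
                try {
                    lockHeld.countDown();
                    othersStarted.await(); // the other handlers now occupy pool threads
                    // The "RPC to self": it needs a free handler thread to be served.
                    Future<String> selfRpc = handlers.submit(() -> "server defaults");
                    try {
                        selfRpc.get(timeoutMillis, TimeUnit.MILLISECONDS);
                        return true;
                    } catch (TimeoutException e) {
                        selfRpc.cancel(true);
                        return false; // in the real NN there is no timeout: it hangs
                    }
                } finally {
                    nnLock.writeLock().unlock();
                }
            });

            lockHeld.await();
            // Other handlers: ordinary requests that block waiting for the lock.
            for (int i = 0; i < blockedHandlers; i++) {
                handlers.submit(() -> {
                    othersStarted.countDown();
                    nnLock.writeLock().lock();
                    nnLock.writeLock().unlock();
                });
            }
            return transition.get();
        } finally {
            handlers.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Every handler besides the lock holder is blocked: the self-call starves.
        System.out.println("all handlers busy: " + selfCallCompletes(3, 2, 200));
        // One handler is left free: the self-call can be served.
        System.out.println("one handler free:  " + selfCallCompletes(3, 1, 2000));
    }
}
```

      With every other handler parked on the lock, the self-call should time out; leaving one handler free lets it complete. This suggests the hang depends on load at the moment of the transition, which makes it easy to miss in testing.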

      We need a general policy that the NN never makes RPCs to itself for any reason, because of the potential for deadlocks like this one.
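
      One way such a policy could be expressed in code is to have the caller supply the defaults, so that a server-side caller never needs the network path. The sketch below is illustrative only and is not taken from the attached patches: TrashDefaultsSketch, SketchTrashPolicy, forClient, forServer and fetchIntervalOverRpc are hypothetical names, and the RPC is replaced by a stub that merely counts calls.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

public class TrashDefaultsSketch {

    /** Hypothetical policy object: it reads its interval from whatever source it is given. */
    static final class SketchTrashPolicy {
        private final long intervalMinutes;

        SketchTrashPolicy(LongSupplier intervalSource) {
            // The policy itself never decides to make an RPC; the caller chooses the source.
            this.intervalMinutes = intervalSource.getAsLong();
        }

        long intervalMinutes() {
            return intervalMinutes;
        }
    }

    /** Counts simulated RPCs so the difference between the two paths is observable. */
    static final AtomicInteger rpcCalls = new AtomicInteger();

    /** Stand-in for a remote "get server defaults" call; the value 60 is arbitrary. */
    static long fetchIntervalOverRpc() {
        rpcCalls.incrementAndGet();
        return 60L;
    }

    /** Client-side construction: asking the server over RPC is fine here. */
    static SketchTrashPolicy forClient() {
        return new SketchTrashPolicy(TrashDefaultsSketch::fetchIntervalOverRpc);
    }

    /** Server-side construction: use the locally configured value, no RPC. */
    static SketchTrashPolicy forServer(long localConfiguredInterval) {
        return new SketchTrashPolicy(() -> localConfiguredInterval);
    }

    public static void main(String[] args) {
        SketchTrashPolicy server = forServer(30L);
        System.out.println("server interval: " + server.intervalMinutes()
                + ", RPCs so far: " + rpcCalls.get());
        SketchTrashPolicy client = forClient();
        System.out.println("client interval: " + client.intervalMinutes()
                + ", RPCs so far: " + rpcCalls.get());
    }
}
```

      The design choice here is that the code running inside the server picks the local source explicitly, so initialization under the lock cannot end up waiting on the server's own handler threads.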

      Attachments

        1. hdfs-3876.txt
          11 kB
          Eli Collins
        2. hdfs-3876.txt
          11 kB
          Eli Collins
        3. hdfs-3876.txt
          10 kB
          Eli Collins
        4. hdfs-3876.txt
          3 kB
          Eli Collins

            People

              Assignee: Eli Collins
              Reporter: Todd Lipcon