  1. Hadoop HDFS
  2. HDFS-9137

Deadlock between DataNode#refreshVolumes and BPOfferService#registrationSucceeded

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.1, 3.0.0-alpha1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: datanode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      The code flows of DataNode#refreshVolumes and BPOfferService#registrationSucceeded can deadlock against each other.
      In practice this should be rare, since the user would have to call refreshVolumes at the same time the DN is registering with the NN, but it looks like the issue can happen.

      Reason for the deadlock:

      1) refreshVolumes is called while holding the DN lock, and at the end it also triggers a block report. In the block report call, BPServiceActor#triggerBlockReport calls toString on bpos, which takes the readLock on bpos.
      Lock order: DN lock, then bpos lock.

      2) BPOfferService#registrationSucceeded takes the writeLock on bpos and calls dn.bpRegistrationSucceeded, which is again a synchronized call on the DN.
      Lock order: bpos lock, then DN lock.
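
      The two lock orders above can be reproduced in isolation. The following is only a minimal sketch, not the Hadoop code: the class and method names are hypothetical, a ReentrantLock stands in for the DataNode monitor (an assumption, so that tryLock with a timeout can be used instead of hanging), and a ReentrantReadWriteLock stands in for the bpos lock. Each thread takes its first lock, waits until the other thread holds its own first lock, and then tries the second one.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone sketch; none of these names exist in Hadoop.
final class LockInversionDemo {

    // Takes `first`, waits until the other thread also holds its first lock,
    // then tries `second` with a timeout. Returns true if `second` was acquired.
    static boolean holdThenTry(Lock first, Lock second,
                               CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await(); // keep holding `first` until the other side has tried too
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    // Returns true when neither path could get its second lock.
    static boolean bothPathsBlock() {
        Lock dnLock = new ReentrantLock(); // stand-in for the synchronized DN monitor
        ReentrantReadWriteLock bposLock = new ReentrantReadWriteLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicBoolean refreshGotBpos = new AtomicBoolean(true);
        AtomicBoolean registerGotDn = new AtomicBoolean(true);

        // Path 1: DN lock, then bpos read lock (refreshVolumes -> triggerBlockReport).
        Thread refresh = new Thread(() -> refreshGotBpos.set(
            holdThenTry(dnLock, bposLock.readLock(), bothHoldFirst, bothTried)));
        // Path 2: bpos write lock, then DN lock (registrationSucceeded).
        Thread register = new Thread(() -> registerGotDn.set(
            holdThenTry(bposLock.writeLock(), dnLock, bothHoldFirst, bothTried)));
        refresh.start();
        register.start();
        try {
            refresh.join();
            register.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !refreshGotBpos.get() && !registerGotDn.get();
    }

    public static void main(String[] args) {
        System.out.println("both paths blocked: " + bothPathsBlock());
    }
}
```

      With plain blocking acquisition instead of the timed tryLock, the same interleaving would leave both threads waiting on each other forever.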

      So the two paths take the same two locks in opposite order, which can clearly create a deadlock.
      I think a simple fix would be to move the triggerBlockReport call outside of the DN lock; I feel that call does not really need to run inside the DN lock.
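
      The shape of that fix could look roughly like the sketch below. This is an illustration under assumptions, not the attached patch: the class, fields, and counters are hypothetical stand-ins, and the real DataNode and BPOfferService methods do much more. The point is only that the refresh path releases the DN lock before it touches bpos, so it never holds both locks at once.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone sketch; none of these names exist in Hadoop.
final class LockOrderSketch {
    private final Object dnLock = new Object(); // stand-in for the DN monitor
    private final ReentrantReadWriteLock bposLock = new ReentrantReadWriteLock();
    final AtomicInteger reportsTriggered = new AtomicInteger();
    final AtomicInteger registrations = new AtomicInteger();

    // Refresh path after the proposed change: volume work under the DN lock,
    // block report trigger only after the DN lock has been released.
    void refreshVolumesFixed() {
        synchronized (dnLock) {
            // volume add/remove work would happen here
        }
        triggerBlockReport();
    }

    // Takes the bpos read lock, as the toString call on bpos does in the report.
    void triggerBlockReport() {
        bposLock.readLock().lock();
        try {
            reportsTriggered.incrementAndGet();
        } finally {
            bposLock.readLock().unlock();
        }
    }

    // Registration path, unchanged: bpos write lock, then DN lock.
    void registrationSucceeded() {
        bposLock.writeLock().lock();
        try {
            synchronized (dnLock) {
                registrations.incrementAndGet();
            }
        } finally {
            bposLock.writeLock().unlock();
        }
    }

    // Runs both paths concurrently; returns true if both threads finish in time.
    static boolean runBothPaths(int iterations) {
        LockOrderSketch s = new LockOrderSketch();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.refreshVolumesFixed();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.registrationSucceeded();
        });
        a.setDaemon(true); // do not keep the JVM alive if something does hang
        b.setDaemon(true);
        a.start();
        b.start();
        try {
            a.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !a.isAlive() && !b.isAlive()
            && s.reportsTriggered.get() == iterations
            && s.registrations.get() == iterations;
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + runBothPaths(10_000));
    }
}
```

      Because only the registration path ever holds both locks, no cycle between the two locks is possible in this sketch.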

      Thoughts?

        Attachments

        1. HDFSS-9137.02.patch
          2 kB
          Uma Maheswara Rao G
        2. HDFS-9137.01-WithPreservingRootExceptions.patch
          2 kB
          Uma Maheswara Rao G
        3. HDFS-9137.00.patch
          2 kB
          Uma Maheswara Rao G

              People

              • Assignee:
                Uma Maheswara Rao G (umamaheswararao)
              • Reporter:
                Uma Maheswara Rao G (umamaheswararao)
              • Votes:
                0
              • Watchers:
                13
