Hadoop HDFS / HDFS-11352

Potential deadlock in NN when failing over

Details

    Description

      HDFS-11180 fixed a general class of deadlock between MetricsSystemImpl and FSEditLog that can occur during failover (see the comments on that JIRA for more details). In trunk and branch-2/branch-2.8 the fix works because the metrics calls no longer synchronize on FSEditLog.

      In branch-2.6 and branch-2.7 there is one more method, FSNamesystem#getTransactionsSinceLastCheckpoint, which still requires the lock on FSEditLog and thus can result in the same deadlock scenario. This can be seen by running TestFSNamesystemMBean#testWithFSEditLogLock with the patch in HDFS-11290 on either of these branches (it fails currently).
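
      The following is a minimal sketch of the pattern described above, not the actual Hadoop code or the attached patch. SketchEditLog is a hypothetical stand-in for FSEditLog, and its fields and methods are assumptions made for illustration. It shows a metrics getter that reads its counters without taking the edit log monitor, so a metrics thread can still complete while another thread (for example, one performing a failover) holds that monitor. If the getter were declared synchronized, the read below would time out instead.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class EditLogMetricsSketch {

    /** Hypothetical stand-in for FSEditLog; not the real class. */
    static class SketchEditLog {
        private final AtomicLong lastWrittenTxId = new AtomicLong();
        private final AtomicLong lastCheckpointTxId = new AtomicLong();

        // Writers still serialize on the edit log monitor.
        synchronized void logEdit() {
            lastWrittenTxId.incrementAndGet();
        }

        synchronized void checkpoint() {
            lastCheckpointTxId.set(lastWrittenTxId.get());
        }

        // Metrics read: deliberately NOT synchronized. The two reads are not
        // taken atomically together, which is assumed acceptable for a metric.
        long getTransactionsSinceLastCheckpoint() {
            return lastWrittenTxId.get() - lastCheckpointTxId.get();
        }
    }

    /**
     * Holds the edit log monitor on the calling thread while a second
     * thread reads the metric, then returns the value that thread saw.
     */
    static long readMetricWhileLockHeld() {
        SketchEditLog log = new SketchEditLog();
        log.logEdit();
        log.logEdit();
        log.checkpoint();
        log.logEdit();

        ExecutorService metricsThread = Executors.newSingleThreadExecutor();
        try {
            Callable<Long> read = log::getTransactionsSinceLastCheckpoint;
            synchronized (log) {
                // A synchronized getter would block here until the timeout.
                Future<Long> result = metricsThread.submit(read);
                return result.get(5, TimeUnit.SECONDS);
            }
        } catch (Exception e) {
            throw new RuntimeException("metric read did not complete", e);
        } finally {
            metricsThread.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // One edit was logged after the checkpoint, so this prints 1.
        System.out.println(readMetricWhileLockHeld());
    }
}
```

      The harness holds the lock and reads the metric from another thread, which is the same general idea as a test that queries metrics while the FSEditLog lock is held. A bounded wait on the Future turns a potential hang into a visible failure.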

      Attachments

        1. HDFS-11352-branch-2.7.000.patch
          0.8 kB
          Erik Krogen

            People

              xkrogen Erik Krogen
              Votes: 0
              Watchers: 8
