Hadoop HDFS / HDFS-11352

Potential deadlock in NN when failing over

    Details

      Description

      HDFS-11180 fixed a general class of deadlock between MetricsSystemImpl and FSEditLog that can occur during failover (see the comments on that JIRA for more details). In trunk, branch-2, and branch-2.8 the fix works because the metrics calls no longer synchronize on FSEditLog.

      In branch-2.6 and branch-2.7 there is one more method, FSNamesystem#getTransactionsSinceLastCheckpoint, which still requires the lock on FSEditLog and can therefore produce the same deadlock. To reproduce it, apply the patch from HDFS-11290 on either branch and run TestFSNamesystemMBean#testWithFSEditLogLock, which currently fails there.
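
      The difference between a metrics getter that needs the edit log lock and one that does not can be sketched in isolation. This is a minimal illustration, not the Hadoop code: SketchEditLog, SketchMetrics, and all of their methods are hypothetical stand-ins, and publishing the transaction ID through an AtomicLong is an assumption about one way to make the read lock-free, not a description of the actual patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for an edit log; NOT the real FSEditLog. */
class SketchEditLog {
    private long txId = 0;                                   // guarded by the monitor
    private final AtomicLong publishedTxId = new AtomicLong(); // readable without it

    /** Writers still take the monitor, as a failover path might. */
    synchronized void logEdit() {
        txId++;
        publishedTxId.set(txId);
    }

    /** Getter that needs the monitor: blocks while another thread holds it. */
    synchronized long getTxIdLocked() {
        return txId;
    }

    /** Getter that never takes the monitor. */
    long getTxIdLockFree() {
        return publishedTxId.get();
    }
}

/** Hypothetical metrics-side helper; NOT the real FSNamesystem. */
class SketchMetrics {
    /** Metric computed from the lock-free getter only. */
    static long transactionsSinceLastCheckpoint(SketchEditLog log, long checkpointTxId) {
        return log.getTxIdLockFree() - checkpointTxId;
    }

    /**
     * Holds the log's monitor on one thread and reads the metric on another.
     * Returns true if the read finished while the monitor was still held.
     */
    static boolean readableWhileLocked(SketchEditLog log) {
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (log) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        Thread reader = new Thread(() -> transactionsSinceLastCheckpoint(log, 0));
        try {
            holder.start();
            held.await();              // monitor is now held by the other thread
            reader.start();
            reader.join(2000);         // a getter needing the monitor would time out here
            return !reader.isAlive();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();       // always let the holder thread exit
        }
    }
}
```

      Calling SketchMetrics.readableWhileLocked(new SketchEditLog()) exercises the lock-free path. If the metric were computed from getTxIdLocked() instead, the reader thread would stay blocked for as long as the monitor was held, which is the hazard the description attributes to getTransactionsSinceLastCheckpoint on branch-2.6 and branch-2.7.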

              People

              • Assignee:
                xkrogen Erik Krogen
                Reporter:
                xkrogen Erik Krogen
              • Votes:
                0
                Watchers:
                8