Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:
      None

      Description

      FSEditLog.rollEditLog locks the log, then the FSImage. FSImage.close locks the image, then the log.
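The inversion described above can be reproduced in miniature. This is only a sketch: `LockOrderSketch`, `contend`, and the two `ReentrantLock` objects are hypothetical stand-ins, not the real FSEditLog or FSImage code, and `ReentrantLock.tryLock` with a timeout is used in place of `synchronized` so the example reports the conflict instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the two monitors named in the report.
final class LockOrderSketch {

    /**
     * Starts two threads. Each locks its first lock, waits briefly for the
     * other thread to do the same, then tries its second lock with a timeout.
     * Returns the number of threads that could not get their second lock.
     */
    static int contend(ReentrantLock[] orderA, ReentrantLock[] orderB) {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        AtomicInteger failures = new AtomicInteger();
        Thread a = new Thread(() -> acquire(orderA, bothHoldFirst, failures));
        Thread b = new Thread(() -> acquire(orderB, bothHoldFirst, failures));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return failures.get();
    }

    private static void acquire(ReentrantLock[] order, CountDownLatch bothHoldFirst,
                                AtomicInteger failures) {
        order[0].lock();
        try {
            bothHoldFirst.countDown();
            // Times out when the other thread is queued on the same first lock.
            bothHoldFirst.await(500, TimeUnit.MILLISECONDS);
            if (order[1].tryLock(200, TimeUnit.MILLISECONDS)) {
                order[1].unlock();
            } else {
                // With a plain blocking lock this thread would wait forever.
                failures.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            order[0].unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantLock log = new ReentrantLock();
        ReentrantLock image = new ReentrantLock();
        // One thread goes log -> image, the other image -> log (the reported pattern).
        int inverted = contend(new ReentrantLock[] {log, image},
                               new ReentrantLock[] {image, log});
        // Both threads go log -> image (one possible fix: a single global order).
        int consistent = contend(new ReentrantLock[] {log, image},
                                 new ReentrantLock[] {log, image});
        System.out.println("inverted order, blocked threads: " + inverted);
        System.out.println("consistent order, blocked threads: " + consistent);
    }
}
```

With the inverted order at least one thread fails to take its second lock; with a single agreed order neither does. Which order the actual fix chose is not stated in this report.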

      1. cycle.png
        33 kB
        Todd Lipcon

        Activity

        cos Konstantin Boudnik added a comment -

        Maybe it is a good time to start looking into JSure more closely?

        tlipcon Todd Lipcon added a comment -

        Yeah, from what I read on the wiki page you wrote, JSure sounds like it would help with this. I found these deadlocks using jcarder, which detects lock order inversions at runtime with a bit of overhead. I had to make a few changes to jcarder (http://github.com/toddlipcon/jcarder/tree/cloudera), but at this point it is very effective for this and doesn't require any extra development effort. Once those changes are incorporated into a jcarder release, I hope to get it integrated into the Hudson test-patch.
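The general idea behind runtime lock-order checking can be sketched as a graph: record an edge whenever a thread takes one lock while holding another, then look for a cycle. The code below is an illustration of that idea only, not jcarder's implementation or API; `LockOrderGraph`, `record`, and `hasCycle` are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical lock-order graph; lock names are plain strings for simplicity.
final class LockOrderGraph {
    // Edge held -> acquired: some thread took "acquired" while holding "held".
    private final Map<String, Set<String>> edges = new HashMap<>();

    void record(String held, String acquired) {
        edges.computeIfAbsent(held, k -> new HashSet<>()).add(acquired);
    }

    /** Returns true if the recorded acquisition orders contain a cycle. */
    boolean hasCycle() {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : edges.keySet()) {
            if (visit(node, onPath, visited)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; reaching a node already on the current path is a cycle.
    private boolean visit(String node, Set<String> onPath, Set<String> visited) {
        if (onPath.contains(node)) {
            return true;
        }
        if (!visited.add(node)) {
            return false; // already fully explored from an earlier start node
        }
        onPath.add(node);
        for (String next : edges.getOrDefault(node, Collections.<String>emptySet())) {
            if (visit(next, onPath, visited)) {
                return true;
            }
        }
        onPath.remove(node);
        return false;
    }

    public static void main(String[] args) {
        LockOrderGraph graph = new LockOrderGraph();
        graph.record("FSEditLog", "FSImage"); // order reported for rollEditLog
        graph.record("FSImage", "FSEditLog"); // order reported for FSImage.close
        System.out.println("cycle detected: " + graph.hasCycle());
    }
}
```

A check of this kind can flag an inversion even when the test run never actually deadlocks, because it only needs both orders to have been observed at some point.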

        slukog Gokul added a comment -


        Since the 0.20 versions don't have the attemptRestoreRemovedStorage() method in FSImage, I think there is no possibility of this deadlock in 0.20.1 or 0.20.2. Is that right?


          People

          • Assignee:
            Unassigned
            Reporter:
            tlipcon Todd Lipcon
          • Votes:
            0
            Watchers:
            11
