Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2.2
    • Component/s: clustering
    • Labels:
      None

      Description

      A deadlock can occur when two cluster nodes concurrently register or unregister node types.

      Reason:

      NodeTypeRegistry.registerNodeTypes is synchronized, and calls eventChannel.registered(ntDefs), which calls AbstractJournal.lockAndSync(), which tries to lock AbstractJournal.rwLock.

      On the other hand, AbstractJournal.sync() locks AbstractJournal.rwLock, then calls NodeTypeRecord.process, which calls NodeTypeRegistry.unregisterNodeTypes, which is also synchronized.
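
      The two call paths above take the same two locks in opposite order. A minimal sketch of that inversion follows; it is not Jackrabbit code. The class LockOrderInversionSketch and the two ReentrantLock fields are hypothetical stand-ins for the NodeTypeRegistry monitor and AbstractJournal.rwLock, and a timed tryLock is used so the sketch reports the cycle instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderInversionSketch {
    // Hypothetical stand-in for the NodeTypeRegistry monitor (the real code uses "synchronized").
    static final ReentrantLock registryMonitor = new ReentrantLock();
    // Hypothetical stand-in for AbstractJournal.rwLock.
    static final ReentrantLock journalLock = new ReentrantLock();

    /** Holds "first", then tries "second"; returns whether "second" was obtained. */
    static boolean lockInOrder(ReentrantLock first, ReentrantLock second,
                               CountDownLatch bothHoldFirst, CountDownLatch bothTried)
            throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();   // both threads now hold their first lock
            // A timed tryLock reports the cycle instead of blocking forever.
            boolean gotSecond = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();       // keep "first" until both attempts are over
            return gotSecond;
        } finally {
            first.unlock();
        }
    }

    /** Returns {registerPathGotJournalLock, syncPathGotRegistryMonitor}. */
    public static boolean[] run() throws InterruptedException {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];
        // Path 1: registerNodeTypes -> eventChannel.registered -> lockAndSync
        Thread register = new Thread(() -> {
            try {
                acquired[0] = lockInOrder(registryMonitor, journalLock, bothHoldFirst, bothTried);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Path 2: sync -> NodeTypeRecord.process -> unregisterNodeTypes
        Thread sync = new Thread(() -> {
            try {
                acquired[1] = lockInOrder(journalLock, registryMonitor, bothHoldFirst, bothTried);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        register.start();
        sync.start();
        register.join();
        sync.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println("register path got journal lock: " + r[0]);
        System.out.println("sync path got registry monitor: " + r[1]);
    }
}
```

      Neither thread obtains its second lock, because each one is still holding the lock the other is waiting for. With plain blocking acquisition, as in the real code paths, both threads would wait indefinitely.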

      Possible solutions: Either

      • NodeTypeRegistry doesn't synchronize on the object when calling an eventChannel method,
      • or NodeTypeRegistry locks AbstractJournal.rwLock before synchronizing.

      There might be other solutions.
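
      The first option can be sketched as follows. This is an illustration under assumptions, not the attached patch: RegistrySketch, its EventChannel interface, and the snapshot method are hypothetical names, and node types are reduced to plain strings. The idea is to update the registry state while holding the monitor, release it, and only then call the channel.

```java
import java.util.ArrayList;
import java.util.List;

public class RegistrySketch {
    /** Hypothetical stand-in for the cluster event channel. */
    public interface EventChannel {
        void registered(List<String> names);
    }

    private final List<String> registered = new ArrayList<>();
    private final EventChannel channel;

    public RegistrySketch(EventChannel channel) {
        this.channel = channel;
    }

    public void registerNodeTypes(List<String> names) {
        synchronized (this) {
            // Only the in-memory state change happens under the monitor.
            registered.addAll(names);
        }
        // The monitor is released here, so the channel can take the journal
        // lock without this thread still holding the registry monitor.
        channel.registered(new ArrayList<>(names));
    }

    /** Called from the journal sync path, which already holds the journal lock. */
    public synchronized void unregisterNodeTypes(List<String> names) {
        registered.removeAll(names);
    }

    public synchronized List<String> snapshot() {
        return new ArrayList<>(registered);
    }
}
```

      With this ordering, no thread waits for the journal lock while holding the registry monitor, which removes one edge of the cycle. The trade-off is that the state change and the notification are no longer atomic with respect to other registry callers.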

      1. jcr-2866-a.patch
        13 kB
        Thomas Mueller

          Activity

          Thomas Mueller added a comment -

          This patch excludes the eventChannel methods from the synchronized block.

          Thomas Mueller added a comment -

          This problem should be fixed now.

          Sergiy Shyrkov added a comment -

          Hello Thomas,

          Does this patch also address the issue reported in JCR-2623, i.e. is it the same problem?

          Thank you in advance!

          Kind regards
          Sergiy

          Thomas Mueller added a comment -

          Yes, I think it's the same issue. I resolved it as a duplicate now.

          I found and fixed the issue independently... I should have searched before filing a new bug, sorry.

          Sergiy Shyrkov added a comment -

          This is good news, thank you!
          Are there any plans to backport the patch to the 1.5 or 1.6 branch (we are also experiencing it in our projects with Jackrabbit 1.5.0)? Or has it already been done?

          Thomas Mueller added a comment -

          I don't have any plans to backport the fix.

          But I wonder why you still use 1.5 / 1.6. Why don't you upgrade?

          Sergiy Shyrkov added a comment -

          > I don't have any plans to backport the fix.
          Good, I just wanted to know. Thank you!

          > But I wonder why you still use 1.5 / 1.6. Why don't you upgrade?
          We are using the latest 2.2.1 in our current project version (Jahia 6.5), but we have a previous version of the product (Jahia 6.1.1) that is in production and was using 1.5.0 at that time.

          We will see: if it becomes too critical for our customers (so far, we have had two incidents with a deadlock on cluster startup), we will apply a patch on our own.

          Thank you for the clarifications!

          John Langley added a comment -

          Another similar question about applicability to different branches.
          Will this fix be applied to the 2.1 branch? We're running 2.1.3 currently.

          Thanks in advance for clarifications.


            People

            • Assignee:
              Thomas Mueller
              Reporter:
              Thomas Mueller
            • Votes:
              0
              Watchers:
              1
