Uploaded image for project: 'Jackrabbit Content Repository'
  1. Jackrabbit Content Repository
  2. JCR-3457

Possible deadlock during external update

Agile BoardAttach filesAttach ScreenshotAdd voteVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.2.13, 2.4.3
    • None
    • None
    • None

    Description

      During an external update ItemStateListeners are called while a write lock is held. See SharedItemStateManager#externalUpdate:1548 > SharedItemStateManager#doExternalUpdate:1616> ChangeLog#persisted:303,307,312 -> etc

      One of these ItemStateListeners is CachingHierarchyManager and it sets a regular java lock on 'cacheMonitor' during many of its notify methods (for instance CachingHierarchyManager#nodeRemoved:467), while at the same time, during holding that lock it may make calls to read item states (for instance CachingHierarchyManager#nodeRemoved:748).

      These notify methods are not only called during external updates but also during regular session and node operations. For instance NodeImpl.removeChildNode:642 -> NodeState#removeChildNodeEntry:447 -> NodeState#notifyNodeRemoved:911 -> etc.

      The scenario is then as follows.

      1. A client calls such a regular session or node operation, for instance remove a property from a node. The stack of that thread looks something like this (line numbers of JR 2.2.13):

      at java.lang.Object.wait(Native Method)

      • waiting on <0xe9f3d3d0> (a EDU.oswego.cs.dl.util.concurrent.Latch)
        at java.lang.Object.wait(Object.java:485)
        at EDU.oswego.cs.dl.util.concurrent.Latch.acquire(Unknown Source)
      • locked <0xe9f3d3d0> (a EDU.oswego.cs.dl.util.concurrent.Latch)
        at org.apache.jackrabbit.core.state.FineGrainedISMLocking.acquireReadLock(FineGrainedISMLocking.java:123)
        at org.apache.jackrabbit.core.state.SharedItemStateManager.acquireReadLock(SharedItemStateManager.java:1856)
        at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:261)
        at org.apache.jackrabbit.core.state.LocalItemStateManager.getNodeState(LocalItemStateManager.java:109)
        at org.hippoecm.repository.jackrabbit.HippoLocalItemStateManager.getNodeState(HippoLocalItemStateManager.java:432)
        at org.apache.jackrabbit.core.state.LocalItemStateManager.getItemState(LocalItemStateManager.java:174)
      • locked <0xe0382110> (a org.hippoecm.repository.jackrabbit.HippoLocalItemStateManager)
        at org.apache.jackrabbit.core.state.ForkedXAItemStateManager.getItemState(ForkedXAItemStateManager.java:261)
        at org.hippoecm.repository.jackrabbit.HippoLocalItemStateManager.getCanonicalItemState(HippoLocalItemStateManager.java:328)
        at org.hippoecm.repository.dataprovider.HippoVirtualProvider.getCanonicalNodeState(HippoVirtualProvider.java:214)
        at org.hippoecm.repository.jackrabbit.FacetResultSetProvider.populate(FacetResultSetProvider.java:251)
        at org.hippoecm.repository.dataprovider.HippoNodeId.populate(HippoNodeId.java:85)
        at org.hippoecm.repository.jackrabbit.HippoLocalItemStateManager.getItemState(HippoLocalItemStateManager.java:351)
        at org.apache.jackrabbit.core.state.SessionItemStateManager.getItemState(SessionItemStateManager.java:161)
        at org.apache.jackrabbit.core.HierarchyManagerImpl.getItemState(HierarchyManagerImpl.java:152)
        at org.apache.jackrabbit.core.CachingHierarchyManager.nodeRemoved(CachingHierarchyManager.java:748)
        at org.apache.jackrabbit.core.CachingHierarchyManager.nodeRemoved(CachingHierarchyManager.java:472)
      • locked <0xe04ed0e8> (a java.lang.Object)
        at org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyNodeRemoved(StateChangeDispatcher.java:191)
        at org.apache.jackrabbit.core.state.SessionItemStateManager.nodeRemoved(SessionItemStateManager.java:915)
        at org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyNodeRemoved(StateChangeDispatcher.java:191)
        at org.apache.jackrabbit.core.state.LocalItemStateManager.nodeRemoved(LocalItemStateManager.java:606)
        at org.apache.jackrabbit.core.state.NodeState.notifyNodeRemoved(NodeState.java:906)
        at org.apache.jackrabbit.core.state.NodeState.removeChildNodeEntry(NodeState.java:443

      Notice that it succeeded in obtaining a lock on CachingHierarchyManager.cacheMonitor but is now waiting on a read lock in the SharedItemStateManager.

      2. This is because at the same time an external update is taking place. The stack of that thread looks something like (line numbers of JR 2.2.13):

      "jackrabbit-pool-2" daemon prio=10 tid=0xcdbd5400 nid=0x7463 waiting for monitor entry [0xcd2ae000]
      java.lang.Thread.State: BLOCKED (on object monitor)
      at org.apache.jackrabbit.core.CachingHierarchyManager.nodeModified(CachingHierarchyManager.java:322)

      • waiting to lock <0xe04ed0e8> (a java.lang.Object)
        at org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyNodeModified(StateChangeDispatcher.java:175)
        at org.apache.jackrabbit.core.state.SessionItemStateManager.nodeModified(SessionItemStateManager.java:902)
        at org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyNodeModified(StateChangeDispatcher.java:175)
        at org.apache.jackrabbit.core.state.LocalItemStateManager.stateModified(LocalItemStateManager.java:515)
        at org.apache.jackrabbit.core.state.ForkedXAItemStateManager.stateModified(ForkedXAItemStateManager.java:597)
        at org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyStateModified(StateChangeDispatcher.java:107)
        at org.apache.jackrabbit.core.state.SharedItemStateManager.stateModified(SharedItemStateManager.java:411)
        at org.apache.jackrabbit.core.state.ItemState.notifyStateUpdated(ItemState.java:242)
        at org.apache.jackrabbit.core.state.ChangeLog.persisted(ChangeLog.java:300)
        at org.apache.jackrabbit.core.state.SharedItemStateManager.doExternalUpdate(SharedItemStateManager.java:1596)
        at org.apache.jackrabbit.core.state.SharedItemStateManager.externalUpdate(SharedItemStateManager.java:1528)

      In other words, the latter thread successfully obtained a write lock in SharedItemStateManager#externalUpdate:1545 but is now waiting for the lock obtained by 1 to be released.

      I don't see an easy way out of this. We can't move the notification of item state listeners outside the scope of the write lock can we? Changing CachingHierarchyManager to not make calls that may need a read lock in SISM also seems like a far-fetched idea.

      Because the deadlock occurs while holding a write lock the whole repository becomes unresponsive.

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            Unassigned Unassigned
            unico@apache.org Unico Hommes

            Dates

              Created:
              Updated:

              Slack

                Issue deployment