Jackrabbit Content Repository: JCR-2345

Many threads are blocked trying to lock the persistence manager

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.5
    • Fix Version/s: None
    • Component/s: jackrabbit-core, JCR 1.0.1
    • Labels:
      None
    • Environment:
      HP Unix

      Description

      We implemented multi-threaded export functionality for our JR repository, but the intended parallel behavior can't be achieved because most of the threads appear to be waiting for OraclePersistenceManager.

      This is the stack trace for most of the waiting threads:

      – Blocked trying to get lock: org/apache/jackrabbit/core/persistence/bundle/OraclePersistenceManager@0x2b71b2c979c0[fat lock]
      at jrockit/vm/Threads.waitForUnblockSignal()V(Native Method)
      at jrockit/vm/Locks.fatLockBlockOrSpin(Locks.java:1675)[optimized]
      at jrockit/vm/Locks.lockFat(Locks.java:1776)[optimized]
      at jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1312)[optimized]
      at jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
      at org/apache/jackrabbit/core/persistence/bundle/AbstractBundlePersistenceManager.exists(AbstractBundlePersistenceManager.java:506)[optimized]
      at org/apache/jackrabbit/core/state/SharedItemStateManager.hasNonVirtualItemState(SharedItemStateManager.java:1343)[inlined]
      at org/apache/jackrabbit/core/state/SharedItemStateManager.hasItemState(SharedItemStateManager.java:297)[optimized]
      at org/apache/jackrabbit/core/state/XAItemStateManager.hasItemState(XAItemStateManager.java:295)[optimized]
      at org/apache/jackrabbit/core/state/SessionItemStateManager.getItemState(SessionItemStateManager.java:181)[optimized]
      at org/apache/jackrabbit/core/ItemManager.getItemData(ItemManager.java:282)[inlined]
      at org/apache/jackrabbit/core/ItemManager.getItemData(ItemManager.java:249)[inlined]
      at org/apache/jackrabbit/core/ItemManager.getNode(ItemManager.java:513)[inlined]
      at org/apache/jackrabbit/core/LazyItemIterator.prefetchNext(LazyItemIterator.java:109)[inlined]
      at org/apache/jackrabbit/core/LazyItemIterator.next(LazyItemIterator.java:230)[inlined]
      at org/apache/jackrabbit/core/LazyItemIterator.nextNode(LazyItemIterator.java:137)[optimized]
      ^-- Holding lock: org/apache/jackrabbit/core/ItemManager@0x2b71ec068f58[thin lock]
      at org/apache/jackrabbit/commons/xml/Exporter.exportNodes(Exporter.java:212)[optimized]
      at org/apache/jackrabbit/commons/xml/DocumentViewExporter.exportNode(DocumentViewExporter.java:77)[inlined]
      at org/apache/jackrabbit/commons/xml/Exporter.exportNode(Exporter.java:294)[inlined]
      at org/apache/jackrabbit/commons/xml/Exporter.export(Exporter.java:143)[optimized]
      at org/apache/jackrabbit/commons/AbstractSession.export(AbstractSession.java:462)[inlined]
      at org/apache/jackrabbit/commons/AbstractSession.exportDocumentView(AbstractSession.java:236)[inlined]
      at org/apache/jackrabbit/commons/AbstractSession.exportDocumentView(AbstractSession.java:281)[optimized]
      ^-- Holding lock: org/apache/jackrabbit/core/XASessionImpl@0x2b71ec068950[thin lock]

      This is the stack trace for the blocking thread:

      at jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII)I(Native Method)
      at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:46)[optimized]
      at java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(SocketInputStream.java)[inlined]
      at java/net/SocketInputStream.read(SocketInputStream.java:129)[optimized]
      at oracle/net/ns/Packet.receive()V(Unknown Source)[inlined]
      at oracle/net/ns/DataPacket.receive()V(Unknown Source)[optimized]
      at oracle/net/ns/NetInputStream.getNextPacket()V(Unknown Source)[optimized]
      at oracle/net/ns/NetInputStream.read([BII)I(Unknown Source)[inlined]
      at oracle/net/ns/NetInputStream.read([B)I(Unknown Source)[inlined]
      at oracle/net/ns/NetInputStream.read()I(Unknown Source)[optimized]
      at oracle/jdbc/driver/T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1104)[inlined]
      at oracle/jdbc/driver/T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1075)[inlined]
      at oracle/jdbc/driver/T4C8TTILob.receiveReply(T4C8TTILob.java:872)[optimized]
      at oracle/jdbc/driver/T4C8TTILob.getChunkSize(T4C8TTILob.java:329)[inlined]
      at oracle/jdbc/driver/T4CConnection.getChunkSize(T4CConnection.java:2026)[optimized]
      ^-- Holding lock: oracle/jdbc/driver/T4CConnection@0x2b71b2ce0650[thin lock]
      at oracle/sql/BLOB.getChunkSize(BLOB.java:389)[inlined]
      at oracle/sql/BLOB.getBufferSize(BLOB.java:410)[inlined]
      at oracle/sql/BLOB.getBinaryStream(BLOB.java:229)[optimized]
      at org/apache/jackrabbit/core/persistence/bundle/BundleDbPersistenceManager.getBytes(BundleDbPersistenceManager.java:1110)[inlined]
      at org/apache/jackrabbit/core/persistence/bundle/BundleDbPersistenceManager.loadBundle(BundleDbPersistenceManager.java:1142)[inlined]
      at org/apache/jackrabbit/core/persistence/bundle/BundleDbPersistenceManager.loadBundle(BundleDbPersistenceManager.java:1094)[inlined]
      at org/apache/jackrabbit/core/persistence/bundle/AbstractBundlePersistenceManager.getBundle(AbstractBundlePersistenceManager.java:701)[inlined]
      at org/apache/jackrabbit/core/persistence/bundle/AbstractBundlePersistenceManager.exists(AbstractBundlePersistenceManager.java:506)[optimized]
      ^-- Holding lock: org/apache/jackrabbit/core/persistence/bundle/OraclePersistenceManager@0x2b71b2c979c0[fat lock]
      at org/apache/jackrabbit/core/state/SharedItemStateManager.hasNonVirtualItemState(SharedItemStateManager.java:1343)[inlined]
      at org/apache/jackrabbit/core/state/SharedItemStateManager.hasItemState(SharedItemStateManager.java:297)[optimized]
      at org/apache/jackrabbit/core/state/XAItemStateManager.hasItemState(XAItemStateManager.java:295)[optimized]
      at org/apache/jackrabbit/core/state/SessionItemStateManager.getItemState(SessionItemStateManager.java:181)[optimized]
      at org/apache/jackrabbit/core/ItemManager.getItemData(ItemManager.java:282)[inlined]
      at org/apache/jackrabbit/core/ItemManager.getItemData(ItemManager.java:249)[inlined]
      at org/apache/jackrabbit/core/ItemManager.getNode(ItemManager.java:513)[inlined]
      at org/apache/jackrabbit/core/LazyItemIterator.prefetchNext(LazyItemIterator.java:109)[inlined]
      at org/apache/jackrabbit/core/LazyItemIterator.next(LazyItemIterator.java:230)[inlined]
      at org/apache/jackrabbit/core/LazyItemIterator.nextNode(LazyItemIterator.java:137)[optimized]
      ^-- Holding lock: org/apache/jackrabbit/core/ItemManager@0x2b71dcf8c668[thin lock]
      at org/apache/jackrabbit/commons/xml/Exporter.exportNodes(Exporter.java:212)[optimized]
      at org/apache/jackrabbit/commons/xml/DocumentViewExporter.exportNode(DocumentViewExporter.java:77)[inlined]
      at org/apache/jackrabbit/commons/xml/Exporter.exportNode(Exporter.java:294)[inlined]
      at org/apache/jackrabbit/commons/xml/Exporter.export(Exporter.java:143)[optimized]
      at org/apache/jackrabbit/commons/AbstractSession.export(AbstractSession.java:462)[inlined]
      at org/apache/jackrabbit/commons/AbstractSession.exportDocumentView(AbstractSession.java:236)[inlined]
      at org/apache/jackrabbit/commons/AbstractSession.exportDocumentView(AbstractSession.java:281)[optimized]
      ^-- Holding lock: org/apache/jackrabbit/core/XASessionImpl@0x2b71dcf8c3c0[thin lock]

      The Oracle database performs as usual, so I doubt that it is an Oracle problem. In any case, is there any way to keep threads from blocking each other while performing read operations from the PM?
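The blocking pattern in the two stack traces can be reproduced in isolation. The following is a minimal sketch, not Jackrabbit code: `SingleInstanceStore`, its `exists` method, and the 20 ms sleep are hypothetical stand-ins for a single shared persistence manager whose read method is `synchronized` and performs a database round trip while holding the monitor. It records how many threads were ever inside the method at the same time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ContentionDemo {
    // Starts the given number of reader threads against one shared store and
    // returns the highest number of readers that were inside exists() at once.
    static int run(int threads) {
        SingleInstanceStore store = new SingleInstanceStore();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final String id = "node-" + i;
            Thread t = new Thread(() -> store.exists(id));
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return store.maxConcurrentReaders();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent readers: " + run(8)); // prints 1
    }
}

// Hypothetical stand-in for a persistence manager that exists once and is shared.
class SingleInstanceStore {
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    // 'synchronized' locks the whole instance, so readers queue behind one another.
    synchronized boolean exists(String id) {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(20); // simulated database round trip while holding the monitor
            return id != null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            inside.decrementAndGet();
        }
    }

    int maxConcurrentReaders() {
        return maxInside.get();
    }
}
```

However many reader threads are started, at most one is ever inside `exists`, which matches the picture of one thread holding the fat lock during a socket read while the others wait.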

      1. OraclePoolingPersistenceManager.java
        12 kB
        Andrey Adamovich
      2. dump3.txt
        133 kB
        Andrey Adamovich
      3. dump2.txt
        129 kB
        Andrey Adamovich
      4. dump1.txt
        129 kB
        Andrey Adamovich

          Activity

          Andrey Adamovich added a comment -

          Example configuration:

          <PersistenceManager class="...OraclePoolingPersistenceManager">
          <param name="poolSize" value="15" />
          <param name="externalBLOBs" value="false" />
          <param name="bundleCacheSize" value="25" />
          <param name="consistencyCheck" value="false" />
          <param name="driver" value="oracle.jdbc.OracleDriver" />
          <param name="url" value="..." />
          <param name="user" value="..." />
          <param name="password" value="..." />
          <param name="schema" value="oracle" />
          <param name="schemaObjectPrefix" value="${wsp.name}_" />
          <param name="tableSpace" value="" />
          <param name="errorHandling" value="" />
          </PersistenceManager>

          Andrey Adamovich added a comment -

          Attached a simple implementation of OraclePM pool. It's not a patch, but a workaround that seems to work for us.

          Andrey Adamovich added a comment -

          Thanks, Stefan. We will try the upgrade, but since we are dumping a large part of the repository from several threads, every caching level will eventually become useless and the reads will still hit the PM singleton.

          Stefan Guggisberg added a comment -

          > 2) But JCR-2186 is about some exception in SISM, right?

          no, it's about an unnecessary 'hasItemState()' call in SessionItemStateManager#getItemState.
          i noticed that call on a number of your stack traces. the fix provided in JCR-2186 should reduce
          the pm calls in your scenario significantly, thus reducing potential lock contention.

          i am optimistic that jackrabbit 1.6 will significantly improve performance of concurrent read operations.

          > 3) I have no chance to test it against a different backend at the moment. We used Derby initially and didn't see similar problems, but that was probably because the load was much lower at that time. Also, as I see from the thread dumps, the problem is in the synchronised methods of AbstractBundlePersistenceManager and the fact that there is only one instance of the Oracle PM per repository.

          per workspace, to be precise. read operations are served from the SISM cache. only if the item is not cached yet, the call is delegated to the pm.

          > 4) Yes, we tried SUN's JDK, and the results are pretty much the same.
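The effect of an extra existence check before a load, as described in the comment above for JCR-2186, can be illustrated with a small counting sketch. This is not the Jackrabbit code path: `CountingBackend`, `getWithCheck`, and `getDirect` are hypothetical names, and the backend is an in-memory map that only counts how often it is called.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class RedundantCheckDemo {
    // Hypothetical backend that counts every call made to it.
    static class CountingBackend {
        final Map<String, String> rows = new HashMap<>();
        final AtomicInteger calls = new AtomicInteger();

        boolean exists(String id) {
            calls.incrementAndGet();
            return rows.containsKey(id);
        }

        String load(String id) {
            calls.incrementAndGet();
            return rows.get(id);
        }
    }

    // Check-then-load: two backend calls for every item that exists.
    static String getWithCheck(CountingBackend backend, String id) {
        return backend.exists(id) ? backend.load(id) : null;
    }

    // Load only: one backend call; a null result signals "not found".
    static String getDirect(CountingBackend backend, String id) {
        return backend.load(id);
    }

    // Returns the number of backend calls needed to read one existing item.
    static int callsFor(boolean withCheck) {
        CountingBackend backend = new CountingBackend();
        backend.rows.put("n1", "bundle-1");
        if (withCheck) {
            getWithCheck(backend, "n1");
        } else {
            getDirect(backend, "n1");
        }
        return backend.calls.get();
    }

    public static void main(String[] args) {
        // prints "with check: 2, direct: 1"
        System.out.println("with check: " + callsFor(true) + ", direct: " + callsFor(false));
    }
}
```

If each backend call takes a shared lock, halving the number of calls per item also halves the number of times a thread has to queue for that lock.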

          Andrey Adamovich added a comment -

          1) No, we haven't used 1.6, but as far as I noticed from the code nothing really changed in bundle db pm implementations from version 1.5.5 till 1.6.0 (correct me if I'm wrong) and PM is still kind of a singleton, that's why we are seeing those locking issues.

          2) But JCR-2186 is about some exception in SISM, right? We are not seeing any exceptions, just a lot of stuck threads. But, of course, we can give the 1.6 upgrade a try.

          3) I have no chance to test it against a different backend at the moment. We used Derby initially and didn't see similar problems, but that was probably because the load was much lower at that time. Also, as I see from the thread dumps, the problem is in the synchronised methods of AbstractBundlePersistenceManager and the fact that there is only one instance of the Oracle PM per repository.

          4) Yes, we tried SUN's JDK, and the results are pretty much the same.

          Andrey Adamovich added a comment -

          Thanks for your reply, Stefan, and good point about several bundle caches holding different versions of the same content. I will try to write the pooled version and see how it goes. Our application mostly reads data at the moment, so it is very unfortunate that one read operation from the PM may block other read operations.

          Stefan Guggisberg added a comment -

          some general remarks:

          • did you try jackrabbit 1.6? there was one notable change in SessionItemStateManager (JCR-2186) which could make a difference.
          • did you test with a different backend/pm (e.g. derby or mysql)? did you test with a different vm (i noticed you are using jrockit)?
          Stefan Guggisberg added a comment -

          > I have looked through the code of AbstractBundlePersistenceManager.java and it seems many of its methods are synchronised when the variable bundles is accessed. That variable is of type BundleCache. Would it not be possible to move the synchronisation directly into that class? Inside, as far as I can see, it uses a simple LinkedMap, but what if it used a ConcurrentMap instead? Would the synchronisation then be less intensive?

          i am not too familiar with the bundle db pm implementation. maybe it's worth a try. patches are welcome

          Stefan Guggisberg added a comment -

          > I have an idea of implementing a new PM that will be just a pool of Oracle PMs and delegate all work to one of the instances. Is it a good idea?

          probably not. you'll end up with redundant BundleCache instances. you might also run into inconsistencies as not all BundleCaches are updated on write operations. implementing write operations correctly would be quite tricky since they depend on jdbc commit.
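The inconsistency risk mentioned in the reply above can be shown with a toy model. This is a sketch under stated assumptions, not the attached OraclePoolingPersistenceManager and not Jackrabbit's BundleCache: `CachingDelegate` is a hypothetical pooled instance with a private cache, and a plain map stands in for the shared database table.

```java
import java.util.HashMap;
import java.util.Map;

class PooledCacheDemo {
    // Hypothetical pooled delegate: a private cache in front of a shared table.
    static class CachingDelegate {
        private final Map<String, String> database;                 // shared by all delegates
        private final Map<String, String> cache = new HashMap<>();  // private to this delegate

        CachingDelegate(Map<String, String> database) {
            this.database = database;
        }

        String load(String id) {
            // Served from the private cache when present, otherwise read from the table.
            return cache.computeIfAbsent(id, database::get);
        }

        void store(String id, String value) {
            database.put(id, value);
            cache.put(id, value); // only this delegate's cache sees the write
        }
    }

    // Writes through one delegate, then reads through another one.
    static String staleRead() {
        Map<String, String> database = new HashMap<>();
        database.put("n1", "v1");
        CachingDelegate a = new CachingDelegate(database);
        CachingDelegate b = new CachingDelegate(database);
        b.load("n1");        // b caches "v1"
        a.store("n1", "v2"); // a writes "v2" to the table and to its own cache
        return b.load("n1"); // b answers from its own, now outdated, cache
    }

    public static void main(String[] args) {
        System.out.println(staleRead()); // prints "v1" although the table holds "v2"
    }
}
```

A pool used only for reads avoids this particular case, but any write path would need to invalidate or update every delegate's cache, and to do so in step with the JDBC commit.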

          Andrey Adamovich added a comment -

          I have an idea of implementing a new PM that will be just a pool of Oracle PMs and delegate all work to one of the instances. Is it a good idea? Or is there any other solution? This problem is very critical to us.

          Andrey Adamovich added a comment -

          Can anyone, please, comment on this issue?

          Andrey Adamovich added a comment -

          I have looked through the code of AbstractBundlePersistenceManager.java and it seems many of its methods are synchronised when the variable bundles is accessed. That variable is of type BundleCache. Would it not be possible to move the synchronisation directly into that class? Inside, as far as I can see, it uses a simple LinkedMap, but what if it used a ConcurrentMap instead? Would the synchronisation then be less intensive?
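The idea of pushing synchronisation down into the cache can be sketched as follows. This is a hypothetical `ConcurrentCache`, not Jackrabbit's BundleCache; the loader lambda stands in for a database read. It relies on `ConcurrentHashMap.computeIfAbsent`, which applies the loader at most once per key and synchronises internally at a finer grain than one monitor for the whole object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class ConcurrentCacheDemo {
    // Hypothetical cache that owns its own thread safety.
    static class ConcurrentCache<K, V> {
        private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
        private final Function<K, V> loader;

        ConcurrentCache(Function<K, V> loader) {
            this.loader = loader;
        }

        // No instance-wide monitor: the map handles its own finer-grained locking.
        V get(K key) {
            return entries.computeIfAbsent(key, loader);
        }
    }

    // Reads the same keys from several threads and returns how often the loader ran.
    static int loadsFor(int threads, int keys) {
        AtomicInteger loads = new AtomicInteger();
        ConcurrentCache<Integer, String> cache = new ConcurrentCache<>(k -> {
            loads.incrementAndGet(); // stands in for a database read
            return "bundle-" + k;
        });
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int k = 0; k < keys; k++) {
                    cache.get(k);
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loader calls: " + loadsFor(8, 4)); // prints 4
    }
}
```

Two trade-offs apply to this sketch. A `ConcurrentHashMap` keeps no insertion or access order, so any eviction order that a linked map provides would need a separate mechanism. And running a slow loader inside `computeIfAbsent` can still delay other updates that land in the same part of the map.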

          Andrey Adamovich added a comment -

          Attached thread dumps showing the problem.


            People

            • Assignee:
              Unassigned
              Reporter:
              Andrey Adamovich
            • Votes:
              1
              Watchers:
              1
