  1. OpenJPA
  2. OPENJPA-2470

DataCacheManagerImpl infinite loop for checking if classes are cachable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.1, 2.1.1, 2.2.2
    • Fix Version/s: 2.2.3, 2.4.0
    • Component/s: datacache
    • Labels:
      None
    • Environment:
      linux 64

      Description

      We've integrated OpenJPA into our latest software delivery, and in the last three months we've run into this issue twice. The first time it was not investigated at all, but the second time we had the chance to grab extra information from the system.

      At that point, one of our processes was not getting a response back from the application that was using OpenJPA.

      All the connection threads were running the same calls:

      2014-01-07 07:21:37,716 - INFO "ClientConnection - 9" prio=10 tid=0x00007f33d400e000 nid=0x8d9 runnable [0x00007f32db1ef000]
      2014-01-07 07:21:37,716 - INFO java.lang.Thread.State: RUNNABLE
      2014-01-07 07:21:37,716 - INFO at java.util.HashMap.getEntry(Unknown Source)
      2014-01-07 07:21:37,716 - INFO at java.util.HashMap.get(Unknown Source)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.datacache.DataCacheManagerImpl.isCachable(DataCacheManagerImpl.java:145)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.datacache.DataCacheManagerImpl.selectCache(DataCacheManagerImpl.java:128)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.datacache.DataCacheStoreManager.initialize(DataCacheStoreManager.java:358)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.kernel.DelegatingStoreManager.initialize(DelegatingStoreManager.java:112)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.kernel.ROPStoreManager.initialize(ROPStoreManager.java:57)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.kernel.BrokerImpl.initialize(BrokerImpl.java:1027)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.kernel.BrokerImpl.find(BrokerImpl.java:985)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.kernel.BrokerImpl.find(BrokerImpl.java:907)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.jdbc.kernel.JDBCStoreManager.load(JDBCStoreManager.java:1041)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.jdbc.sql.AbstractResult.load(AbstractResult.java:280)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.jdbc.sql.SelectImpl$SelectResult.load(SelectImpl.java:2381)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.jdbc.meta.strats.RelationToManyInverseKeyFieldStrategy.loadElement(RelationToManyInverseKeyFieldStrategy.java:90)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.jdbc.meta.strats.RelationCollectionInverseKeyFieldStrategy.loadElement(RelationCollectionInverseKeyFieldStrategy.java:76)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.jdbc.meta.strats.StoreCollectionFieldStrategy.load(StoreCollectionFieldStrategy.java:558)
      2014-01-07 07:21:37,716 - INFO at org.apache.openjpa.jdbc.meta.FieldMapping.load(FieldMapping.java:934)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.jdbc.kernel.JDBCStoreManager.load(JDBCStoreManager.java:702)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.kernel.DelegatingStoreManager.load(DelegatingStoreManager.java:117)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.datacache.DataCacheStoreManager.load(DataCacheStoreManager.java:461)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.kernel.DelegatingStoreManager.load(DelegatingStoreManager.java:117)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.kernel.ROPStoreManager.load(ROPStoreManager.java:78)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.kernel.StateManagerImpl.loadFields(StateManagerImpl.java:3061)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.kernel.StateManagerImpl.loadField(StateManagerImpl.java:3136)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.kernel.StateManagerImpl.beforeAccessField(StateManagerImpl.java:1606)
      2014-01-07 07:21:37,717 - INFO at org.apache.openjpa.kernel.StateManagerImpl.accessingField(StateManagerImpl.java:1591)
      ....

      We've narrowed this down to the fact that org.apache.openjpa.datacache.DataCacheManagerImpl uses a plain HashMap, _cacheable, that is shared by all threads, so multiple threads can add to and read from _cacheable at the same time.

      HashMap is not thread safe: under concurrent modification its internal structure can become corrupted, leaving threads looping inside get() and eating the entire CPU. The problem is described in more detail here (and in many other places):
      http://mailinator.blogspot.com/2009/06/beautiful-race-condition.html

      I could not find this bug logged anywhere, and I'm surprised that nobody has run into it yet.

      Our plan for now is simply to instantiate the _cacheable Map as a ConcurrentHashMap.
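
      To illustrate the kind of change we have in mind, here is a minimal sketch of a memoized "is this type cacheable?" lookup backed by a ConcurrentHashMap. It is not the actual DataCacheManagerImpl source: the class name CacheableRegistry, the Class<?> key type, the Predicate-based policy and the size() helper are all assumptions made for illustration. Only the idea of a shared map consulted by isCachable comes from the stack trace above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical stand-in for the shared per-type memo; not OpenJPA code.
class CacheableRegistry {
    // ConcurrentHashMap supports concurrent get/put from many threads
    // without the structural corruption a plain HashMap can suffer.
    private final Map<Class<?>, Boolean> cacheable = new ConcurrentHashMap<>();

    // Assumed decision rule; the real manager derives this from metadata.
    private final Predicate<Class<?>> policy;

    CacheableRegistry(Predicate<Class<?>> policy) {
        this.policy = policy;
    }

    boolean isCachable(Class<?> type) {
        Boolean known = cacheable.get(type);
        if (known != null) {
            return known;
        }
        // Two threads may both reach this point for the same type and both
        // evaluate the policy. That is harmless as long as the policy is
        // deterministic: they store the same value.
        boolean result = policy.test(type);
        cacheable.put(type, result);
        return result;
    }

    // Number of types memoized so far (illustrative helper).
    int size() {
        return cacheable.size();
    }
}
```

      The get-then-put shape is kept on purpose so the only difference from a HashMap-based version is the map implementation. Note that ConcurrentHashMap rejects null keys and values, so a change like this assumes none are ever stored.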

        Activity

        curtisr7 Rick Curtis added a comment -

        I'm also surprised that no one else has encountered this same problem. I'll try to get a fix committed today for this one.

        jira-bot ASF subversion and git services added a comment -

        Commit 1558594 from Rick Curtis in branch 'openjpa/trunk'
        [ https://svn.apache.org/r1558594 ]

        OPENJPA-2470 : Update DataCacheManagerImpl to use a ConcurrentHashMap rather than a HashMap.

        curtisr7 Rick Curtis added a comment -

        I committed a code change to trunk for this issue. I also created a unit test to expose the reported problem, but I don't think I'm going to commit it as it is quite machine / hardware dependent and I can't see someone actually regressing this behavior.

        This JIRA does expose a larger issue that might exist in other parts of the code base. I'm going to take a couple hours tomorrow to dig around to see if any other areas jump out at me.
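
        For anyone who wants to poke at this locally, below is a rough sketch of what a stress harness for the shared map could look like. It is not the unit test mentioned above; the class name CacheableMapStress, the hammer method and its parameters are made up for illustration. With a ConcurrentHashMap the workers should always finish. With a plain HashMap the outcome depends on the JVM and the hardware, which is exactly why a test like this is hard to make reliable.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stress harness; names and parameters are illustrative only.
class CacheableMapStress {
    // Returns true if every worker finished before the timeout expired.
    static boolean hammer(Map<Integer, Boolean> map, int threads, int keys, long timeoutSeconds) {
        // Daemon threads, so a worker stuck in a corrupted map cannot keep the JVM alive.
        ExecutorService pool = Executors.newFixedThreadPool(threads, r -> {
            Thread worker = new Thread(r);
            worker.setDaemon(true);
            return worker;
        });
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers together to maximize contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int k = 0; k < keys; k++) {
                    // Same check-then-put shape as a memoized cacheability lookup.
                    if (map.get(k) == null) {
                        map.put(k, k % 2 == 0);
                    }
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

        A call such as hammer(new ConcurrentHashMap<>(), 8, 10000, 30) exercises the fixed behavior; swapping in new HashMap<>() is the unreliable part, since a hang is possible but not guaranteed.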

        sebyonthenet Seb Mo added a comment -

        Thank you, Rick, for the quick reaction on this ticket. Could you let me know whether this change is planned for 2.2.x in the near future, or whether we'll have to wait for 2.4?

        curtisr7 Rick Curtis added a comment -

        I'll check with the owner of the 2.2.x branch to get his thoughts on committing it there.

        sebyonthenet Seb Mo added a comment -

        Could you please let me know if you found other places that required a similar fix? I see no other changes tied to this ticket besides the DCMI (DataCacheManagerImpl) class. Thanks

        curtisr7 Rick Curtis added a comment -

        I didn't find any other areas of the code that require a similar fix.

        jira-bot ASF subversion and git services added a comment -

        Commit 1696353 from Heath Thomann in branch 'openjpa/branches/2.2.x'
        [ https://svn.apache.org/r1696353 ]

        OPENJPA-2470: Update DataCacheManagerImpl to use a ConcurrentHashMap rather than a HashMap - backported to 2.2.x Rick Curtis' changes from trunk.


          People

          • Assignee:
            curtisr7 Rick Curtis
            Reporter:
            sebyonthenet Seb Mo