
OPENJPA-637: Significant performance degradation when data cache is enabled

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: datacache, lib
    • Labels: None

      Description

      Performance testing shows a severe data cache performance degradation when moving from 1.0.x OpenJPA code to 1.2.0-level code. Profiling showed the problem to be in the new random eviction scheme, which runs when the cache reaches its maximum number of entries. This code was changed significantly when OpenJPA moved to the Java 5 java.util.concurrent.ConcurrentHashMap and away from the OpenJPA implementation of ConcurrentHashMap. A macro-benchmark showed a 20% performance degradation from base 1.2.0 code once the cache reaches its maximum size, prompting eviction in order to add new cache entries.

      I've found that the new random eviction code appears to be improved in the very recent 666903 commit, but data cache performance remains considerably slower than the 1.0.x implementation. Profiles with the 666903 changes show test threads waiting on the reentrant write lock in the CacheMap wrapper (which now wraps a max-size-capable, null-handling subclass of java.util.concurrent.ConcurrentHashMap). Investigation is underway to determine whether the write lock is necessary (i.e., can java.util.concurrent.ConcurrentHashMap manage the cache without the need for external locking) and/or whether changes could be made that would significantly reduce contention for the lock. Any thoughts/ideas on that would be extremely helpful.
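      To make the locking question concrete, here is a minimal sketch (illustrative only; these class names are hypothetical, not the actual OpenJPA code) of the two styles under discussion: an external reentrant write lock serializing all writers around an already thread-safe map, versus relying on ConcurrentHashMap's own internal locking.

          // Illustrative sketch only; class names are hypothetical, not OpenJPA code.
          import java.util.concurrent.ConcurrentHashMap;
          import java.util.concurrent.locks.ReentrantReadWriteLock;

          class ExternallyLockedCache {
              private final ConcurrentHashMap<Object, Object> map =
                  new ConcurrentHashMap<Object, Object>();
              private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

              // Every writer serializes on the single write lock, even though the
              // underlying map is already thread-safe; this is the contention point.
              void put(Object key, Object value) {
                  lock.writeLock().lock();
                  try {
                      map.put(key, value);
                  } finally {
                      lock.writeLock().unlock();
                  }
              }
          }

          class InternallyLockedCache {
              private final ConcurrentHashMap<Object, Object> map =
                  new ConcurrentHashMap<Object, Object>();

              // ConcurrentHashMap locks only a segment internally, so writers to
              // different keys do not block one another.
              void put(Object key, Object value) {
                  map.put(key, value);
              }
          }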

      Performance tests run with the 1.2.0 code base, using the OpenJPA version of ConcurrentHashMap (instead of the Java 5 java.util.concurrent.ConcurrentHashMap-based implementation), have shown that data cache performance is significantly better with the legacy OpenJPA implementation. Based on the results, it appears that OpenJPA should be using the legacy ConcurrentHashMap instead of the Java 5-based implementation, or the new Java 5-based implementation needs to be improved considerably in order to perform as well as 1.0.x.

      I am opening this as a 1.2.0 issue, although it very likely affects 1.1.x as well. Testing has not been performed on 1.1.x to confirm the problem exists on that release.

      Attachments

      1. OPENJPA-637.patch (33 kB) - Jeremy Bauer
      2. CacheImplTest.jar (79 kB) - Jeremy Bauer

        Activity

        Jeremy Bauer added a comment -

        Attaching a patch for 1.2.0 which adds back and utilizes the OpenJPA ConcurrentHashMap implementation. Based on benchmark results and additional testing with the Java 5-based implementation, use of the OpenJPA implementation appears to be the best course of action. Comments, please.

        Patrick Linskey added a comment -

        How many cores / CPUs were being used in the benchmark runs?

        Jeremy Bauer added a comment -

        The benchmark system is a hyper-threaded 4-way 3.8GHz Xeon. The benchmark is exercised with 50 concurrent users.

        Patrick Linskey added a comment -

        What were the comparative results with an appropriately-sized cache for the data set? (i.e., without cache eviction)

        Jeremy Bauer added a comment -

        Several attempts were made to tune the max size of the data cache with base 1.2.0 code. Performance started to degrade once the max cache size was increased past a certain point. The benchmark showed better results with a max size of 5000 than it did for 10000+. (The default (1000) and 15000 were also tested; 5000 appeared to be optimal for this workload.) The database tables used in the benchmark fluctuate around 35000 rows, which roughly equate to entities.

        Out of curiosity, instead of using CacheMap as the cache store in ConcurrentDataCache, java.util.concurrent.ConcurrentHashMap was used directly in its place (soft reference, write locking, and pinning support were removed for simplification). This also eliminated the max size and null handling aspects of the cache. Benchmark performance with this configuration was very similar to measurements taken with the data cache disabled. Database (on a separate server) utilization was down considerably, which was good and expected, but the benchmark was not showing a performance improvement. In contrast, 1.2.0 using the 1.0.x code showed a ~20% improvement when the data cache (max cache size 5000) was enabled.
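        As a rough illustration of that configuration (a simplified sketch; the class below is hypothetical, not the actual ConcurrentDataCache change), using the JDK map directly means there is no size bound or eviction, and cached null values need a sentinel because ConcurrentHashMap rejects nulls:

            // Simplified, hypothetical sketch of using java.util.concurrent.ConcurrentHashMap
            // directly as the cache store, without max size, eviction, or soft references.
            import java.util.concurrent.ConcurrentHashMap;

            class BareConcurrentCache {
                // Sentinel standing in for cached null values, since ConcurrentHashMap
                // does not accept null keys or values.
                private static final Object NULL_SENTINEL = new Object();
                private final ConcurrentHashMap<Object, Object> store =
                    new ConcurrentHashMap<Object, Object>();

                void put(Object key, Object value) {
                    store.put(key, value == null ? NULL_SENTINEL : value);
                }

                Object get(Object key) {
                    Object v = store.get(key);
                    return v == NULL_SENTINEL ? null : v;
                }
            }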

        Patrick Linskey added a comment -

        Interesting. Thanks for the additional detail. We'll look into it on our end; the initial change was the result of benchmark analysis. Hopefully we can work out what the difference is. At a minimum, we should re-introduce the old cache so that the better implementation for a given workload can be easily selected via configuration.

        Jeremy Bauer added a comment -

        Thanks, Patrick. I think providing caching options (and doc explaining the behavioral differences) is a good approach. I'll be disconnected for a few days, so I'll catch up with you when I return.

        Jeremy Bauer added a comment -

        I've attached a standalone test that exercises various cache implementations and configurations. It behaves similarly to the benchmark that exposed this problem. The test allows configuration of the number of threads, max data size, max cache size, whether external locking is enabled, and the cache implementation type.

        This test is showing results similar to what I've previously posted: a write lock causing contention when the cache becomes full. When the external write lock (a reentrant lock over and above internal cache locking) is enabled and the data size is ~5000 entries larger than the max cache size, performance drops significantly. The external lock simulates the reentrant lock used by DataCacheStoreManager to ensure the cache does not get updated with an old version of data.
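        For illustration, here is a hypothetical sketch (not the actual DataCacheStoreManager logic; VersionedData and its version field are stand-ins) of how a "never overwrite with an older version" guarantee could in principle be provided with ConcurrentMap's atomic putIfAbsent/replace operations instead of a global reentrant lock:

            // Hypothetical sketch: install a value only if its version is newer than
            // the cached one, without holding a global lock. Types are illustrative only.
            import java.util.concurrent.ConcurrentHashMap;
            import java.util.concurrent.ConcurrentMap;

            class VersionGuardedCache {
                static class VersionedData {
                    final long version;
                    final Object data;
                    VersionedData(long version, Object data) {
                        this.version = version;
                        this.data = data;
                    }
                }

                private final ConcurrentMap<Object, VersionedData> store =
                    new ConcurrentHashMap<Object, VersionedData>();

                // Retry loop using putIfAbsent/replace so a stale version never
                // overwrites a newer one; contention is limited to writers of the same key.
                void putIfNewer(Object key, VersionedData candidate) {
                    for (;;) {
                        VersionedData current = store.putIfAbsent(key, candidate);
                        if (current == null || current.version >= candidate.version)
                            return; // installed, or an equal/newer version is already cached
                        if (store.replace(key, current, candidate))
                            return; // atomically swapped in the newer version
                    }
                }
            }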

        This test should provide a better idea of what I'm seeing. As an aside, there is an option to run directly with java.util.concurrent.ConcurrentHashMap and it performs very well, although there is no null masking or maximum size on the cache.

        To get a list of options: java -cp CacheImplTest.jar;commons-collections-3.2.jar cachetest.Main
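        (The ';' classpath separator above is the Windows form; on Unix-like systems the separator is ':', e.g. java -cp CacheImplTest.jar:commons-collections-3.2.jar cachetest.Main)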

        Patrick - What behavior/environment does your benchmark test? Is it possible that your cache size is very near the size of your data so you are not hitting the problem?

        Jeremy Bauer added a comment -

        Since Patrick agrees that we should do something to alleviate this performance concern, I would like to at least get us back on par with our previous releases. To that end, I would like to ask that the patch I submitted be integrated. I created a sub-task (OPENJPA-643) so that we don't lose track of providing a more flexible, configurable option. But until we get that ironed out, I really don't want to hold up the rest of our performance analysis.

        Jeremy Bauer added a comment -

        With the OpenJPA ConcurrentHashMap cache implementation back in place, benchmark results with the data cache enabled are back in line with the 1.0.x release. Future work regarding a data cache implementation configuration option will be handled through OPENJPA-643.


          People

          • Assignee: Jeremy Bauer
          • Reporter: Jeremy Bauer
          • Votes: 0
          • Watchers: 0
