Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.3, 1.4, 2.0
    • Fix Version/s: 1.3
    • Labels:
      None
    • Environment:

      JDK5, Oracle 10G, Postgres 8.x

      Description

      Hello,
      We have been using Ofbiz with a DBCP-based implementation.
      Ofbiz uses a HEAD revision of DBCP (the package org.apache.commons.dbcp.managed is the same as the current trunk) and geronimo-transaction-1.0.

      We are having recurrent OutOfMemoryErrors, which occur roughly every two days.
      I analyzed the heap dump and I think I have found the source of the problem:
      The heap dump shows a retained heap of 400 MB held by the org.apache.commons.dbcp.managed.TransactionRegistry#xaResources field.

      After analyzing more deeply, the leak seems to come from what is stored in xaResources through
      xaResources.put(connection, xaResource);

      Values inside the WeakHashMap will never be removed, since the XAResource holds a STRONG reference to the key (the connection) via org.apache.commons.dbcp.managed.LocalXAConnectionFactory$LocalXAResource:

      public LocalXAResource(Connection localTransaction) {
          this.connection = localTransaction;
      }

      Found in WeakHashMap javadoc:
      Implementation note: The value objects in a WeakHashMap are held by ordinary strong references. Thus care should be taken to ensure that value objects do not strongly refer to their own keys, either directly or indirectly, since that will prevent the keys from being discarded. Note that a value object may refer indirectly to its key via the WeakHashMap itself; that is, a value object may strongly refer to some other key object whose associated value object, in turn, strongly refers to the key of the first value object. One way to deal with this is to wrap values themselves within WeakReferences before inserting, as in: m.put(key, new WeakReference(value)), and then unwrapping upon each get.
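
      For illustration only (not the actual DBCP code; the class name below is hypothetical), here is a minimal Java sketch of the pattern the javadoc warns about, together with the workaround it suggests:

      import java.lang.ref.WeakReference;
      import java.util.Map;
      import java.util.WeakHashMap;

      public class WeakHashMapLeakSketch {

          // Stand-in for LocalXAResource: the value keeps a strong reference to its key.
          static class ValueHoldingKey {
              private final Object key;
              ValueHoldingKey(Object key) { this.key = key; }
          }

          public static void main(String[] args) {
              Map<Object, ValueHoldingKey> registry = new WeakHashMap<Object, ValueHoldingKey>();

              Object key = new Object();
              // The value strongly references the key, so the key never becomes
              // weakly reachable and the entry is never evicted from the map.
              registry.put(key, new ValueHoldingKey(key));
              key = null; // dropping our own reference does not help

              System.gc(); // only a hint, but sufficient to illustrate the point
              System.out.println("entries after GC: " + registry.size()); // stays at 1

              // The javadoc's suggested workaround: wrap the value in a WeakReference
              // so it no longer pins its key.
              Map<Object, WeakReference<ValueHoldingKey>> safe =
                      new WeakHashMap<Object, WeakReference<ValueHoldingKey>>();
              Object key2 = new Object();
              safe.put(key2, new WeakReference<ValueHoldingKey>(new ValueHoldingKey(key2)));
              key2 = null;
              System.gc();
              // key2 is now at most weakly reachable, so the entry can be cleared.
          }
      }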

      Philippe Mouawad
      http://www.ubik-ingenierie.com

      1. memory.png
        6 kB
        Philippe Mouawad
      2. PoolableManagedConnection.java
        1 kB
        Philippe Mouawad
      3. PoolableManagedConnectionFactory.java
        2 kB
        Philippe Mouawad
      4. Test.zip
        646 kB
        Philippe Mouawad
      5. TransactionRegistry-patch.txt
        1 kB
        Philippe Mouawad

        Activity

        Philippe Mouawad added a comment - edited

        Please find attached a patch to TransactionRegistry.
        The idea is to unregister the connection explicitly instead of relying on the WeakHashMap to release the entry.
        The unregistration will be done by PoolableManagedConnection#reallyClose().

        Philippe Mouawad
        http://www.ubik-ingenierie.com
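
        As a rough sketch of that approach (the attached TransactionRegistry-patch.txt is authoritative; the simplified classes and the unregisterConnection method name below are assumptions for illustration):

        import java.sql.Connection;
        import java.sql.SQLException;
        import java.util.Map;
        import java.util.WeakHashMap;
        import javax.transaction.xa.XAResource;

        class TransactionRegistrySketch {
            private final Map<Connection, XAResource> xaResources =
                    new WeakHashMap<Connection, XAResource>();

            synchronized void registerConnection(Connection connection, XAResource xaResource) {
                xaResources.put(connection, xaResource);
            }

            // Core idea of the patch: remove the entry explicitly, so the strong
            // value -> key reference no longer pins the connection in the map.
            synchronized void unregisterConnection(Connection connection) {
                xaResources.remove(connection);
            }
        }

        class PoolableManagedConnectionSketch {
            private final TransactionRegistrySketch transactionRegistry;
            private final Connection delegate;

            PoolableManagedConnectionSketch(TransactionRegistrySketch registry, Connection delegate) {
                this.transactionRegistry = registry;
                this.delegate = delegate;
            }

            // Called when the pool destroys the connection (for example by the evictor).
            void reallyClose() throws SQLException {
                transactionRegistry.unregisterConnection(delegate);
                delegate.close();
            }
        }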

        Philippe Mouawad added a comment -

        After investigating more deeply, I understood what was happening.

        1) In the Ofbiz configuration, the DBCP evictor thread is enabled.
        2) When that thread runs, it destroys idle connections => THIS IS THE PROBLEM
        3) TransactionRegistry holds, through LocalXAResource, a reference to a connection created by DriverConnectionFactory.
        4) When the evictor destroys a PoolableConnection by calling reallyClose(), the underlying JDBC connection is not unregistered from TransactionRegistry#xaResources, which causes the leak.

        I am attaching a test case that reproduces the bug; here are the necessary elements:
        Script for Postgres:

        CREATE TABLE cb_memory_monitoring_info (
            mmi_id int8 PRIMARY KEY NOT NULL,
            mmi_console_id varchar NOT NULL,
            mmi_name varchar NOT NULL,
            mmi_date timestamp NOT NULL,
            mmi_value int4
        );

        Run TestBug with:
        -Dcom.sun.management.jmxremote -Xmx4m -Xms4m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=C:/temp

        It will generate a heap dump after a few seconds; look at the size of TransactionRegistry#xaResources, which will be around 110 elements.

        Then run TestBugPatch with:
        -Dcom.sun.management.jmxremote -Xmx4m -Xms4m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=C:/temp

        It will never run out of memory; TransactionRegistry#xaResources does not keep growing.

        Philippe Mouawad
        http://www.ubik-ingenierie.com
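
        For readers who cannot run the attached Test.zip, here is a hedged sketch along the same lines (the attached TestBug is authoritative; the connection details and transaction manager wiring below are assumptions to be adapted to your environment):

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import javax.transaction.TransactionManager;
        import org.apache.commons.dbcp.managed.BasicManagedDataSource;

        public class EvictionLeakRepro {
            public static void main(String[] args) throws Exception {
                // Assumed: obtain a JTA TransactionManager from your environment,
                // e.g. the Geronimo transaction manager used by Ofbiz.
                TransactionManager tm = createTransactionManager(); // hypothetical helper

                BasicManagedDataSource ds = new BasicManagedDataSource();
                ds.setTransactionManager(tm);
                ds.setDriverClassName("org.postgresql.Driver");
                ds.setUrl("jdbc:postgresql://localhost:5432/test"); // adjust to your database
                ds.setUsername("test");
                ds.setPassword("test");

                // Aggressive eviction so idle connections are destroyed (reallyClose())
                // frequently; without the patch, each destroyed connection leaves a
                // stale entry in TransactionRegistry#xaResources.
                ds.setMaxIdle(1);
                ds.setTimeBetweenEvictionRunsMillis(500);
                ds.setMinEvictableIdleTimeMillis(500);

                long id = 0;
                while (true) {
                    Connection conn = ds.getConnection();
                    PreparedStatement ps = conn.prepareStatement(
                            "INSERT INTO cb_memory_monitoring_info VALUES (?, 'console', 'name', now(), 1)");
                    ps.setLong(1, id++);
                    ps.executeUpdate();
                    ps.close();
                    conn.close(); // returned to the pool; the evictor destroys it later
                    Thread.sleep(50);
                }
            }

            // Placeholder for environment-specific setup; intentionally not implemented here.
            private static TransactionManager createTransactionManager() {
                throw new UnsupportedOperationException("wire in your transaction manager");
            }
        }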

        Philippe Mouawad added a comment -

        Implementations of PoolableConnectionFactory and PoolableConnection that work well with managed connections, solving the OOM.

        Philippe Mouawad added a comment -

        Test case

        Philippe Mouawad added a comment -

        Moved implementations to org.apache.commons.dbcp.managed

        Philippe Mouawad added a comment -

        Patch to TransactionRegistry

        Philippe Mouawad added a comment -

        Cleanup some code

        Phil Steitz added a comment -

        Thanks for reporting this. Reviewing the patch, setting fix version to 1.3

        Philippe Mouawad added a comment -

        One more clarification: in a previous comment I said that the problem was due mainly to the evictor thread.
        In fact, the problem will occur even if the evictor thread does not run aggressively: each time Commons Pool destroys a PoolableConnection (a call to reallyClose(), which may happen outside the evictor thread depending on the idle-management configuration), the destroyed connection will produce a leak.

        Philippe Mouawad
        http://www.ubik-ingenierie.com
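
        To make the same point in code (a hedged illustration, not part of the attached test; it assumes a BasicManagedDataSource configured as in the earlier sketch):

        import java.sql.Connection;
        import org.apache.commons.dbcp.managed.BasicManagedDataSource;

        public class DestroyWithoutEvictor {
            static void demonstrate(BasicManagedDataSource ds) throws Exception {
                ds.setTimeBetweenEvictionRunsMillis(-1); // evictor thread disabled
                ds.setMaxIdle(0); // every connection returned to the pool is destroyed

                Connection c = ds.getConnection();
                c.close(); // return to pool -> immediate destroy -> reallyClose();
                           // without the patch this still leaves a stale entry in
                           // TransactionRegistry#xaResources
            }
        }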

        Philippe Mouawad added a comment - edited

        Memory graph of the production server over 4 days of running.
        The patch was put into production at around point 3800.
        You can see that the memory leak is solved.

        Philippe Mouawad
        http://www.ubik-ingenierie.com

        Phil Steitz added a comment -

        Applied the patch with the following changes/additions:

        1) Modified BasicDataSource to allow BasicManagedDataSource to use PoolableManagedConnectionFactory
        2) Added a "full" constructor for PMCF so BasicManagedDataSource properties can be set
        3) Made initializeConnection protected in PCF and added a call to it in PMCF#makeObject()
        4) Modified the PoolableManagedConnection constructor to not ignore the abandonedConfig parameter
        5) Added a test case demonstrating the bug

        Thanks for the patch!


          People

          • Assignee:
            Phil Steitz
            Reporter:
            Philippe Mouawad
          • Votes:
            5
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: 52h
              Remaining: 52h
              Logged: Not Specified

                Development