Issue Details (XML | Word | Printable)

Key: DERBY-23
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Minor Minor
Assignee: Knut Anders Hatlen
Reporter: Tulika Agrawal
Votes: 1
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Derby

just booting jdbc driver and shutting down seem to leak memory

Created: 29/Sep/04 07:20 PM   Updated: 13/Dec/07 09:04 AM
Return to search
Component/s: Services
Affects Version/s: 10.0.2.0
Fix Version/s: 10.2.1.6

Time Tracking:
Not Specified

File Attachments:
  Size
File Licensed for inclusion in ASF works DERBY-23-npe.diff 2006-01-20 01:27 AM Knut Anders Hatlen 3 kB
Text File Licensed for inclusion in ASF works DERBY-23-npe.stat 2006-01-20 01:27 AM Knut Anders Hatlen 0.1 kB
File Licensed for inclusion in ASF works DERBY-23-with_comments.diff 2005-09-20 03:39 PM Knut Anders Hatlen 2 kB
File Licensed for inclusion in ASF works DERBY-23.diff 2005-09-19 04:58 PM Knut Anders Hatlen 2 kB
Text File Licensed for inclusion in ASF works DERBY-23.stat 2005-09-19 04:58 PM Knut Anders Hatlen 0.2 kB
Text File Licensed for inclusion in ASF works derbyall_report.txt 2006-01-20 01:27 AM Knut Anders Hatlen 205 kB

Urgency: Low
Resolution Date: 22/Aug/06 01:02 PM


 Description  « Hide
Reporting for Daniel John Debrunner.

Doing simple boot and shutdown of the driver in a loop
seem to grow the heap forever:
new org.apache.derby.jdbc.EmbeddedDriver(); //boot the driver
DriverManager.getConnection("jdbc:derby:;shutdown=true");
//shutdown the cloudscape instance completely

after booting and shutting down 1516 times, memory used is
~~41931056 bytes.
Sample GC Output:
1931.93: [Full GC 43484K->43243K(65088K), 0.1779751 secs]
1933.13: [Full GC 43511K->43270K(65088K), 0.1914383 secs]
1934.47: [Full GC 43538K->43297K(65088K), 0.1808878 secs]
1935.68: [Full GC 43564K->43324K(65088K), 0.4461623 secs]
1937.43: [Full GC 43591K->43350K(65088K), 0.1842980 secs]
1938.63: [Full GC 43617K->43377K(65088K), 0.1873431 secs]
1939.85: [Full GC 43644K->43404K(65088K), 0.1948505 secs]
1941.07: [Full GC 43671K->43430K(65088K), 0.1790895 secs]



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Knut Anders Hatlen added a comment - 17/Sep/05 07:45 AM
When I looked into this issue, I found a problem with the synchronization. If Derby is shut down right after the driver is loaded, the AntiGC thread will continue to run in most cases. Fixing this problem made the heap grow slower, but it was not enough to stop the memory leak entirely.

Knut Anders Hatlen added a comment - 19/Sep/05 04:58 PM
The memory leak comes from three sources:

1) The AntiGC thread is sometimes not terminated. This happens when
   Derby is shut down right after the driver is loaded, because the
   run() method in AntiGC starts with setting a boolean variable which
   is used to decide when the thread should stop. The same variable is
   set when a request to shut down Derby is sent, and if the shutdown
   is requested before the AntiGC thread has started running, the
   request to shut down Derby might not be noticed by AntiGC which
   will run forever.

Solution: Don't set the boolean variable in AntiGC's run() method. It
is enough that it is being set when the object is initialized.

2) Every time the driver is loaded, a new ThreadGroup is created. They
   don't seem to be garbage collected even when all their threads have
   finished.

Solution: Set the daemon property on the ThreadGroups. This way a
ThreadGroup will be destroyed when its last running thread (or thread
group) is destroyed.

3) The Java Heap Analysis Tool reported that a huge number of
   ContextManager and ContextService objects were kept in the
   heap. The objects were not accessible from the root set, but they
   were not garbage collected, probably because of some cyclic
   references which the gc cannot handle.

Solution: Break the reference cycle by nulling out the references to
the lists of ContextManagers when ContextService objects are stopped.

The attached patch seems to fix the problems mentioned above. Derbyall
runs fine, and top reports that the memory usage does not increase
when running a loop where the driver is loaded (with
Class.forName("org...EmbeddedDriver").newInstance()) and unloaded
(with DriverManager.getConnection("jdbc:derby:;shutdown=true")). That
is, the memory usage of course increased during the first iteration,
but not during the approximately 8.5 million next ones.

I will submit tests later.

Daniel John Debrunner added a comment - 19/Sep/05 11:46 PM
Patch looks good to me, the explanation was clear and very useful.

Satheesh Bandaram added a comment - 20/Sep/05 03:05 AM
Good fix. Wonder if adding the explanations in the code, where possible, would be more useful than leaving them in Jira entry.

Bernt expressed interest in commiting the change, so I am not commiting this to SVN.

Knut Anders Hatlen added a comment - 20/Sep/05 03:39 PM
Attached patch with comments where appropriate.

Knut Anders Hatlen added a comment - 21/Sep/05 04:34 PM
Fixed in revision 290644.

Knut Anders Hatlen added a comment - 06/Oct/05 04:22 PM
As I have mentioned on derby-dev earlier, there is a problem with the
current fix. After the fix was committed, store/encryptDatabase.sql
sometimes failed because one or more rawStoreDaemon threads printed a
NullPointerException to System.err.

For an example, see:
http://www.multinet.no/~solberg/public/Apache/TinderBox_Derby/testlog/SunOS-5.10_i86pc-i386/290646-encryptionCFB_diff.txt

The reason for this is that the fix nulls out the reference from the
ContextService to the list of ContextManagers when the ContextService
is stopped. Some threads with references to the ContextService might
still be alive when the ContextService is stopped, and they get
NullPointerExceptions when accessing the stopped ContextService.

I have only been able to reproduce this behaviour with the
store/encryptDatabase.sql test on multi-cpu machines running with a
heavy background load. It seems like the threads that get
NullPointerExceptions are orphan threads (like the ones described in
DERBY-594) that have not yet been able to start because of the heavy
background load. The problem might go away when DERBY-594 is resolved,
but I think it needs further investigation.

Knut Anders Hatlen added a comment - 20/Jan/06 01:27 AM
I have attached a patch (DERBY-23-npe.diff) that addresses the
NullPointerException messages from rawStoreDaemon threads we have seen
in derbyall on some occasions.

The NullPointerExceptions are thrown because
ContextService.threadContextList is set to null in
ContextService.stop(), but some of the daemon threads might still try
to access the variable. The patch adds checks for threadContextList
being null, and returns null or false in those cases instead of
failing with a NullPointerException.

Five tests in derbyall failed when this patch was applied, but that
seems to be what one should expect these days. All of the failures are
also seen in the daily regression tests.

Feel free to review and comment on the patch. Thanks.

Knut Anders Hatlen added a comment - 22/Aug/06 01:02 PM
I'm marking this issue as resolved.

The reason why I haven't resolved it before, is that I still see OutOfMemoryError on some platforms. These errors are caused by the background threads running with a lower priority than the the main thread, and therefore many of the background threads don't get enough CPU time to shut down. If the main thread takes a break now and then, I don't see OutOfMemoryError on any platform. That is,

    while (true) {
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver").newInstance();
        DriverManager.getConnection("jdbc:derby:;shutdown=true");
        Thread.sleep(1);
    }

runs reliably with no memory leak on all platforms I have tested. I don't want to add a regression test for this issue, since I fear it will be timing dependent and might cause noise in the nightly testing.

I think the issue can be closed because the remaining problem is unlikely to be seen in a real-world application (who will spend all their CPU time on loading and unloading the driver?). If it turns out to be a problem, I see these possible solutions:

  1) On shutdown, wait for all background threads to terminate.
  2) On shutdown, raise the priority of the background threads.

If someone feels this should be addressed, please open a new issue.

Andrew McIntyre added a comment - 13/Dec/07 09:04 AM
This issue has been resolved for over a year with no further movement. Closing.