Attached new and correct OU_jstat-gc-t.png (previous was from a different test).
I'm just guessing here, but the histogram of objects seems similar to what I would expect from
Thank you Knut Anders, that is an interesting theory! And if it is correct, it seems like the problem has already been fixed... I started a new test run using 10.2.2.0, so we'll see in a couple of months ;)
I see from derby.log that "Failed Statement is: SELECT BIDERID, BID_PRICE, BID_TIME FROM BID WHERE ITEMID = ? ORDER BY BID_TIME DESC", and from a quick code scan I could not find any other statements that use sorting functionality, so this may be the culprit. There is about a 20% chance per iteration that the test executes this statement, so I guess this amounts to a few million ORDER BYs over the course of these 57+ days. Logs, diagnostics info, graphs and a (zipped) heap dump taken at the point where the OutOfMemoryError occurred are available at http://dbtg.thresher.com/derby/test/debug/DERBY-2176/.
Attached a graph (OU_10.2.1.6_vs_10.2.2.0.png) showing a rough comparison of memory usage (tenured space) in 10.2.1.6 and 10.2.2.0 when an ORDER BY query is executed multiple times.
I created a smaller (and faster) test case in which a similar ORDER BY query is executed repeatedly (using a PreparedStatement). The attached graph shows results from this test running 1.2 million ORDER BY queries. The test case is also attached (Derby2176repro.java). 10.2.1.6 memory usage looks very similar to what was seen during the DOTS-based long running test for 10.2.1.5, while 10.2.2.0 memory usage is completely stable and much lower. Based on these results I now feel quite confident that this issue is the same as the one reported and fixed in
Attaching 'OU_jstat-gc-t_10.2.2.0_server.png', which is a graph showing utilization of the tenured space in the heap of the server VM during a long running DOTS run against 10.2.2.0.
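For readers without the attachment, the repeated ORDER BY test described above can be approximated with a short JDBC loop. This is only a sketch and not the attached Derby2176repro.java: the class name `OrderByLoop`, the database name `reproDB`, the column types of `BID` and the row counts are all assumptions made for illustration. Only the query text comes from derby.log.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;

public class OrderByLoop {
    // Query text as reported in derby.log for this issue.
    static final String QUERY =
        "SELECT BIDERID, BID_PRICE, BID_TIME FROM BID WHERE ITEMID = ? ORDER BY BID_TIME DESC";

    // Hypothetical schema: the column types are guesses, not the DOTS schema.
    static final String CREATE_TABLE =
        "CREATE TABLE BID (ITEMID INT, BIDERID INT, BID_PRICE DOUBLE, BID_TIME TIMESTAMP)";

    // Executes the ORDER BY query 'iterations' times through one PreparedStatement
    // and returns the total number of rows fetched.
    static long run(Connection conn, int iterations) throws SQLException {
        long rows = 0;
        try (PreparedStatement ps = conn.prepareStatement(QUERY)) {
            for (int i = 0; i < iterations; i++) {
                ps.setInt(1, i % 10);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        rows++;
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        // Assumes the Derby embedded driver is on the classpath and a fresh database
        // directory (CREATE TABLE fails if BID already exists). Older Derby/JVM
        // combinations may need the driver class loaded explicitly first.
        try (Connection conn = DriverManager.getConnection("jdbc:derby:reproDB;create=true")) {
            try (Statement st = conn.createStatement()) {
                st.executeUpdate(CREATE_TABLE);
            }
            try (PreparedStatement ins =
                     conn.prepareStatement("INSERT INTO BID VALUES (?, ?, ?, ?)")) {
                for (int i = 0; i < 100; i++) {
                    ins.setInt(1, i % 10);                       // ITEMID
                    ins.setInt(2, i);                            // BIDERID
                    ins.setDouble(3, 1.0 + i);                   // BID_PRICE
                    ins.setTimestamp(4, new Timestamp(i * 1000L)); // BID_TIME
                    ins.executeUpdate();
                }
            }
            System.out.println("rows fetched: " + run(conn, iterations));
        }
    }
}
```

Running it with a large iteration count against each Derby version while watching the old generation (for example with jstat) should give a comparison similar in spirit to the attached graph, though the absolute numbers will depend on the assumed schema and data.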
The test has now been running for about 58 days (the 10.2.1.5 test ran for 57 days, 13 hours before the OOME), and is showing no signs of ever-increasing memory usage. Memory usage in this test has never been more stable. This issue was caused by
* derby.log
* OU_jstat-gc-t.png
- graph of tenured space memory usage (error occurred at about 1381 hrs (x-axis unit))
* jmap-histo_server_17Nov1020CET.txt
- textual histogram of objects in the heap approx. 6 days before the error
I plan to make more logs and diagnostics info (including a heap dump) available soon.
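The tenured-space graph listed above appears, judging by its file name, to be based on `jstat -gc -t` output (the OU column). Where attaching jstat to the server VM is not practical, a similar series can be sampled from inside a JVM with the standard `java.lang.management` API. The following is a minimal sketch, not what was used for the attached graphs. The class name `TenuredSampler` and the name matching on "Old"/"Tenured" are assumptions, since pool names differ between collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class TenuredSampler {
    // Returns the used bytes of the first heap pool whose name suggests the
    // old/tenured generation, or total heap usage if no such pool is found.
    static long tenuredUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (pool.getType() == MemoryType.HEAP
                    && (name.contains("Old") || name.contains("Tenured"))) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) {
                    return usage.getUsed();
                }
            }
        }
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        long intervalMillis = 1000;
        long start = System.currentTimeMillis();
        for (int i = 0; i < samples; i++) {
            double elapsedSec = (System.currentTimeMillis() - start) / 1000.0;
            // Two columns: seconds since start, used tenured space in KB.
            System.out.printf("%.1f\t%d%n", elapsedSec, tenuredUsedBytes() / 1024);
            Thread.sleep(intervalMillis);
        }
    }
}
```

The sampler only sees the VM it runs in, so for a Network Server setup it would have to run inside the server process to reflect server-side memory usage.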
Timing of events:
* Network Server started: 2006-09-26 14:20 CEST
* Test client started: 2006-09-26 14:23 CEST (+3 min)
* First OutOfMemoryError occurred: 2006-11-23 01:41 CET (+1381 hrs, 21 min, i.e. 57 days, 13 hrs)
* Test + server stopped: ca. 2006-12-12 15:05 CET (+1849 hrs, 28 min)