|
[
Permlink
| « Hide
]
Marcos Sanz added a comment - 05/Apr/07 03:28 PM
Execution of the programm does NOT produce any exception, error or relevant output.
Can you please provide a little more information on what exactly you are seeing - how exactly you can tell that the connections are being physically closed?
I ran the DBCPTester code against postgres and Sybase SQL Anywhere (developer edition, with the JConnect driver above) and did not see connection leakage or churn. In each case, I examined connection logs and there did not appear to be connect-disconnects happening. Postgres did not churn processes and the logs showed just 15 connects, and disconnects only when the program was terminated. Same for Sybase. With Sybase, I did notice a lot of executions of sp_dba. This is because the pool periodically checks isClosed(). The JConnect docs (http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.dc39001_0605/html/prjdbc/BHCIAIBF.htm Are you running Commons Pool 1.3? With dbcp 1.2.2, you should be. Could you pls also test using pool 1.2 with dbcp 1.2.1? I don't see any difference in my tests using the previous versions and the changes from pool 1.2 / dbcp 1.2.1 to pool 1.3 / dbcp 1.2.2 should not affect this; but it would be good to know if the behavior you are seeing is new. Thanks! Yes, the test has been made with Commons Pool 1.3
It is possible to determine that connections are being physically closed/reopened in that the process IDs and the login dates at the master..sysprocesses table in the database server are continuously changing for this application's connections. The number of connections is not stable = 15 either, it varies between 12-15. I'll rerun the tests on Tuesday with older versions of pool/dbcp. I'll try to capture some TDS traffic with RIBO, as well. You wrote "I examined the connection logs". Is there some debug-switch I can turn on and that would help you debugging? Thanks. Also thanks in advance for running against pool 1.2/dbcp 1.2.1 to verify same behavior.
The log I examined for Sybase was the server log with the (SQL Anywhere 10) server started with the -zr option set to "SQL" or "ALL". That appears to show connects / disconnects, as well as sp execution. I don't know much about Sybase, but from the docs, it looks like if you can snoop the TDS traffic, you see much more than I can from the logs. All my tests were on Linux (Fedora Core 4) Sun JDK 1.5.0_10, JConnect 6.0.5. I also ran against postgres, which spawns an OS process for each connection, and verified from its logs as well as from ps output that there was no process churn (PIDs did not change). After running against instrumented dbcp and pool jars and different db engines, I still can't replicate this problem. Both client and server logs are showing only the expected number of opens and closes on physical connections. I suspect that the connections must either being closed on the server side or somehow JConnect's isClosed impl is returning true when the connection is not closed (though it is working for me against SQL Anywhere). One more thing to try is setting IS_CLOSED_TEST=INTERNAL.
The issue can be reproduced with pool 1.2/dbcp 1.2.1/collections 3.2. It can also be reproduced with another driver different of jConnect.
I've isolated the source of the problem. It is the debugging line System.out.println(connectionPool.getMaxActive()); which I had in my original code (see new attachment). It looks as if the synchronized in GenericObjectPool.getMaxActive() method would be originating the physical opens/closes, but I still don't know why. New file with additional debugging that actually causes the bug.
Indeed, the problem is triggered by the synchronized methods in GenericObjectPool, like getNumActive() and getNumIdle(), often called by applications. These methods serialize all executing threads to a single code point, thus the number of unused objects in the pool increases at some moments and eventually reaches the maxIdle value, which triggers the destroyObject(), causing physical closes() of database connections.
This is very cumbersome, and nearly a paradox: there are indeed one thousand threads in the example concurring for fifteen database connections, yet the pool decides the connections are idle. One solution to this problem can be the application always setting maxIdle := maxActive, though this has side effects. I am wondering whether it could be possible to make these private fields (_maxActive, _numActive, ...) volatile and get rid of the synchronized in the methods. I have to explore this. In the meantime, I think this issue belongs to Commons-Pool, rather than to Commons-DBCP. Thanks. By adding the System.out, I can reproduce the symptoms. You are correct that it is the pool's returnObject method that is triggering closing of the excess idle physical connections. The default maxIdle for the pool is 8. I think that what is causing the threads to "stack up", however, is not contention for locks on the pool, but contention on the relatively slow and synchronized System.out.println(). When I replace System.out.println(connectionPool.getMaxActive()) with just connectionPool.getMaxActive(), the idle connection count to does not grow. On the other hand, replacing it with System.out.println("hello") leads to the idle object build up.
Not true. Even if I drop the System.out call and maintain the getMaxActive(), the situation persists. I still can observe the physical closes, you just have to wait a bit longer (in the order of minutes for my system, but YMMV) for the situation to appear. And if you wait a bit longer, exceptions start occurring (see attachment).
The concurrency control concept and, as a result, the liveness of the GenericObjectPool is catastrophical. And I am not talking only about the synchronized counter accesses. How comes that expensive (database) operations like makeObject() are locking the instance of the pool when this is not needed at all? Thanks, Marcos. Really appreciate your feedback and help chasing this down. I should have thought a little more about what is going on here before responding. What I now think is happening is that either as a result of contention for locks on the pool (via getMaxActive) or buffer flushing in System.out, the active threads are periodically closing connections faster than they are (collectively) opening them. This makes the number of idle connections periodically exceed maxIdle, they get closed and then load increases again, so new ones get created. This could happen in production for lots of different reasons.
I am leaving this open as a dbcp bug, since until pool is improved somehow to be more intelligent about destroying idle objects, dbcp is prone to this behavior. The workaround in situations where load and open/close rates can flucutate is to set maxIdle = maxActive, or at least to avoid maxIdle << maxActive. The documentation of both pool and dbcp should be improved to call out the fact that the default value of maxIdle is 8 and what this setting really means. I am also marking this as related to Yes, I also have the feeling that
Regarding concurrency control of GenericObjectPool, I have submitted a patch (see issue
It was not worth tweaking GenericObjectPool.borrowObject() any further because at the moment the next bottleneck of the test program is DBCP's method PoolableConnectionFactory.makeObject() (which is called from within borrowObject()): When the pool reaches a certain size and needs to be increased, all threads get stack at the factory entrance. Is there any compelling reason for that method to be synchronized? PoolableConnectionFactory.makeObject() has been synchronized since the initial commit. Synchronization of validateObject was removed in r132110 in response to BZ 25096. Can anyone else see a reason that this synchronization is necessary?
Not really fixed. As Phil wrote: "until pool is improved somehow to be more intelligent about destroying idle objects, dbcp is prone to this behavior".
I'd like to keep this open so as not to forget this. The question remains: Has PoolableConnectionFactory.makeObject() necessarily be synchronized? I can't see a good reason for the sync on makeObject() so I removed it. It does seem to do a little to alleviate the problem but the issue is still significant.
I have to add the sync back in - without it, driver registration was broken. Once I get to the bottom of why, I'll take the sync back out if I can.
I'm changing the status of this to enhancement as the current code is working as designed, even if it could be smarter.
If GenericObjectPool.addObjectToPool took account of the number of waiting threads then this issue would be easy to solve. That number isn't currently available but if The numerous sync changes made in POOL 1.5.x should eliminate the remaining issues reported here. In particular:
DBCP 1.3 will include POOL 1.5.1. In the mean time, you should still see most of the benefit by switching to POOL 1.5.1 or later |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||