|
I agree that ConcurrentHashMap is the way to go. ConcurrentHashMap.diff is an experimental patch which replaces the HashMap in LockSet with a ConcurrentHashMap. It seems to have good effect on the 1000x10000 join load from
To make it easier to plug in another LockSet implementation, LockSet should be better encapsulated. I'm attaching a patch (derby-2327-1a) which makes these changes:
- creates a new interface (LockTable) which LockSet implements - replaces all references to LockSet with references to the new LockTable interface - moves code that explicitly synchronizes on the LockSet object from LockSpace.unlockReference() and SinglePool.zeroDurationlockObject() to LockSet so that the granularity of the synchronization can be controlled by the LockSet implementation alone - adds setter methods for deadlock timeout and wait timeout Derbyall and suites.All passed on Solaris 10, Java SE 6. Committed 1a with revision 515040.
Here are my thoughts on how to attack this issue:
With the refactoring that improved the encapsulation of LockSet, it should be possible to plug in another implementation of LockSet (say ConcurrentLockSet) with very little need for changes in other classes. The new implementation of LockSet will use a ConcurrentHashMap instead of a HashMap, and it will not synchronize on the LockSet object. With the exception of the creation of the diagnostic lock table (VTI) and the deadlock detection, all the synchronized blocks in LockSet touch only one entry in the lock table, so I think the natural synchronization granularity is a single entry (which corresponds to a single Lockable object). For the parts of the code that only touch one entry, I think the new implementation could basically be the same as the old one, only that it synchronizes on the lock entry instead of on the LockSet. Since they only lock one entry at a time, there shouldn't be any risk of a Java-level deadlock. One thing that becomes trickier with a ConcurrentHashMap, is that you don't synchronize on the map so another thread might for instance remove an entry from the map after you have fetched it and before you have locked/synchronized on it. Therefore, the fetching from the map must be performed in a loop where you fetch, lock and finally check that no one has removed it. In the current LockSet implementation, the creation of the virtual lock table synchronizes on the LockSet. In order to guarantee that a consistent snapshot of the lock table is made, some sort of global synchronization is needed, but that would bring us back to square one with regards to concurrency. However, the javadoc for the LockTable VTI says that consistency is not a requirement: > The LockTable virtual table takes a snap shot of the lock table > while the system is in flux, so it is possible that some locks may > be in transition state while the snap shot is taken. We choose to do > this rather then impose extranous timing restrictions so that the > use of this tool will not alter the normal timing and flow of > execution in the application. Therefore, I think it will be OK to just get an iterator from the ConcurrentHashMap (whose javadoc says that the iterator is "weakly consistent") and lock/synchronize the entries one by one. On the other hand, the deadlock detection does require a certain amount of consistency in the snapshot it's working on. To illustrate it, consider this example: 1. When the deadlock detection starts, A is waiting for B. 2. The deadlock detection sees that A waits for B. 3. B unlocks the object A is waiting for, and A can continue. 4. B then tries to lock an object which A holds and must wait. 5. The deadlock detection sees that B waits for A. The deadlock detection then believes that A waits for B which waits for A. That is, it believes we have a deadlock, when in fact A is not waiting for B anymore. I think this problem can be avoided by following a two-phase locking scheme when synchronizing on the entries of the lock table. That is, we lock the entries one by one as we see them, but don't unlock any of them until we have finished the traversal. This way, we know that all the waiters we have seen while traversing the lock table are still waiting. In the example above, it would mean that in (3) B would not be allowed to unlock the object until the deadlock detection had finished. It would therefore see that A was waiting for B, but not that B was waiting for A, so it would (correctly) not report a deadlock between A and B. This approach does not ensure that the deadlock detection has a complete and consistent wait-for graph, since ConcurrentHashMap's iterator doesn't guarantee that concurrent modifications are visible, but I think it will ensure that (a) all deadlocks present when the deadlock detection starts will be detected, and (b) no false deadlocks will be reported. Since the iterator doesn't guarantee that the ordering will be the same each time, we can only have one thread performing deadlock detection at a time in order to avoid Java-level deadlocks. Also, no thread can start deadlock detection when it is synchronized on an entry in the lock table. Are there any comments to this approach? Any false/suspicious assumptions or anything I have forgotten? Thanks for the detailed writeup. This section has me confused though:
> I think this problem can be avoided by following a two-phase locking > scheme when synchronizing on the entries of the lock table. That is, > we lock the entries one by one as we see them, but don't unlock any of > them until we have finished the traversal. What does "lock the entries" mean here, java synchronization on the object, or logical lock (e.g. row lock/table lock)? I don't see how one could code so that java synchronization is kept on many objects without deep stack nesting. The entries are already logically locked so I'm not sure how locking them again would help?. I was thinking of using java.util.concurrent.locks.ReentrantLock, which should give us the same as synchronization only with more flexibility. Then we don't need one stack frame per lock. Instead we can keep the locks in a list, like this:
LinkedList<ReentrantLock> mutexes = new LinkedList<ReentrantLock>(); try { for (Entry e : locks.values()) { mutexes.add(e.mutex); e.mutex.lock(); // ..... } } finally { for (ReentrantLock mutex : mutexes) mutex.unlock(); } Of course, this will give us a small space overhead compared to a plain synchronization approach. Attaching a new refactoring patch (2a). The patch moves all the methods from SinglePool to a new abstract class called AbstractPool. AbstractPool is now the super-class of SinglePool. The only code left in SinglePool is the creation of the LockSet instance. The purpose of the patch is to make it easier to create a new LockFactory class which is identical to SinglePool except that it uses another LockSet implementation.
To apply the patch, first issue this command: svn cp java/engine/org/apache/derby/impl/services/locks/SinglePool.java java/engine/org/apache/derby/impl/services/locks/AbstractPool.java All the tests passed on Solaris 10, Sun Java SE 6. Committed 2a with revision 517586.
derby-2327-3a.diff makes the methods in Deadlock.java take an AbstractPool instance instead a SinglePool. All tests ran cleanly.
Committed 3a with revision 517680.
Attaching a new patch (4a) which contains the implementation of a lock
manager which uses ConcurrentHashMap. Before the patch is applied, one must execute the following command: svn cp \ java/engine/org/apache/derby/impl/services/locks/LockSet.java \ java/engine/org/apache/derby/impl/services/locks/ConcurrentLockSet.java And after the patch is applied: svn add java/engine/org/apache/derby/impl/services/locks/ConcurrentPool.java -------------- Description of the patch: The patch adds two classes * ConcurrentLockSet - which is supposed to behave exactly as LockSet, only that it uses a ConcurrentHashMap instead of a HashMap. * ConcurrentPool - which is like SinglePool only that it creates a instance of ConcurrentLockSet instead of LockSet. ConcurrentLockSet is basically a copy of LockSet that uses ConcurrentLockSet and replaces all occurrences of synchronized (this) { ... } with entry.lock(); try { ... } finally { entry.unlock(); } where entry.lock()/entry.unlock() locks/unlocks a java.util.concurrent.locks.ReentrantLock associated with each LockControl. To avoid Java deadlocks, each thread is normally not allowed to lock more than one of the ReentrantLocks at a time. The exception is when a thread performs deadlock detection. To prevent threads running deadlock detection from running into deadlocks with each other, different threads cannot run deadlock detection at the same time. A thread must also release its ReentrantLocks before it starts deadlock detection, since it might have to wait if another thread is already running deadlock detection, and if it holds any ReentrantLocks, it might deadlock with the other deadlock detection thread. Although the ReentrantLock is released before deadlock detection is started, we don't want normal lock requests to be able to lock it. Therefore a flag (actually a java.util.concurrent.locks.Condition variable) is set to indicate that only threads performing deadlock detection should be able to lock it. If normal requests try to lock it, they will see the Condition variable and call await() on it, which blocks until the thread that started deadlock detection has finished and reacquired the ReentrantLock. The new lock manager only works with JDK5 and higher. I made the build files use the optional jdk16 target, with source and target level set to 1.5. Is that OK? I thought this was better than requiring yet another JDK to build Derby. I have successfully run suites.All and derbyall on Sun's Java 1.4 and Java 1.6 (1.4 uses SinglePool, 1.6 uses ConcurrentPool). Reviews would be greatly appreciated. Thanks. I'm attaching some throughput graphs to give an impression of the
effect of the 4a patch. I have used the single-record select load (primary key), 1 to 70 concurrent clients, running embedded Derby with all data in the page cache. The graphs show the throughput for trunk and the patched version of trunk with JDK5 and JDK6. I have run the tests on three different machines: one dual CPU Opteron, one machine with 8 CPUs, and one Niagara with 8 cores and 32 hardware threads. All the test machines were running Solaris 10. With the exception of JDK6 on the dual Opteron machine, all the tests show improved throughput with multiple clients. They don't show any significant reduction in throughput in the single-user case. Also attaching throughput graphs for an identical configuration, only that it uses the 1000x10000 primary key joins in the
I've been using this patch in various loads, and have not seen any problems because of it.
Note that I have not reviewed the patch, so it would be nice if anyone took a look at it. But my experience from using the patch is that it works without problems and I do see performance improvements. The increase varies quite a lot depending on the type of load. +1 for commit, with a wish for code review first :) Thanks a lot for testing the patch, Kristian! It's nice to know that it works outside my sandbox too. And I agree that it would be great if someone reviewed the code as well! :) Thanks.
Committed revision 535851.
Since a commit:
" I see these build issues. /home/djd/cruise/cruisecontrol-bin-2.6.1/projects/derby_build/trunk/java/engine/org/apache/derby/impl/services/locks/ConcurrentLockSet.java:83: generics are not supported in -source 1.4 (try -source 1.5 to enable generics) private final ConcurrentHashMap<Lockable, Entry> locks; ^ /home/djd/cruise/cruisecontrol-bin-2.6.1/projects/derby_build/trunk/java/engine/org/apache/derby/impl/services/locks/ConcurrentLockSet.java:278: for-each loops are not supported in -source 1.4 (try -source 1.5 to enable for-each loops) for (Entry e : seenByDeadlockDetection) { Dan, I'm not able to reproduce the build failure. Could you please provide more details? For instance,
- which version of ant are you using? - which build target is failing? (the file is supposed to be compiled by compile_impl_services_jdk15, which does set source level and target level to 1.5) - do you see the build failure if you comment out the jdk16 setting in ant.properties? - which flags does ant -verbose report that it invokes javac with when it fails? - from the error messages it seems like you use IBM's SDK. Do you see the same failure if jdk16 points to Sun's JDK6? IBM's SDK for Java 6 is still early access; have you upgraded to the latest snapshot? Re-closing the issue due to lack of feedback. If there still is a build problem, please open a new JIRA issue and provide more details.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Ie. use Java's existing mechanisms to solve this problem rather than re-inventing stuff?
Obviously the benefit would only be seeon on JDK 5.0 or later but I don't think that's a problem.