Derby / DERBY-1704

Allow more concurrency in the lock manager

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.2.1.6
    • Fix Version/s: 10.3.1.4
    • Component/s: Services
    • Labels:
      None
    • Urgency:
      Normal
    • Bug behavior facts:
      Performance

      Description

      I have seen indications of severe monitor contention in SinglePool
      (the current lock manager) when multiple threads access a Derby
      database concurrently. When a thread wants to lock an object, it needs
      to obtain the monitor for both SinglePool and LockSet (both of them
      are global synchronization points). This leads to poor scalability.

      We should investigate how to allow more concurrency in the lock
      manager, and either extend SinglePool or implement a new manager.
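
      As a rough illustration of the bottleneck (names and structure are
      simplified; this is not the actual Derby code), every lock request
      funnels through the same two monitors, so threads serialize even
      when they lock unrelated objects:

          // Minimal sketch of the two global synchronization points.
          public class SinglePoolSketch {
              // global point #1: an internally synchronized table of lock spaces
              private final java.util.Hashtable lockSpaces = new java.util.Hashtable();
              // global point #2: the single LockSet monitor all requests go through
              private final Object lockSetMonitor = new Object();

              public void lockObject(Object compatibilitySpace, Object ref) {
                  Object space = lockSpaces.get(compatibilitySpace); // contends on the table's monitor
                  synchronized (lockSetMonitor) {                    // contends on the LockSet monitor
                      // ... find or create the lock control for ref, grant or wait ...
                  }
              }
          }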

      1. 1cpu.png
        3 kB
        Knut Anders Hatlen
      2. 2cpu.png
        4 kB
        Knut Anders Hatlen
      3. 8cpu.png
        3 kB
        Knut Anders Hatlen
      4. split-hashtables.diff
        12 kB
        Knut Anders Hatlen
      5. split-hashtables.stat
        0.3 kB
        Knut Anders Hatlen
      6. cleanup1.diff
        7 kB
        Knut Anders Hatlen
      7. cleanup1.stat
        0.3 kB
        Knut Anders Hatlen
      8. cleanup1.v2.diff
        8 kB
        Knut Anders Hatlen
      9. cleanup2.diff
        2 kB
        Knut Anders Hatlen
      10. cleanup3.diff
        6 kB
        Knut Anders Hatlen
      11. cleanup3.stat
        0.2 kB
        Knut Anders Hatlen
      12. cleanup4.diff
        0.6 kB
        Knut Anders Hatlen
      13. cleanup5.diff
        5 kB
        Knut Anders Hatlen
      14. cleanup5.stat
        0.2 kB
        Knut Anders Hatlen
      15. cleanup6.diff
        2 kB
        Knut Anders Hatlen

        Activity

        Knut Anders Hatlen added a comment -

        Just to see how the performance would be affected if the global
        synchronization points were eliminated, I split the hash tables in
        SinglePool and LockSet into 16 partitions (that is, 16 hash tables),
        and used the hash key to decide which partition an object should be
        placed in. There was no global synchronization, only synchronization
        on the partition.
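
        A minimal sketch of the partitioning idea, assuming a power-of-two
        partition count and illustrative names (not the actual patch):

            import java.util.HashMap;

            // 16 independent hash tables, each guarded by its own monitor,
            // selected by the lock key's hash code. There is no global
            // monitor, so threads locking different objects rarely contend.
            class PartitionedLockTable {
                private static final int PARTITIONS = 16; // must be a power of two
                private final HashMap[] partitions = new HashMap[PARTITIONS];

                PartitionedLockTable() {
                    for (int i = 0; i < PARTITIONS; i++) {
                        partitions[i] = new HashMap();
                    }
                }

                private HashMap partitionFor(Object key) {
                    // the mask keeps the low bits, so the index is always in range
                    return partitions[key.hashCode() & (PARTITIONS - 1)];
                }

                Object get(Object key) {
                    HashMap p = partitionFor(key);
                    synchronized (p) { // blocks only this partition
                        return p.get(key);
                    }
                }

                Object put(Object key, Object value) {
                    HashMap p = partitionFor(key);
                    synchronized (p) {
                        return p.put(key, value);
                    }
                }
            }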

        I have attached graphs for some performance tests with single-record
        selects (the entire database was in the page cache). The graphs show
        the results on a machine with a single CPU (1cpu.png), one with two
        CPUs (2cpu.png) and one with eight CPUs (8cpu.png). They indicate that
        there is no significant effect (neither positive nor negative) on the
        single-CPU machine, but with multiple CPUs (and multiple threads) the
        performance increases when the global synchronization points have been
        eliminated.

        (Don't pay too much attention to the actual TPS numbers, since the
        three machines didn't have identical processors.)

        Knut Anders Hatlen added a comment -

        Attaching the patch used to partition the hash tables. It is only
        attached for reference, not for inclusion. I have not updated or added
        comments, and I have not tried very hard to make the names of
        variables and methods meaningful. If someone wants to implement a new
        lock manager based on the ideas in this patch, please feel free to do
        so. FWIW, derbyall ran cleanly with the patch.

        Anders Morken added a comment -

        While I don't know exactly what indications you've seen of the
        lock contention in the lock manager, we've seen exactly the same
        thing in the investigations for our "SMP scalability in Derby"
        project. (We == me and Per Ottar Ribe Pahr.)

        Anyway, we used DTrace with a JDK 1.6 beta release to measure
        monitor contention in a simple embedded Derby scenario consisting
        of 8 threads doing simple "SELECT A FROM FOO WHERE A=?; COMMIT;"
        loops. FOO is a table of about 100,000 rows, where A is an integer
        and the primary key, plus a CHAR(96) FOR BIT DATA filler column.
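
        For reference, a minimal sketch of what one such benchmark thread
        might look like (the connection URL, row count and driver setup
        are assumptions, not taken from the comment):

            import java.sql.Connection;
            import java.sql.DriverManager;
            import java.sql.PreparedStatement;
            import java.sql.ResultSet;
            import java.util.Random;

            // One benchmark thread; the test described above ran 8 in parallel.
            public class SelectLoop implements Runnable {
                public void run() {
                    try {
                        Connection c = DriverManager.getConnection("jdbc:derby:testdb");
                        c.setAutoCommit(false);
                        PreparedStatement ps =
                                c.prepareStatement("SELECT A FROM FOO WHERE A = ?");
                        Random random = new Random();
                        while (!Thread.currentThread().isInterrupted()) {
                            ps.setInt(1, random.nextInt(100000)); // one of ~100,000 rows
                            ResultSet rs = ps.executeQuery();
                            while (rs.next()) { /* consume the single row */ }
                            rs.close();
                            c.commit();
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }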

        The DTrace script measured three things: the classes of objects
        whose monitors were most contended, in terms of both the number of
        contentions and the time threads spent waiting for the monitors;
        in addition, it took a snapshot of the call stack at each point of
        contention to figure out which method the contention occurred in.

        Here are our measurements; we hope they're useful, if only
        to confirm that you're onto something. =)

        PS: You want a fixed-width font to read this. Point your web
        browser at http://folk.ntnu.no/andersmo/monitor_contention.txt
        to get a plain old text file. =)

        TOP BLOCKING OBJECTS:

        Monitor class Contention count
        org/apache/derby/impl/services/locks/ActiveLock 2
        java/lang/ref/ReferenceQueue$Lock 2
        org/apache/derby/impl/services/cache/CachedItem 2
        org/apache/derby/impl/services/reflect/ReflectLoaderJava2 3
        java/io/PrintStream 5
        org/apache/derby/impl/store/raw/data/StoredPage 10
        sun/misc/Launcher$AppClassLoader 10
        org/apache/derby/impl/sql/GenericPreparedStatement 10
        org/apache/derby/impl/store/raw/xact/XactFactory 11
        org/apache/derby/impl/store/raw/xact/TransactionTable 29
        java/util/Hashtable 34
        org/apache/derby/impl/store/raw/data/AllocationCache 36
        org/apache/derby/impl/services/cache/Clock 257
        org/apache/derby/impl/services/locks/SinglePool 577
        org/apache/derby/impl/services/locks/LockSet 7851

        TOP BLOCKING OBJECTS BY WAIT TIME:

        Monitor class Wait time (ms)
        java/lang/ref/ReferenceQueue$Lock 2
        org/apache/derby/impl/services/locks/ActiveLock 2
        org/apache/derby/impl/services/cache/CachedItem 6
        org/apache/derby/impl/store/raw/data/StoredPage 11
        org/apache/derby/impl/services/reflect/ReflectLoaderJava2 13
        org/apache/derby/impl/store/raw/data/AllocationCache 110
        org/apache/derby/impl/sql/GenericPreparedStatement 120
        org/apache/derby/impl/store/raw/xact/TransactionTable 129
        java/util/Hashtable 198
        org/apache/derby/impl/store/raw/xact/XactFactory 350
        sun/misc/Launcher$AppClassLoader 660
        java/io/PrintStream 709
        org/apache/derby/impl/services/cache/Clock 1380
        org/apache/derby/impl/services/locks/SinglePool 3137
        org/apache/derby/impl/services/locks/LockSet 16194

        TOP BLOCKING METHOD SIGNATURES:

        org/apache/derby/impl/services/locks/LockSet.lockObject(Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;ILorg/apache/derby/iapi/services/locks/Latch;)Lorg/apache/derby/impl/services/locks/Lock;*
        55

        org/apache/derby/impl/services/cache/Clock.find(Ljava/lang/Object;)Lorg/apache/derby/iapi/services/cache/Cacheable;*
        56

        org/apache/derby/impl/services/locks/LockSet.lockObject(Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;ILorg/apache/derby/iapi/services/locks/Latch;)Lorg/apache/derby/impl/services/locks/Lock;*
        56

        org/apache/derby/impl/services/cache/Clock.release(Lorg/apache/derby/iapi/services/cache/Cacheable;)V*
        65

        org/apache/derby/impl/services/locks/SinglePool.lockAnObject(Ljava/lang/Object;Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;ILorg/apache/derby/iapi/services/locks/Latch;)Lorg/apache/derby/impl/services/locks/Lock;*
        79

        java/util/Hashtable.get(Ljava/lang/Object;)Ljava/lang/Object;*
        84

        org/apache/derby/impl/services/locks/SinglePool.unlock(Ljava/lang/Object;Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;)I
        85

        org/apache/derby/impl/services/locks/SinglePool.lockObject(Ljava/lang/Object;Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;I)Z
        103

        org/apache/derby/impl/services/locks/LockSpace.unlockGroup(Lorg/apache/derby/impl/services/locks/LockSet;Ljava/lang/Object;)V
        110

        org/apache/derby/impl/services/locks/LockSet.unlock(Lorg/apache/derby/iapi/services/locks/Latch;I)V*
        153

        java/util/Hashtable.get(Ljava/lang/Object;)Ljava/lang/Object;*
        221

        org/apache/derby/impl/services/locks/SinglePool.lockAnObject(Ljava/lang/Object;Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;ILorg/apache/derby/iapi/services/locks/Latch;)Lorg/apache/derby/impl/services/locks/Lock;
        260

        org/apache/derby/impl/services/locks/SinglePool.unlatch(Lorg/apache/derby/iapi/services/locks/Latch;)V
        264

        org/apache/derby/impl/services/locks/SinglePool.unlock(Ljava/lang/Object;Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;)I
        330

        org/apache/derby/impl/services/locks/SinglePool.latchObject(Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;I)Z
        459

        org/apache/derby/impl/services/locks/LockSet.lockObject(Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;ILorg/apache/derby/iapi/services/locks/Latch;)Lorg/apache/derby/impl/services/locks/Lock;*
        559

        org/apache/derby/impl/services/locks/LockSet.unlock(Lorg/apache/derby/iapi/services/locks/Latch;I)V*
        1714

        org/apache/derby/impl/services/locks/LockSet.lockObject(Ljava/lang/Object;Lorg/apache/derby/iapi/services/locks/Lockable;Ljava/lang/Object;ILorg/apache/derby/iapi/services/locks/Latch;)Lorg/apache/derby/impl/services/locks/Lock;*
        2840

        Anders Morken added a comment -

        Oh, I forgot to mention that the tests were run on an SMP machine - an old Sun Enterprise 450 with 4x400MHz CPUs (and a bit error on bit 12 of memory module 1701 - yay ECC). The Derby release used was 10.1.3.1 (417277), the official release from the web page. If time permits, we'll see about testing Knut Anders' patch with DTrace as well.

        Knut Anders Hatlen added a comment -

        I am planning to partition the hash table in LockSet along the lines
        of the split-hashtables.diff patch.

        For the other hash table on which there is monitor contention (the one
        in SinglePool), I think it is better to remove the hash table
        entirely. That hash table maps a compatibility space to a
        LockSpace. Since there is a one-to-one mapping between compatibility
        spaces and LockSpaces, we could change it so that each compatibility
        space is a LockSpace object and remove the need for the hash
        table. This would (a) remove the contention problem, and (b) save some
        CPU cycles because we don't have to go through the hash table.
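
        A minimal sketch of the proposed change, with illustrative names:
        since the mapping is one-to-one, the per-request table lookup (and
        its monitor) is replaced by a plain cast:

            import java.util.Hashtable;

            class LockSpaceSketch { /* per-transaction lock groups live here */ }

            class CompatibilitySpaceSketch {
                // Before: a shared, internally synchronized table is consulted
                // on every lock request -- a global contention point.
                private final Hashtable spaces = new Hashtable();

                LockSpaceSketch lookupBefore(Object compatibilitySpace) {
                    return (LockSpaceSketch) spaces.get(compatibilitySpace);
                }

                // After: the compatibility space object itself is the LockSpace,
                // so the lookup reduces to a cast and no shared monitor is touched.
                LockSpaceSketch lookupAfter(Object compatibilitySpace) {
                    return (LockSpaceSketch) compatibilitySpace;
                }
            }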

        If this sounds OK, I will create two sub-tasks: one for partitioning
        the table in LockSet, and one for removing the table in SinglePool.

        Daniel John Debrunner added a comment -

        I haven't had time to look at the patch but the original intention was for there to be a multi-thread lock manager
        using multiple LockSets, e.g. a MultiPool implementation instead of a SinglePool implementation. Just FYI.

        With multiple synchronization points, how is deadlock detection to be handled?

        Knut Anders Hatlen added a comment -

        Thanks. Do you think it would be better to leave SinglePool as it is
        and create a new (perhaps optional) MultiPool implementation instead?

        I haven't looked very closely at the deadlock detection code yet,
        but I hope it won't require too many changes. One possibility is
        to obtain the synchronization locks on all partitions before the
        waiters graph is built. Of course, some precautions are needed in
        order to avoid Java-level deadlocks when multiple synchronization
        locks may be obtained in different orders (a sketch of this
        follows below).


        Knut Anders
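
        A minimal sketch of that possibility, assuming an array of
        partition monitors: acquiring every partition's monitor in a
        fixed (ascending) order before building the graph guarantees that
        two concurrent deadlock searches cannot deadlock each other at
        the JVM level:

            // Illustrative code, not the actual Deadlock implementation.
            class AllPartitionsLock {
                private final Object[] partitionMonitors;

                AllPartitionsLock(Object[] partitionMonitors) {
                    this.partitionMonitors = partitionMonitors;
                }

                void withAllPartitionsLocked(Runnable buildWaitersGraph) {
                    lockFrom(0, buildWaitersGraph);
                }

                // recursively nest synchronized blocks, always in ascending order
                private void lockFrom(int i, Runnable action) {
                    if (i == partitionMonitors.length) {
                        action.run(); // all monitors held: safe to snapshot the graph
                        return;
                    }
                    synchronized (partitionMonitors[i]) {
                        lockFrom(i + 1, action);
                    }
                }
            }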

        Knut Anders Hatlen added a comment -

        In the patch I used when running the tests, monitor contention on LockSet was reduced by having multiple hash tables in each LockSet. I think it might be cleaner to have multiple LockSets, where each LockSet contains one hash table. Will look more into it and report back. In the meantime, I plan to submit some cleanup patches that hopefully will make the final patch smaller and easier to understand regardless of which approach is chosen.

        Knut Anders Hatlen added a comment -

        Attached cleanup1.diff.

        I observed that LockSet extends Hashtable, but all calls to Hashtable's methods are already synchronized on the LockSet, so it could just as well use a HashMap. The attached patch makes LockSet contain a HashMap instead of extending Hashtable. I ran some tests with the DERBY-1961 test client (single-record select on a dual-cpu machine, Solaris 10, Java SE 6) and saw ~2% throughput improvement for a single client and from 15% to 25% improvement for multiple (2-64) clients.

        Derbyall and the JUnit tests passed on Solaris 10/Java SE 6. Reviews would be appreciated.
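
        A minimal sketch of the refactoring pattern (illustrative names;
        not the actual patch): the class stops paying for Hashtable's
        redundant per-call synchronization and instead holds a plain
        HashMap guarded by its own monitor:

            import java.util.HashMap;

            final class LockTableSketch {
                // deliberately unsynchronized: callers are already serialized
                // by this object's monitor
                private final HashMap entries = new HashMap();

                synchronized Object find(Object key) {
                    return entries.get(key); // already inside this object's monitor
                }

                synchronized Object add(Object key, Object value) {
                    return entries.put(key, value);
                }
            }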

        Dyre Tjeldvoll added a comment -

        cleanup1.diff looks good, but I have a couple of questions:

        LockSet was declared "public final", but is now just "final". Is this intentional?

        In general, I have a hard time understanding the multithreading-related comments used in Derby (not specific to this patch). For example, the comment for LockSet says:
        "MT - Mutable - Container Object : Thread Safe". I really don't understand what that means. Is it still true when LockSet no longer inherits from a synchronized container?

        I think what is meant here is that only those methods declared "synchronized" are MT-safe. A client wanting to use any of the other methods must synchronize on the LockSet object. Correct?

        The method addWaiters(Dictionary) is not synchronized, and the comment says "MT - must be synchronized on this LockSet object". But in Deadlock.getWaiters(LockSet) it is called without synchronization. Presumably this is OK since the old code also iterated, using an enumeration, over an unlocked LockSet (unless the lock is taken higher up the stack and cannot be seen in the diff), but a comment would have been nice.

        Knut Anders Hatlen added a comment -

        Thank you for looking at the patch! I'll try to answer your questions below.

        1) The change of LockSet from "public final" to "final" was intentional. LockSet is an internal implementation class that is not supposed to be accessed from other packages.

        2) I'm not sure what the exact meaning of the MT comment is. I would assume that it meant something like "calls to public (or non-private) methods of this class can be regarded as atomic operations". I'm not sure that this statement is completely true before the patch, but it is definitely not true after the patch, so the comment should be updated.

        3) getWaiters() is a private helper method for Deadlock.look() which is only invoked from inside a synchronized block in LockSet, so getWaiters() is in fact always synchronized on the LockSet. This would probably have been clearer if Deadlock.look() had one of those infamous MT comments. I'll add one.

        Knut Anders Hatlen added a comment -

        I have attached v2 of cleanup1. I went through LockSet's methods to determine which of them were not thread safe without extra synchronization and updated their javadoc comments. Changes from the first version:

        • MT comment for LockSet class changed to "MT - Mutable - Container Object : All non-private methods of this class are thread safe unless otherwise stated by their javadoc comments."
        • Added javadoc comments to LockSet.oneMoreWaiter() and LockSet.oneLessWaiter() with information about required synchronization.
        • Removed what appears to be a false comment in LockSet.anyoneBlocked() ("no synchronization needed because reads of ints are atomic"). It is true that reads of ints are atomic, but synchronization is still needed to ensure that the current value is read rather than a stale cached one (see the sketch after this list).
        • Declared LockSet.anyoneBlocked() as synchronized to avoid the problem mentioned above. Currently I don't think it is an actual problem, since all calls to anyoneBlocked() come (indirectly) from the unit tests T_AccessFactory and T_LockFactory, but it is good to fix it anyway.
        • Added javadoc and MT comment to Deadlock.look().
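
        A minimal sketch of the visibility issue being fixed here
        (illustrative names): synchronized on the read is needed not for
        atomicity, but to establish a happens-before edge with the
        writers so a stale value is never observed:

            final class WaiterCount {
                private int blockedCount;

                synchronized void oneMoreWaiter() { blockedCount++; }
                synchronized void oneLessWaiter() { blockedCount--; }

                // synchronized not because the int read could tear (it cannot),
                // but so the reader sees the latest written value rather than
                // a cached copy.
                synchronized boolean anyoneBlocked() { return blockedCount != 0; }
            }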

        I have started a new run of derbyall.

        Dyre Tjeldvoll added a comment -

        New version looks good. +1

        Knut Anders Hatlen added a comment -

        Thanks for the review! Committed cleanup1.v2 with revision 498999.

        Knut Anders Hatlen added a comment -

        Committed cleanup2.diff with revision 499316. The patch removes an unused variable, an unused import and an unused method.

        Knut Anders Hatlen added a comment -

        LockSpace extends Hashtable, but no other classes use its Hashtable methods (except one call to isEmpty()), so the Hashtable could just as well be a private member. Since all the non-private methods of LockSpace are synchronized, the Hashtable could be replaced with an unsynchronized HashMap. cleanup3.diff makes these changes. Derbyall and JUnit passed.

        Knut Anders Hatlen added a comment -

        The results shown in the attached graphs are not valid for trunk after the commit of the DERBY-2107 patch. The single-record select load that was used is more latch-intensive than lock-intensive, and when the latches no longer use the lock manager, the effect of splitting the hash tables in the lock manager is small, even with multiple CPUs. There still is a positive effect for more lock-intensive loads though (for instance scans).

        Since there are two global synchronization points in the lock manager, I'll file subtasks so they can be discussed separately.

        Anders Morken added a comment -

        In slightly related work, we've recently done a little testing with a single-record select load on a trunk patched with DERBY-2107 as well as a port of the patches to split the hash tables in the lock subsystem that you included here.

        With a separate table and index for each thread (to remove latch contention and lock waits from the equation) we got an overall throughput increase of about 16% compared to a "single table for all threads" run (given a cache large enough to maintain the database in-memory), and found that org.apache.derby.impl.services.cache.Clock.find()/release() caused about 5 times more contention than the synchronization in LockSet.lockObject() and LockSet.unlock(). That might be an indicator of where to apply the next push - and validates the "split hashtable" approach for this workload. =)

        For our part, we're proceeding to hack at latching. =)

        Knut Anders Hatlen added a comment -

        Committed cleanup3.diff with revision 507428.

        Knut Anders Hatlen added a comment -

        Attaching a tiny cleanup patch (cleanup4). LockControl.popFrontWaiter() is identical to removeWaiter(), only that it calls remove() with the constant 0 instead of using a parameter. The patch makes popFrontWaiter() forward calls to removeWaiter(). The tests passed.
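
        A minimal sketch of the delegation, with illustrative types (the
        real methods operate on Derby's waiter list):

            import java.util.ArrayList;
            import java.util.List;

            final class WaiterQueueSketch {
                private final List waiting = new ArrayList();

                Object removeWaiter(int index) {
                    return waiting.remove(index);
                }

                Object popFrontWaiter() {
                    return removeWaiter(0); // forward instead of duplicating the logic
                }
            }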

        Knut Anders Hatlen added a comment -

        Committed cleanup4.diff with revision 517588.

        Knut Anders Hatlen added a comment -

        cleanup5.diff contains these changes:

        ActiveLock.java: removed some unused imports
        Lock.java: made the class package private
        LockControl.java:

        • made the class final and package private
        • addLock() had an if that tested (!grantLock && ...) at a place where grantLock would always be false. Removed the !grantLock part of the condition.
        • addWaiter(), removeWaiter() and popFrontWaiter() took a List argument called waiting. Since the argument was always identical to a private member with the same name, I removed "List waiting" from the parameter list. (Both simplifications are sketched below.)

        Derbyall and the JUnit tests ran successfully.
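
        A minimal sketch of the two simplifications, using illustrative
        code rather than the actual LockControl source:

            import java.util.ArrayList;
            import java.util.List;

            final class Cleanup5Sketch {
                private final List waiting = new ArrayList();

                // 1) Dead condition dropped: where grantLock is provably always
                //    false, "if (!grantLock && other)" collapses to "if (other)".
                boolean shouldWait(boolean other) {
                    return other;
                }

                // 2) Redundant parameter removed: the List argument always
                //    aliased this field, so the method now uses the field directly.
                void addWaiter(Object lock) {
                    waiting.add(lock);
                }
            }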

        Knut Anders Hatlen added a comment -

        Attaching a new small cleanup patch (cleanup6) for LockSet.java. LockSet has a private Hashtable (lockTraces) into which it puts a Throwable each time a thread must wait for a lock (only if deadlock tracing is enabled). However, the values in the hash table are never used, so the hash table and the code that updates it can be removed. Derbyall and the JUnit tests ran cleanly with the patch.

        Knut Anders Hatlen added a comment -

        Committed cleanup5 with revision 518052.
        Committed cleanup6 with revision 518073.

        Knut Anders Hatlen added a comment -

        All sub-tasks completed. Closing the issue.


  People

  • Assignee: Knut Anders Hatlen
  • Reporter: Knut Anders Hatlen
  • Votes: 0
  • Watchers: 0
