Knut Anders Hatlen added a comment - 23/Aug/07 10:26 AM
Attaching the test and a graph showing the difference in performance between shared plans and separate plans on a machine with 8 CPUs.
That is very interesting. A couple of thoughts on this.
First, the point of sharing plans is to avoid doing potentially expensive compilation. By choosing a really simple query which is cheap both to compile and execute you are effectively measuring only the cost of sharing plans. If you had even a slightly more expensive query, I doubt you would see such a huge disparity between the two cases.

That having been said, the lack of any speedup is troubling. I ran the same query to see how many times the routines you mentioned (GenericPreparedStatement#upToDate and BaseActivation#checkStatementValidity) are executed. The first one is called *five* times per query and the second one *once*. I haven't looked at the code too closely, but that does seem excessive and could be a starting point for investigating contention. Also, there are two other routines, GPS#finish and GPS#getActivation, which synchronize on the GPS and are called once per statement, so these routines add to the contention as well.

Thanks for investigating this, Manish!
I agree that the test is not representative of a real-world application, but that wasn't my aim when I wrote it. I just wanted to see if there was any basic part of the SQL execution layer that would be a bottleneck on a multi-CPU machine. VALUES 1 seemed to be a good choice since it avoids accesses to the buffer manager, which is a known multi-CPU bottleneck. I think of it more like looking at a small part of Derby through a magnifying glass or a microscope. :)

When I run the test, I only see three calls to GPS.upToDate(), one call to BA.checkStatementValidity(), and none to GPS.finish() and GPS.getActivation(). You didn't by any chance use a Statement instead of a PreparedStatement?

I'm not sure I quite understand how the interaction with upToDate() works. If upToDate() returns true, we know (because of the synchronization) that at some point after we called upToDate() and before it returned, the compiled plan was up to date. However, the synchronization doesn't guarantee that the plan is still up to date the moment after the method has returned, does it? How do we know the plan is still valid then? Is it because of this uncertainty that we keep calling upToDate() multiple times during execution?

GPS#getActivation & GPS#finish will not be called per execution (except when using a Statement).
The upToDate() check interacts with the table locking of any DDL that led to the invalidation. When a table T is modified via DDL, an exclusive lock is held on T. This lock is obtained first, and then plans dependent on that table are invalidated. Thus, if a statement has obtained an intent lock on T and it is valid (upToDate()), it can complete its execution knowing that no DDL can proceed and invalidate it, since the intent table lock it holds will block any DDL's exclusive lock.

So ideally a plan would check that it is up to date once all of its table locks are obtained, but in Derby this is not centralized. Some DBMSs set up a list of table intent locks as part of their compilation and obtain them at the start of execution. In Derby this is handled by calling checkStatementValidity() in *each* open of a ResultSet (possibly regardless of whether it obtains a table lock or not). Ideally this would be in one place, maybe after the open of the top-level (language) ResultSet, and thus executed once per plan. I'm not sure, though, whether the top-level open is guaranteed to open all the tables that the plan requires. There's room for improvement here, not least by writing up and understanding all the interactions.

I ran the Values1 test on a Sun Fire T2000 with 32 virtual processors (running
Solaris 10 and Java version 1.6.0_15) and noticed that there was a simple change in BaseActivation.checkStatementValidity() that improved the situation somewhat. As mentioned in the previous comments, there's a synchronized block in checkStatementValidity() where a lot of time is spent waiting:

    synchronized (preStmt) {
        if ((gc == preStmt.getActivationClass()) && preStmt.upToDate())
            return;
    }

If the (gc == preStmt.getActivationClass()) check is moved inside preStmt.upToDate(), which is also synchronized on preStmt, we avoid a double synchronization. This appears to take some of the pressure off the monitor and allows the Values1 test to scale better. The preStmt monitor is still very hot, though, so the performance still breaks down when too many threads are added, but the patched version can handle more threads than trunk before it breaks down.

The attached patch and graph (patch-1a.diff and patch-1a.png) show the change and its effect on the scalability. Whereas trunk maxes out at 5 threads and 305K tx/s, the patched version maxes out at 7 threads and 520K tx/s. After both trunk and the patched version have collapsed because of too many threads, the patched version seems to stabilize at a level 30% higher than trunk.

For comparison, the graph also shows the results for trunk with separate plans for each thread. Its throughput grows steadily for each thread added until the number of threads reaches the number of virtual processors (32), and it remains far better than with shared plans, so it's clear that the patch is not a full solution to this issue. It doesn't do anything about the underlying problem, which is that upToDate() is called way too frequently during execution, but it may be a good first step towards removing the overhead of shared plans.

One might perhaps expect the JVM to be able to eliminate double synchronization, so that such a change should not be necessary.
Anyhow, I think the change would make sense even without any performance benefit, as it hides some of GenericPreparedStatement's internal synchronization details from users of the PreparedStatement interface. Committed patch-1a.diff to trunk with revision 824657.
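
To make the shape of that change concrete, here is a minimal, self-contained sketch of the before and after. It is not the committed patch: Plan is a hypothetical stand-in for GenericPreparedStatement, checkBefore/checkAfter stand in for the relevant part of BaseActivation.checkStatementValidity(), and the one-argument upToDate(Object) overload is assumed for illustration only.

```java
// Hypothetical, simplified stand-ins for Derby classes; not the real code.
final class SharedPlanSketch {

    /** Stand-in for a shared compiled plan used by many threads. */
    static final class Plan {
        private Object activationClass = new Object();
        private boolean valid = true;

        // In this sketch the getter is only called single-threaded or
        // while the caller already holds the plan's monitor.
        Object getActivationClass() {
            return activationClass;
        }

        /** Validity check that takes the plan's monitor. */
        synchronized boolean upToDate() {
            return valid;
        }

        /** Assumed overload: class comparison and validity in one monitor entry. */
        synchronized boolean upToDate(Object gc) {
            return gc == activationClass && valid;
        }

        /** Simulates invalidation (e.g. by DDL) followed by recompilation. */
        synchronized void invalidate() {
            valid = false;
            activationClass = new Object();
        }
    }

    /** Before: the outer block locks the plan, then upToDate() re-enters the same monitor. */
    static boolean checkBefore(Plan plan, Object gc) {
        synchronized (plan) {
            return gc == plan.getActivationClass() && plan.upToDate();
        }
    }

    /** After: the caller no longer synchronizes; the plan does both checks itself. */
    static boolean checkAfter(Plan plan, Object gc) {
        return plan.upToDate(gc);
    }

    public static void main(String[] args) {
        Plan plan = new Plan();
        Object gc = plan.getActivationClass();
        System.out.println("before/after agree while valid: "
                + (checkBefore(plan, gc) == checkAfter(plan, gc)));
        plan.invalidate();
        System.out.println("before/after agree once invalidated: "
                + (checkBefore(plan, gc) == checkAfter(plan, gc)));
    }
}
```

Both variants return the same answer; the difference is only that the second enters the monitor once per check instead of twice, and that callers no longer need to know which object the plan synchronizes on.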