
Key: DERBY-558
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: A B
Reporter: A B
Votes: 0
Watchers: 0
Derby

Optimizer hangs with query that uses more than 6 tables and does subquery flattening.

Created: 09/Sep/05 06:27 AM   Updated: 11/Jul/06 11:51 PM
Component/s: SQL
Affects Version/s: 10.0.2.0, 10.0.2.1, 10.0.2.2, 10.1.1.0, 10.1.2.1, 10.2.1.6
Fix Version/s: 10.1.2.1, 10.2.1.6

Time Tracking:
Not Specified

File Attachments:
d558.patch (10 kB, text file, licensed for inclusion in ASF works), attached 2005-09-13 01:00 AM by A B
repro.sql (0.6 kB, text file, licensed for inclusion in ASF works), attached 2005-09-09 06:35 AM by A B
Environment: Running query in "ij" with derby.optimizer.noTimeout=true

Resolution Date: 06/Oct/05 01:55 AM


Description
I was running a query that has a large number (hundreds) of tables in it and I set the derby property "derby.optimizer.noTimeout" to true to see what plan Derby would choose as the _best_ plan for the query. When doing so, I ran into a situation where the optimizer hung forever--which is wrong. I expect that setting "noTimeout" to true might cause the query to run more slowly (since it has to evaluate ALL possible join orders for all of the tables in question), but it should _not_ cause the optimizer to hang forever.

I noticed that "subquery flattening" is performed on the query, which introduces dependencies between the various tables and thus restricts the possible join orders that the optimizer can choose (see http://db.apache.org/derby/docs/10.1/tuning/ctuntransform25868.html). I was eventually able to track the problem down to code in OptimizerImpl where, for queries with more than 6 tables, a certain "jumping" algorithm is used to try to allow the optimizer to find a better plan more quickly.

Long story short, there is logic in the "jumping" mechanism that tries to put the tables into a legal join order, but in certain (rare) cases where multiple join order dependencies have to be enforced, the jump logic can end up looping indefinitely, causing the "hang" in the optimizer.
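
To illustrate the kind of reordering involved, here is a minimal sketch of one way to turn a proposed join order into a legal one when several dependencies must hold at once. This is NOT the code in OptimizerImpl and not the fix in the attached patch; the class name, the makeLegal method, and the dependency map are all hypothetical, and tables are represented simply as integers. The point is the termination argument: every pass either places at least one table or stops with an error, so the loop cannot spin forever.

```java
import java.util.*;

// Hypothetical illustration only; none of these names exist in Derby.
public class LegalJoinOrderSketch {

    // Reorders 'proposed' so that every table appears after all of the
    // tables it depends on. 'dependsOn' maps a table number to the set of
    // table numbers that must precede it in the join order.
    static List<Integer> makeLegal(List<Integer> proposed,
                                   Map<Integer, Set<Integer>> dependsOn) {
        List<Integer> result = new ArrayList<>();
        List<Integer> remaining = new ArrayList<>(proposed);

        while (!remaining.isEmpty()) {
            boolean placedAny = false;
            Iterator<Integer> it = remaining.iterator();
            while (it.hasNext()) {
                Integer table = it.next();
                Set<Integer> deps =
                    dependsOn.getOrDefault(table, Collections.<Integer>emptySet());
                // A table can be placed once all of its prerequisites are placed.
                if (result.containsAll(deps)) {
                    result.add(table);
                    it.remove();
                    placedAny = true;
                }
            }
            // No progress in a full pass means the dependencies cannot be
            // satisfied; fail loudly instead of looping indefinitely.
            if (!placedAny) {
                throw new IllegalStateException(
                    "Unsatisfiable join order dependencies for tables " + remaining);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Set<Integer>> deps = new HashMap<>();
        deps.put(0, new HashSet<>(Arrays.asList(3))); // table 0 must follow table 3
        deps.put(1, new HashSet<>(Arrays.asList(0))); // table 1 must follow table 0
        deps.put(3, new HashSet<>(Arrays.asList(2))); // table 3 must follow table 2

        System.out.println(makeLegal(Arrays.asList(0, 1, 2, 3), deps));
    }
}
```

With the chained dependencies in main, the proposed order 0, 1, 2, 3 is rearranged to 2, 3, 0, 1. A circular set of dependencies raises an exception rather than hanging.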

A B added a comment - 09/Sep/05 06:35 AM
Attaching a simplified reproduction of the hang. Note that this particular reproduction is completely contrived and nonsensical, but it nonetheless demonstrates the problem. In order to reproduce, start ij with the "derby.optimizer.noTimeout" property set to true, connect to a database, and then run the attached sql script:

> java -Dderby.optimizer.noTimeout=true org.apache.derby.tools.ij
ij version 10.2
ij> connect 'jdbc:derby:testdb;create=true';
ij> run 'repro.sql';

Note that the hang won't reproduce if "noTimeout" is false (which is the default) because eventually the optimizer will decide that it's taking too long and will quit. That's nice because it means most people won't ever see this problem :) However, when noTimeout is set to true the query _should_ still finish (even if it takes longer), so I _do_ think this is a bug.

A B added a comment - 13/Sep/05 01:00 AM
Attaching a patch for this problem. The patch does the following:

1) Fixes the logic in OptimizerImpl.java that was causing the hang (an indirect infinite loop).
2) Adds some comments describing the "JUMPING" logic that is in OptimizerImpl so that developers looking at the code can (hopefully) figure out what's going on more quickly in the future.
3) Adds a test case to the lang/subqueryFlattening.sql test for verification of the fix. The test case is based on the repro attached to this issue. NOTE: I had to set the "derby.optimizer.noTimeout" property to true for this entire test--I think this is okay since everything still passes (on my machine), but if anyone feels otherwise, please let me know...

I ran derbyall on Windows 2000 w/ Sun jdk 1.4.2 and saw no failures. If someone could review this, I'd be grateful.

Satheesh Bandaram added a comment - 04/Oct/05 10:14 AM
Submitted this patch to trunk. Army, would you like to see this fix in 10.1 branch also?

Thanks for fixing this interesting problem... And also for adding comments to existing mechanism. Great patch.

Sending java\engine\org\apache\derby\impl\sql\compile\OptimizerImpl.java
Sending java\testing\org\apache\derbyTesting\functionTests\master\subqueryFlattening.out
Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\subqueryFlattening.sql
Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\subqueryFlattening_derby.properties
Transmitting file data ....
Committed revision 293480.

A B added a comment - 05/Oct/05 06:09 AM
Yes, I think it'd be good to put this into the 10.1 branch as well, esp. if we can get it into the upcoming 10.1 bug fix release...

Satheesh Bandaram added a comment - 05/Oct/05 06:42 AM
Merged to 10.1 branch. Should be part of 10.1.2 release.

A B added a comment - 06/Oct/05 01:55 AM
I ran the repro attached to this issue as well as the new test case in lang/subqueryFlattening.sql against the trunk (10.2) and the 10.1 branch to verify that the changes have been committed and that things are working as they should. It all looks good, so I'm resolving and closing this issue. Thanks for committing, Satheesh.

Rick Hillegas added a comment - 11/Jul/06 11:51 PM
Assigning to SQL component.