
|
If you were logged in you would be able to see more operations.
|
|
|
|
File Attachments:
|
|
|
Environment:
|
generic
|
|
| Issue & fix info: |
Release Note Needed
|
| Resolution Date: |
19/Sep/06 03:34 PM
|
|
Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.
For example:
create view V1 as select i, j from T1 union select i,j from T2;
create view V2 as select a,b from T3 union select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);
For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.
Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.
public boolean performMaterialization(JBitSet outerTables)
throws StandardException
{
// RESOLVE - just say no to materialization right now - should be a cost based decision
return false;
/* Actual materialization, if appropriate, will be placed by our parent PRN.
* This is because PRN might have a join condition to apply. (Materialization
* can only occur before that.
*/
//return true;
}
|
|
Description
|
Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.
For example:
create view V1 as select i, j from T1 union select i,j from T2;
create view V2 as select a,b from T3 union select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);
For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.
Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.
public boolean performMaterialization(JBitSet outerTables)
throws StandardException
{
// RESOLVE - just say no to materialization right now - should be a cost based decision
return false;
/* Actual materialization, if appropriate, will be placed by our parent PRN.
* This is because PRN might have a join condition to apply. (Materialization
* can only occur before that.
*/
//return true;
} |
Show » |
| Repository |
Revision |
Date |
User |
Message |
| ASF |
#423989 |
Thu Jul 20 17:09:03 UTC 2006 |
bandaram |
DERBY-781: Materialize select subqueries where possible to avoid recreating their resultsets multiple times. Here is more info from the contributor.
Attaching a patch (d781_v1.patch) to address this issue by allowing the optimizer to consider and choose hash joins with subqueries, which is a more general case of the specific union example mentioned in the description for this issue. In brief, the patch does this by following up on the suggestions given by Jeff Lichtman in comments above and also in the following thread:
http://article.gmane.org/gmane.comp.apache.db.derby.devel/12208
Since result set materialization comes for "free" with hash joins, that fact we now allow hash joins with subqueries (as of this patch) means that we implicitly have a way to materialize the subquery result sets.
The details of the patch are included as DERBY-781_v1.html. I added a simple test to lang/subquery.sql to demonstrate that the optimizer can and will choose to do hash joins for subqueries, and I updated one other master file--predicatesIntoViews--for which the optimizer is now choosing a hash join instead of a nested loop. Testing of "unsafe" hash joins (see section VII of the document) and generation of correct plans is done through existing tests, esp. the lang/lojreorder.sql test, which was very useful in helping to verify the correctness of the changes.
Note that I did not add the sample union query shown in the description for this issue to the tests because when I run it against the current codeline, the optimizer will already choose to do materialization of the UnionNode (via hash join) even without the patch for this issue, and thus it didn't seem like that particular test case was useful. The new test in subqery.sql is more relevant because the optimizer will choose to do a nested loop join with the subquery before my changes and will do a hash join after my changes, which seems to more accurately reflect what this issue is about.
I ran derbyall using sane jars on Red Hat Linux with ibm142 and saw no new failures, and the overall execution time does not change despite the extra work the optimizer is doing.
Submitted by Army Brown (qozinx@gmail.com)
|
| Files Changed |
MODIFY
/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/grantRevoke.java
MODIFY
/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/HashJoinStrategy.java
MODIFY
/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/RelationalOperator.java
MODIFY
/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatesIntoViews.out
MODIFY
/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatePushdown.out
MODIFY
/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/UnaryComparisonOperatorNode.java
MODIFY
/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/subquery.out
MODIFY
/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/Predicate.java
MODIFY
/db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/subquery.sql
MODIFY
/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ProjectRestrictNode.java
MODIFY
/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OptimizerImpl.java
MODIFY
/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/PredicateList.java
MODIFY
/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/BinaryRelationalOperatorNode.java
|
|