Issue Details (XML | Word | Printable)

Key: DERBY-781
Type: Improvement Improvement
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: A B
Reporter: Satheesh Bandaram
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

Materialize subqueries in select list where possible to avoid creating invariant resultsets many times.

Created: 23/Dec/05 12:15 PM   Updated: 30/Jun/09 04:12 PM
Return to search
Component/s: SQL
Affects Version/s: 10.1.1.0, 10.2.1.6
Fix Version/s: 10.2.1.6

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works d781_v1.patch 2006-07-01 08:55 PM A B 89 kB
File Licensed for inclusion in ASF works d781_v1.stat 2006-07-01 08:55 PM A B 0.9 kB
Text File Licensed for inclusion in ASF works d781_v2.patch 2006-07-12 10:37 PM A B 76 kB
HTML File Licensed for inclusion in ASF works DERBY-781_v1.html 2006-07-01 08:55 PM A B 91 kB
Environment: generic

Issue & fix info: Release Note Needed
Resolution Date: 19/Sep/06 03:34 PM


 Description  « Hide
Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.

For example:

create view V1 as select i, j from T1 union select i,j from T2;
create view V2 as select a,b from T3 union select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);

For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.

Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.

public boolean performMaterialization(JBitSet outerTables)
throws StandardException
{
// RESOLVE - just say no to materialization right now - should be a cost based decision
return false;

/* Actual materialization, if appropriate, will be placed by our parent PRN.
* This is because PRN might have a join condition to apply. (Materialization
* can only occur before that.
*/
//return true;
}

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Repository Revision Date User Message
ASF #423989 Thu Jul 20 17:09:03 UTC 2006 bandaram DERBY-781: Materialize select subqueries where possible to avoid recreating their resultsets multiple times. Here is more info from the contributor.

Attaching a patch (d781_v1.patch) to address this issue by allowing the optimizer to consider and choose hash joins with subqueries, which is a more general case of the specific union example mentioned in the description for this issue. In brief, the patch does this by following up on the suggestions given by Jeff Lichtman in comments above and also in the following thread:

http://article.gmane.org/gmane.comp.apache.db.derby.devel/12208

Since result set materialization comes for "free" with hash joins, that fact we now allow hash joins with subqueries (as of this patch) means that we implicitly have a way to materialize the subquery result sets.

The details of the patch are included as DERBY-781_v1.html. I added a simple test to lang/subquery.sql to demonstrate that the optimizer can and will choose to do hash joins for subqueries, and I updated one other master file--predicatesIntoViews--for which the optimizer is now choosing a hash join instead of a nested loop. Testing of "unsafe" hash joins (see section VII of the document) and generation of correct plans is done through existing tests, esp. the lang/lojreorder.sql test, which was very useful in helping to verify the correctness of the changes.

Note that I did not add the sample union query shown in the description for this issue to the tests because when I run it against the current codeline, the optimizer will already choose to do materialization of the UnionNode (via hash join) even without the patch for this issue, and thus it didn't seem like that particular test case was useful. The new test in subqery.sql is more relevant because the optimizer will choose to do a nested loop join with the subquery before my changes and will do a hash join after my changes, which seems to more accurately reflect what this issue is about.

I ran derbyall using sane jars on Red Hat Linux with ibm142 and saw no new failures, and the overall execution time does not change despite the extra work the optimizer is doing.

Submitted by Army Brown (qozinx@gmail.com)
Files Changed
MODIFY /db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/grantRevoke.java
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/HashJoinStrategy.java
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/RelationalOperator.java
MODIFY /db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatesIntoViews.out
MODIFY /db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/predicatePushdown.out
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/UnaryComparisonOperatorNode.java
MODIFY /db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/master/subquery.out
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/Predicate.java
MODIFY /db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/subquery.sql
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/ProjectRestrictNode.java
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/OptimizerImpl.java
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/PredicateList.java
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/BinaryRelationalOperatorNode.java

Repository Revision Date User Message
ASF #436908 Fri Aug 25 19:07:43 UTC 2006 jta DERBY-1601 : Add documentation to the Tuning Guide about the new subquery materialization via hash
join behavior that was introduced with DERBY-781.

Committed patch derby1601_tuning2.diff by Laura Stewart <scotsmatrix@gmail.com>.
Files Changed
MODIFY /db/derby/docs/trunk/src/tuning/ctuntransform25857.dita

Repository Revision Date User Message
ASF #447899 Tue Sep 19 16:26:15 UTC 2006 rhillegas DERBY-1860: Add Army's release note for DERBY-781.
Files Changed
MODIFY /db/derby/code/branches/10.2/RELEASE-NOTES.html