Issue Details (XML | Word | Printable)

Key: DERBY-781
Type: Improvement Improvement
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: A B
Reporter: Satheesh Bandaram
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

Materialize subqueries in select list where possible to avoid creating invariant resultsets many times.

Created: 23/Dec/05 12:15 PM   Updated: 30/Jun/09 04:12 PM
Return to search
Component/s: SQL
Affects Version/s: 10.1.1.0, 10.2.1.6
Fix Version/s: 10.2.1.6

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works d781_v1.patch 2006-07-01 08:55 PM A B 89 kB
File Licensed for inclusion in ASF works d781_v1.stat 2006-07-01 08:55 PM A B 0.9 kB
Text File Licensed for inclusion in ASF works d781_v2.patch 2006-07-12 10:37 PM A B 76 kB
HTML File Licensed for inclusion in ASF works DERBY-781_v1.html 2006-07-01 08:55 PM A B 91 kB
Environment: generic

Issue & fix info: Release Note Needed
Resolution Date: 19/Sep/06 03:34 PM


 Description  « Hide
Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.

For example:

create view V1 as select i, j from T1 union select i,j from T2;
create view V2 as select a,b from T3 union select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);

For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.

Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.

public boolean performMaterialization(JBitSet outerTables)
throws StandardException
{
// RESOLVE - just say no to materialization right now - should be a cost based decision
return false;

/* Actual materialization, if appropriate, will be placed by our parent PRN.
* This is because PRN might have a join condition to apply. (Materialization
* can only occur before that.
*/
//return true;
}

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Satheesh Bandaram made changes - 23/Dec/05 02:10 PM
Field Original Value New Value
Description Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.

For example:

create view V1 as select i, j from T1 union all select i,j from T2;
create view V2 as select a,b from T3 union all select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);

For a query like select * from V1, V2 where i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.

Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.

public boolean performMaterialization(JBitSet outerTables)
throws StandardException
{
// RESOLVE - just say no to materialization right now - should be a cost based decision
return false;

/* Actual materialization, if appropriate, will be placed by our parent PRN.
* This is because PRN might have a join condition to apply. (Materialization
* can only occur before that.
*/
//return true;
}
Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.

For example:

create view V1 as select i, j from T1 union select i,j from T2;
create view V2 as select a,b from T3 union select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);

For a query like select * from V1, V2 where i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.

Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.

public boolean performMaterialization(JBitSet outerTables)
throws StandardException
{
// RESOLVE - just say no to materialization right now - should be a cost based decision
return false;

/* Actual materialization, if appropriate, will be placed by our parent PRN.
* This is because PRN might have a join condition to apply. (Materialization
* can only occur before that.
*/
//return true;
}
Satheesh Bandaram made changes - 24/Dec/05 12:57 AM
Description Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.

For example:

create view V1 as select i, j from T1 union select i,j from T2;
create view V2 as select a,b from T3 union select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);

For a query like select * from V1, V2 where i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.

Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.

public boolean performMaterialization(JBitSet outerTables)
throws StandardException
{
// RESOLVE - just say no to materialization right now - should be a cost based decision
return false;

/* Actual materialization, if appropriate, will be placed by our parent PRN.
* This is because PRN might have a join condition to apply. (Materialization
* can only occur before that.
*/
//return true;
}
Derby's handling of union subqueries in from list can be improved by materializing invariant resultsets once, rather than creating them many times.

For example:

create view V1 as select i, j from T1 union select i,j from T2;
create view V2 as select a,b from T3 union select a,b from T4;
insert into T1 values (1,1), (2,2), (3,3), (4,4), (5,5);

For a query like select * from V1, V2 where V1.j = V2.b and V1.i in (1,2,3,4,5), it is possible the resultset for V2 is created 5 times. (assuming V2 is choosen as the the inner table) This can be very costly if the underlying selects can take long time and also may perform union many times.

Enhance materialization logic in setOperatorNode.java. It currently returns FALSE always.

public boolean performMaterialization(JBitSet outerTables)
throws StandardException
{
// RESOLVE - just say no to materialization right now - should be a cost based decision
return false;

/* Actual materialization, if appropriate, will be placed by our parent PRN.
* This is because PRN might have a join condition to apply. (Materialization
* can only occur before that.
*/
//return true;
}
Satheesh Bandaram made changes - 05/Jan/06 05:18 AM
Comment [ Thanks Jeff for your analysis. I was just getting ready to file another improvement request to make this optimization more generic. (not specific to unions) I have seen huge improvements in two different customer situations. For the situation I filed the defect, each of the views (V1 and V2) had 36 tables each and by materializing the inner view into a temp. table, I noticed speed up from 70-150 seconds to under 3 seconds. (including the cost of creating temp. table)

I also saw another situation later without unions where materializing some table subqueries improved performance by couple of orders of magnitude. So you are right... this optimization can be applied to other cases too.

I think materialization with or without hash joins should be useful. In both situations, creating temp. table that materialized derived tables improved so much.

    ]
A B made changes - 01/Jul/06 08:47 PM
Assignee A B [ army ]
A B made changes - 01/Jul/06 08:55 PM
Attachment d781_v1.stat [ 12336228 ]
Attachment DERBY-781_v1.html [ 12336229 ]
Attachment d781_v1.patch [ 12336227 ]
A B made changes - 01/Jul/06 08:58 PM
Derby Info [Patch Available]
A B made changes - 12/Jul/06 10:37 PM
Attachment d781_v2.patch [ 12336745 ]
A B made changes - 18/Jul/06 08:21 PM
Derby Info [Patch Available] [Patch Available, Release Note Needed]
Kathey Marsden made changes - 19/Jul/06 04:14 PM
Derby Info [Patch Available, Release Note Needed] [Patch Available, Existing Application Impact, Release Note Needed]
A B made changes - 21/Jul/06 04:56 PM
Resolution Fixed [ 1 ]
Derby Info [Existing Application Impact, Patch Available, Release Note Needed] [Existing Application Impact, Release Note Needed]
Status Open [ 1 ] Resolved [ 5 ]
A B made changes - 13/Sep/06 06:59 PM
Summary Materialize union subqueries in select list where possible to avoid creating invariant resultsets many times. Materialize subqueries in select list where possible to avoid creating invariant resultsets many times.
Derby Info [Release Note Needed, Existing Application Impact] [Existing Application Impact, Release Note Needed]
A B made changes - 13/Sep/06 07:01 PM
Status Resolved [ 5 ] Closed [ 6 ]
A B made changes - 19/Sep/06 03:33 PM
Resolution Fixed [ 1 ]
Status Closed [ 6 ] Reopened [ 4 ]
A B made changes - 19/Sep/06 03:34 PM
Status Reopened [ 4 ] Resolved [ 5 ]
Derby Info [Release Note Needed, Existing Application Impact] [Existing Application Impact, Release Note Needed]
Resolution Fixed [ 1 ]
Fix Version/s 10.2.1.0 [ 11187 ]
A B made changes - 19/Sep/06 03:34 PM
Status Resolved [ 5 ] Closed [ 6 ]
Dag H. Wanvik made changes - 30/Jun/09 04:12 PM
Issue & fix info [Release Note Needed, Existing Application Impact] [Release Note Needed]