Derby
  1. Derby
  2. DERBY-649

Useful indexes not used in UNION ALL

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.1.3.1, 10.2.1.6
    • Component/s: SQL
    • Labels:
      None

      Description

      Frederic Moreau reports (http://mail-archives.apache.org/mod_mbox/db-derby-user/200510.mbox/browser):

      Hello,

      The optimizer does not take my indexes into account when I do a select on
      a 'UNION ALL' type of view ; therefore, table scans are done and
      performances are bad.

      Note : my indexes are taken into account if I try equivalent selects on
      tables (instead of views).

      Please find below a sample illustrating the problem using the
      RUNTIMESTATISTICS calls.

      Could anynone help me on this subject ?
      Thank you.

      My cloudscape version is the 10.0.2.1 one.
      I also tried it on the 10.1.1.0 version (same result).

      c:\>java -classpath
      "%CLOUDSCAPE_INSTALL%\lib\derbyclient.jar;%CLOUDSCAPE_INSTALL%\lib\derbytools.jar"
      -Dij.driver=org.apache.derby.jdbc.ClientDriver
      -Dij.protocol=jdbc:derby://localhost:1527/ -Dij.user=APP -Dij.password=APP
      -Dij.maximumDisplayWidth=32768 org.apache.derby.tools.ij ij> connect
      'testdb' ;

      ij> ;
      ij> create table test.table1(a integer, b integer, c integer);
      ij> create index test.table1idx on test.table1(b);
      ij> ;
      ij> create table test.table2(a integer, b integer, c integer);
      ij> create index test.table2idx on test.table2(b);
      ij> ;
      ij> create view test.view0 as select all a,b from test.table1 union all
      select a,b from test.table2;
      ij> ;
      ij> CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);
      ij> select a from test.table1 where b=25;
      ij> VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
      ...
      Index Scan ResultSet for TABLE1 using index TABLE1IDX at read
      committed isolation level using instantaneous share row locking chosen by
      the optimizer
      ...
      ij> select a from test.table2 where b=25;
      ij> VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
      ...
      Index Scan ResultSet for TABLE2 using index TABLE2IDX at read
      committed isolation level using instantaneous share row locking chosen by
      the optimizer
      ...
      ij> select a from test.view0 where b=25;
      ij> VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
      ...
      Table Scan ResultSet for TABLE1 at read committed isolation level
      using share row locking chosen by the optimizer
      ...
      Table Scan ResultSet for TABLE2 at read committed isolation level
      using share row locking chosen by the optimizer
      ...
      ij> ;

      1. DERBY-649.stat
        0.5 kB
        Satheesh Bandaram
      2. DERBY-649.patch
        31 kB
        Satheesh Bandaram

        Activity

        Hide
        Jeff Lichtman added a comment -

        The problem is probably that the qualification is not being pushed down into the union. In the example script, the select from the view is expanded to:

        select a from (select all a,b from test.table1 union all
        select a,b from test.table2) tab where b = 25

        After this expansion, the compilation phase should push the qualification down into the union:

        select a from (select all a,b from test.table1 where b = 25 union all
        select a,b from test.table2 where b = 25) tab

        This should probably happen in the preprocessing phase.

        Show
        Jeff Lichtman added a comment - The problem is probably that the qualification is not being pushed down into the union. In the example script, the select from the view is expanded to: select a from (select all a,b from test.table1 union all select a,b from test.table2) tab where b = 25 After this expansion, the compilation phase should push the qualification down into the union: select a from (select all a,b from test.table1 where b = 25 union all select a,b from test.table2 where b = 25) tab This should probably happen in the preprocessing phase.
        Hide
        Satheesh Bandaram added a comment -

        I am trying to address this problem by trying to push predicates into UNION. For now, I am not handling the cases where the select list may have expressions and handling only simple column references. I am attempting to push predicates into both left and right ResultSets, much like PredicateList.pushExpressionsIntoSelect handles the case of single SelectNodes. Does this sound like a reasonable approach?

        My initial attempts don't seem to make the query use the index... I suspect I am not correctly remapping column references once pushed inside the selects.

        Show
        Satheesh Bandaram added a comment - I am trying to address this problem by trying to push predicates into UNION. For now, I am not handling the cases where the select list may have expressions and handling only simple column references. I am attempting to push predicates into both left and right ResultSets, much like PredicateList.pushExpressionsIntoSelect handles the case of single SelectNodes. Does this sound like a reasonable approach? My initial attempts don't seem to make the query use the index... I suspect I am not correctly remapping column references once pushed inside the selects.
        Hide
        Satheesh Bandaram added a comment -

        I have a patch for this bug. While I am still testing it, I would appreciate any comments about my approach. I have a customer who is hurting a lot because of this bug. Once complete, I would like to put this fix into both 10.1.x and trunk versions.

        I have implemented a simpler solution to address this optimization for some cases where predicates are of the form <ColumnReference> <RELOP> <constant>. These causes really benifit the most by pushing predicates down these into inner selects of union, since that would enable optimizer to use applicable indexes on them.

        I will also enhance this patch to make it more generic for trunk later.

        Show
        Satheesh Bandaram added a comment - I have a patch for this bug. While I am still testing it, I would appreciate any comments about my approach. I have a customer who is hurting a lot because of this bug. Once complete, I would like to put this fix into both 10.1.x and trunk versions. I have implemented a simpler solution to address this optimization for some cases where predicates are of the form <ColumnReference> <RELOP> <constant>. These causes really benifit the most by pushing predicates down these into inner selects of union, since that would enable optimizer to use applicable indexes on them. I will also enhance this patch to make it more generic for trunk later.
        Hide
        Satheesh Bandaram added a comment -

        First version of the patch. I am still testing and enhancing the patch. Appreciate any comments.

        Show
        Satheesh Bandaram added a comment - First version of the patch. I am still testing and enhancing the patch. Appreciate any comments.
        Hide
        Daniel John Debrunner added a comment -

        -1 on the patch, causes a regression. Will also vote -1 on the subsequent patch for 772 as it is dependent on this patch.

        Run this simple script, at svn revision 357054 the output is expected including the
        two argument IN clause on the select from the view returning two rows.
        Once this patch is applied, svn revision 357105, that select returns no rows.

        DROP VIEW V1;
        DROP TABLE D1;
        CREATE TABLE D1 (A INT, B VARCHAR(4) FOR BIT DATA);

        INSERT INTO D1 VALUES (1, x'600Eaaef') ;
        INSERT INTO D1 VALUES (2, x'83452213') ;

        select * from D1 where B IN (x'600Eaaef',x'83452213') ;
        select * from D1 where B IN (x'83452213') ;
        select * from D1 where B IN (x'600Eaaef') ;

        CREATE VIEW V1 AS SELECT A,B FROM D1 UNION SELECT A,B FROM D1;

        SELECT * FROM V1;

        – this fails! returns no rows
        select * from V1 where B IN (x'600Eaaef',x'83452213') ;
        select * from V1 where B IN (x'83452213') ;
        select * from V1 where B IN (x'600Eaaef') ;

        Show
        Daniel John Debrunner added a comment - -1 on the patch, causes a regression. Will also vote -1 on the subsequent patch for 772 as it is dependent on this patch. Run this simple script, at svn revision 357054 the output is expected including the two argument IN clause on the select from the view returning two rows. Once this patch is applied, svn revision 357105, that select returns no rows. DROP VIEW V1; DROP TABLE D1; CREATE TABLE D1 (A INT, B VARCHAR(4) FOR BIT DATA); INSERT INTO D1 VALUES (1, x'600Eaaef') ; INSERT INTO D1 VALUES (2, x'83452213') ; select * from D1 where B IN (x'600Eaaef',x'83452213') ; select * from D1 where B IN (x'83452213') ; select * from D1 where B IN (x'600Eaaef') ; CREATE VIEW V1 AS SELECT A,B FROM D1 UNION SELECT A,B FROM D1; SELECT * FROM V1; – this fails! returns no rows select * from V1 where B IN (x'600Eaaef',x'83452213') ; select * from V1 where B IN (x'83452213') ; select * from V1 where B IN (x'600Eaaef') ;
        Hide
        Daniel John Debrunner added a comment -

        I think I've found the bug in the patch. I'd appreciate any optimizer experts looking at this.

        In PredicateList.pushExpressionsIntoSelect when the predicate is copied, any type of binary relational node can be copied, but the new relational node create is always an quality node, it is not based upon the type being pushed.

        I'm replacing this code, difference is first argument to getNode() :
        around line 1438 - changes would also be made to te variable name, to correctly represent its use.

        BinaryRelationalOperatorNode newEquals = (BinaryRelationalOperatorNode)
        getNodeFactory().getNode(
        C_NodeTypes.BINARY_EQUALS_OPERATOR_NODE,
        newCRNode,
        opNode.getRightOperand(),
        getContextManager());

        with

        BinaryRelationalOperatorNode newEquals = (BinaryRelationalOperatorNode)
        getNodeFactory().getNode(
        opNode.getNodeType(),
        newCRNode,
        opNode.getRightOperand(),
        getContextManager());

        I'd incorrectly assumed in the review that the (incorrect) new equality node was related to the boolean constant TRUE being created. So sort of '(pushed expression) = TRUE' was being pushed and required for some reason. I knew there was a good reason I'd asked for comments on this code section:

        > Dec 15th 08:19
        > For example, why do we need a new nodes that
        > represent '= TRUE', I'm sure it's required but to a reader of the code
        it's not obvious why.

        Show
        Daniel John Debrunner added a comment - I think I've found the bug in the patch. I'd appreciate any optimizer experts looking at this. In PredicateList.pushExpressionsIntoSelect when the predicate is copied, any type of binary relational node can be copied, but the new relational node create is always an quality node, it is not based upon the type being pushed. I'm replacing this code, difference is first argument to getNode() : around line 1438 - changes would also be made to te variable name, to correctly represent its use. BinaryRelationalOperatorNode newEquals = (BinaryRelationalOperatorNode) getNodeFactory().getNode( C_NodeTypes.BINARY_EQUALS_OPERATOR_NODE, newCRNode, opNode.getRightOperand(), getContextManager()); with BinaryRelationalOperatorNode newEquals = (BinaryRelationalOperatorNode) getNodeFactory().getNode( opNode.getNodeType(), newCRNode, opNode.getRightOperand(), getContextManager()); I'd incorrectly assumed in the review that the (incorrect) new equality node was related to the boolean constant TRUE being created. So sort of '(pushed expression) = TRUE' was being pushed and required for some reason. I knew there was a good reason I'd asked for comments on this code section: > Dec 15th 08:19 > For example, why do we need a new nodes that > represent '= TRUE', I'm sure it's required but to a reader of the code it's not obvious why.
        Hide
        Rick Hillegas added a comment -

        Hi Dan, your fix looks good. I would be tempted to sand it down a bit: Since the new relational operator could be something other than "==", you might want to call it "newRelop" rather than "newEquals".

        Show
        Rick Hillegas added a comment - Hi Dan, your fix looks good. I would be tempted to sand it down a bit: Since the new relational operator could be something other than "==", you might want to call it "newRelop" rather than "newEquals".
        Hide
        Daniel John Debrunner added a comment -

        Submiited a modified version of the patch thus removing my veto for this and the DERBY-772 patch.

        Show
        Daniel John Debrunner added a comment - Submiited a modified version of the patch thus removing my veto for this and the DERBY-772 patch.
        Hide
        Satheesh Bandaram added a comment -

        Fix ported from 10.1 branch to trunk.

        Sending java\engine\org\apache\derby\impl\sql\compile\PredicateList.java
        Sending java\engine\org\apache\derby\impl\sql\compile\ProjectRestrictNode.java
        Sending java\engine\org\apache\derby\impl\sql\compile\SelectNode.java
        Sending java\engine\org\apache\derby\impl\sql\compile\UnionNode.java
        Sending java\testing\org\apache\derbyTesting\functionTests\master\predicatesIntoViews.out
        Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\predicatesIntoViews.sql
        Transmitting file data ......
        Committed revision 370247.

        Show
        Satheesh Bandaram added a comment - Fix ported from 10.1 branch to trunk. Sending java\engine\org\apache\derby\impl\sql\compile\PredicateList.java Sending java\engine\org\apache\derby\impl\sql\compile\ProjectRestrictNode.java Sending java\engine\org\apache\derby\impl\sql\compile\SelectNode.java Sending java\engine\org\apache\derby\impl\sql\compile\UnionNode.java Sending java\testing\org\apache\derbyTesting\functionTests\master\predicatesIntoViews.out Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\predicatesIntoViews.sql Transmitting file data ...... Committed revision 370247.
        Hide
        Satheesh Bandaram added a comment -

        Fix checked into both 10.1 and trunk.

        Show
        Satheesh Bandaram added a comment - Fix checked into both 10.1 and trunk.
        Hide
        Satheesh Bandaram added a comment -

        Fix has been verified against customer database and query on 10.2 and 10.1.

        Show
        Satheesh Bandaram added a comment - Fix has been verified against customer database and query on 10.2 and 10.1.
        Hide
        Andrew McIntyre added a comment -

        Reopening to set Fix In to 10.1.3 also.

        Show
        Andrew McIntyre added a comment - Reopening to set Fix In to 10.1.3 also.

          People

          • Assignee:
            Unassigned
            Reporter:
            Rick Hillegas
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development