Derby
  1. Derby
  2. DERBY-3253

NullPointer Exception (NPE) from query with IN predicate containing two values and joining a view with a large table. ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.3.1.4, 10.3.2.1, 10.4.1.3
    • Fix Version/s: 10.3.3.0, 10.4.1.3
    • Component/s: SQL
    • Labels:
      None
    • Bug behavior facts:
      Regression

      Description

      With a single value in the IN clause the query does not fail.
      > Run the following query in the attached database (v 10.3 db).

      SELECT A.TIMESTAMP, B.F_NAMEADDR, B.TOTAL_F,
      B.TOTAL_FS, B.TOTAL_FT, B.TOTAL_FX
      FROM TIME A, THE_VIEW B
      WHERE B.T_ID = A.T_ID AND B.F_NAMEADDR IN ('one.two.three.oscar','one.two.three.kathy')
      ORDER BY A.TIMESTAMP ASC;

      > result

      ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression.
      ERROR XJ001: Java exception: ': java.lang.NullPointerException'.

      Stack trace:
      Failed Statement is: SELECT A.TIMESTAMP, B.F_NAMEADDR, B.TOTAL_F,
      B.TOTAL_FS, B.TOTAL_FT, B.TOTAL_FX
      FROM TIME A, THE_VIEW B
      WHERE B.T_ID = A.T_ID AND B.F_NAMEADDR IN ('one.two.three.oscar','one.two.three.kathy')
      ORDER BY A.TIMESTAMP ASC
      ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating an expression.
      at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
      at org.apache.derby.iapi.error.StandardException.unexpectedUserException(Unknown Source)
      at org.apache.derby.impl.services.reflect.DirectCall.invoke(Unknown Source)
      at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.SortResultSet.getRowFromResultSet(Unknown Source)
      at org.apache.derby.impl.sql.execute.SortResultSet.getNextRowFromRS(Unknown Source)
      at org.apache.derby.impl.sql.execute.SortResultSet.loadSorter(Unknown Source)
      at org.apache.derby.impl.sql.execute.SortResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(Unknown Source)
      at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      at org.apache.derby.impl.tools.ij.ij.executeImmediate(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.doCatch(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.go(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.go(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.mainCore(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main14.main(Unknown Source)
      at org.apache.derby.tools.ij.main(Unknown Source)
      Caused by: java.lang.NullPointerException
      at org.apache.derby.exe.ac601a400fx0116xa813xc2f7x00000010a3602.e8(Unknown Source)
      ... 21 more
      ============= begin nested exception, level (1) ===========
      java.lang.NullPointerException
      at org.apache.derby.exe.ac601a400fx0116xa813xc2f7x00000010a3602.e8(Unknown Source)
      at org.apache.derby.impl.services.reflect.DirectCall.invoke(Unknown Source)
      at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.SortResultSet.getRowFromResultSet(Unknown Source)
      at org.apache.derby.impl.sql.execute.SortResultSet.getNextRowFromRS(Unknown Source)
      at org.apache.derby.impl.sql.execute.SortResultSet.loadSorter(Unknown Source)
      at org.apache.derby.impl.sql.execute.SortResultSet.openCore(Unknown Source)
      at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(Unknown Source)
      at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      at org.apache.derby.impl.tools.ij.ij.executeImmediate(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.doCatch(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(Unknown Source)
      at org.apache.derby.impl.tools.ij.utilMain.go(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.go(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main.mainCore(Unknown Source)
      at org.apache.derby.impl.tools.ij.Main14.main(Unknown Source)
      at org.apache.derby.tools.ij.main(Unknown Source)
      ============= end nested exception, level (1) ===========

      Schema info:

      CREATE TABLE TIME ("T_ID" BIGINT NOT NULL, "TIMESTAMP" TIMESTAMP NOT NULL, "DAY" INTEGER NOT NULL, "WEEK" INTEGER NOT NULL, "MONTH" INTEGER NOT NULL, "YEAR_COL" INTEGER NOT NULL);

      CREATE TABLE F ("F_ID" BIGINT NOT NULL, "T_ID" BIGINT NOT NULL, "F_NAMEADDR" VARCHAR(250) NOT NULL, "TOTAL_F" BIGINT NOT NULL, "TOTAL_FS" BIGINT NOT NULL, "TOTAL_FT" BIGINT NOT NULL, "TOTAL_FX" BIGINT NOT NULL);

      CREATE VIEW the_view AS SELECT T.T_ID AS T_ID , T.F_NAMEADDR AS F_NAMEADDR,
      T.TOTAL_F AS TOTAL_F, T.TOTAL_FS AS TOTAL_FS, T.TOTAL_FT AS TOTAL_FT , T.TOTAL_FX AS TOTAL_FX
      FROM F AS T
      WHERE T.T_ID = (SELECT MAX(T_ID) FROM F);

      1. d3253_v3.patch
        9 kB
        A B
      2. d3253_v2_incomplete.patch
        10 kB
        A B
      3. d3253_v1.patch
        5 kB
        A B
      4. 3253ReproDB.zip
        2.17 MB
        Stan Bradbury

        Issue Links

          Activity

          Hide
          Stan Bradbury added a comment -

          The database with two tables and view definition needed to demonstrate this bug.

          Show
          Stan Bradbury added a comment - The database with two tables and view definition needed to demonstrate this bug.
          Hide
          A B added a comment -

          Based on a quick test it looks like this is a regression caused by DERBY-47.

          Show
          A B added a comment - Based on a quick test it looks like this is a regression caused by DERBY-47 .
          Hide
          A B added a comment - - edited

          Not sure if this a direct result of DERBY-47, or if DERBY-47 just changed execution plans such that another problem is now exposed, see esp. DERBY-3097 and related discussion.

          Show
          A B added a comment - - edited Not sure if this a direct result of DERBY-47 , or if DERBY-47 just changed execution plans such that another problem is now exposed, see esp. DERBY-3097 and related discussion.
          Hide
          A B added a comment -

          A surprisingly simple repro of this NPE can be created as follows:

          create table t1 (i int, vc varchar(10));
          insert into t1 values (1, 'one'), (2, 'two'), (3, 'three'), (1, 'un');

          select * from t1, (select * from t1) x
          where t1.i = x.i and x.vc in ('un', 'trois');

          The key here is that the IN list's left operand points to a column from the subselect, X.VC. As part of DERBY-47, the IN list will be changed into a BinaryRelationalOperatorNode of the form "X.VC = ?", and that node, which we call a "probe predicate", will serve to represent the IN list operator throughout the various phases of optimization.

          That said, as part of preprocessing Derby will look at the query and realize that the sub-select can be flattened. When flattening the subquery, any references to the subquery's RCL will be remapped to point to the underlying expression. That means the left operand of the probe predicate "X.VC = ?" will be changed to point directly to column "VC" of table T1. The code where this happens is in the "flatten" method of FromSubquery:

          /* Remap all ColumnReferences from the outer query to this node.

          • (We replace those ColumnReferences with clones of the matching
          • expression in the SELECT's RCL.
            */
            rcl.remapColumnReferencesToExpressions();
            outerPList.remapColumnReferencesToExpressions();

          For the example query above, outerPList holds the two predicates "T1.I = X.I" and "X.VC = ?", so we will attempt to remap the column references in those two predicates. That brings us to the remapColumnReferencesToExpressions() method of BinaryOperatorNode, where we have:

          public ValueNode remapColumnReferencesToExpressions()
          throws StandardException

          { leftOperand = leftOperand.remapColumnReferencesToExpressions(); rightOperand = rightOperand.remapColumnReferencesToExpressions(); return this; }

          Notice how the leftOperand can change here--and in the above query, it will change to point directly to T1 instead of indirectly to the subquery. So now the probe predicate's left operand is different from the left operand of the original InListOperatorNode that the probe predicate replaced. That in it itself is fine, but it causes problems later.

          Namely, when it comes time to generate the final tree for the query, we realize that the probe predicate is not "useful" for probing because it references "VC", which is the second column in table T1. Since probe predicates are only useful if they reference the first column in the table, per "orderUsefulPredicates(...)" of PredicateList.java:

          else if (pred.isInListProbePredicate()
          && (indexPosition > 0))
          {
          /* If the predicate is an IN-list probe predicate

          • then we only consider it to be useful if the
          • referenced column is the first one in the
          • index (i.e. if (indexPosition == 0)). Otherwise
          • the predicate would be treated as a qualifier
          • for store, which could lead to incorrect
          • results.
            */
            ....

          the probe predicate is not useful. That in turn means that when it comes time to generate the IN list operator, we'll "revert" back to the original InListOperatorNode--i.e. we will generate the InListOperatorNode instead of generating the probe predicate. This is found in the generateExpression() method of BinaryOperatorNode:

          if (ilon != null)

          { ilon.generateExpression(acb, mb); return; }

          But there's a problem here: as mentioned above, ilon (the InListOperatorNode) still has a left operand that points to a column from the subquery. Since we flattened the subquery out, that left operand is no longer valid--and that ultimately causes an execution time NPE because we try to apply the IN list restriction to a column from a subquery that does not exist.

          I tried a one-line fix to this code that seems to have resolved the issue:

          if (ilon != null)

          { ilon.setLeftOperand(this.leftOperand); // Added this line ilon.generateExpression(acb, mb); return; }

          (with appropriate code comments, of course).

          This has the effect of making sure that when we "revert" back to the original InListOperatorNode generation, we'll still generate the correct leftOperand--i.e. the left operand as it exists in the "probe predicate" upon completion of optimization.

          I'm attaching this small fix as d3253_v1.patch. I have yet to run the regression tests (they are running now), but I thought I'd post my findings for early review in the interim...

          Show
          A B added a comment - A surprisingly simple repro of this NPE can be created as follows: create table t1 (i int, vc varchar(10)); insert into t1 values (1, 'one'), (2, 'two'), (3, 'three'), (1, 'un'); select * from t1, (select * from t1) x where t1.i = x.i and x.vc in ('un', 'trois'); The key here is that the IN list's left operand points to a column from the subselect, X.VC. As part of DERBY-47 , the IN list will be changed into a BinaryRelationalOperatorNode of the form "X.VC = ?", and that node, which we call a "probe predicate", will serve to represent the IN list operator throughout the various phases of optimization. That said, as part of preprocessing Derby will look at the query and realize that the sub-select can be flattened. When flattening the subquery, any references to the subquery's RCL will be remapped to point to the underlying expression. That means the left operand of the probe predicate "X.VC = ?" will be changed to point directly to column "VC" of table T1. The code where this happens is in the "flatten" method of FromSubquery: /* Remap all ColumnReferences from the outer query to this node. (We replace those ColumnReferences with clones of the matching expression in the SELECT's RCL. */ rcl.remapColumnReferencesToExpressions(); outerPList.remapColumnReferencesToExpressions(); For the example query above, outerPList holds the two predicates "T1.I = X.I" and "X.VC = ?", so we will attempt to remap the column references in those two predicates. That brings us to the remapColumnReferencesToExpressions() method of BinaryOperatorNode, where we have: public ValueNode remapColumnReferencesToExpressions() throws StandardException { leftOperand = leftOperand.remapColumnReferencesToExpressions(); rightOperand = rightOperand.remapColumnReferencesToExpressions(); return this; } Notice how the leftOperand can change here--and in the above query, it will change to point directly to T1 instead of indirectly to the subquery. So now the probe predicate's left operand is different from the left operand of the original InListOperatorNode that the probe predicate replaced. That in it itself is fine, but it causes problems later. Namely, when it comes time to generate the final tree for the query, we realize that the probe predicate is not "useful" for probing because it references "VC", which is the second column in table T1. Since probe predicates are only useful if they reference the first column in the table, per "orderUsefulPredicates(...)" of PredicateList.java: else if (pred.isInListProbePredicate() && (indexPosition > 0)) { /* If the predicate is an IN-list probe predicate then we only consider it to be useful if the referenced column is the first one in the index (i.e. if (indexPosition == 0)). Otherwise the predicate would be treated as a qualifier for store, which could lead to incorrect results. */ .... the probe predicate is not useful. That in turn means that when it comes time to generate the IN list operator, we'll "revert" back to the original InListOperatorNode--i.e. we will generate the InListOperatorNode instead of generating the probe predicate. This is found in the generateExpression() method of BinaryOperatorNode: if (ilon != null) { ilon.generateExpression(acb, mb); return; } But there's a problem here: as mentioned above, ilon (the InListOperatorNode) still has a left operand that points to a column from the subquery . Since we flattened the subquery out, that left operand is no longer valid--and that ultimately causes an execution time NPE because we try to apply the IN list restriction to a column from a subquery that does not exist. I tried a one-line fix to this code that seems to have resolved the issue: if (ilon != null) { ilon.setLeftOperand(this.leftOperand); // Added this line ilon.generateExpression(acb, mb); return; } (with appropriate code comments, of course). This has the effect of making sure that when we "revert" back to the original InListOperatorNode generation, we'll still generate the correct leftOperand--i.e. the left operand as it exists in the "probe predicate" upon completion of optimization. I'm attaching this small fix as d3253_v1.patch. I have yet to run the regression tests (they are running now), but I thought I'd post my findings for early review in the interim...
          Hide
          Bryan Pendleton added a comment -

          Hi Army, thanks for taking the time to explain this in detail.

          Restating what I think I heard you say:

          • when we remap the predicate's column references during flattening,
            we modify the leftOperand of the InListOperatorNode, but when we
            generate the InListOperatorNode's expression (rather than the probe
            predicate's expression), we "recover" the InListOperatorNode's
            leftOperand from its BinaryOperatorNode's parent.

          Did I follow your explanation correctly?

          If so, it seems like it might be (slightly) cleaner if the logic were
          centralized in the InListOperatorNode itself, perhaps by defining an
          extra field in the node to hold the "before remapping" version of
          the leftOperand so that we can use that value rather than the remapped
          leftOperand during generateExpression.

          I'm just trying to avoid having code where BinaryOperatorNode corrects
          for a problem that is occuring in a child node; I always worry when I
          see class A having code that corrects the behavior of class B.

          Show
          Bryan Pendleton added a comment - Hi Army, thanks for taking the time to explain this in detail. Restating what I think I heard you say: when we remap the predicate's column references during flattening, we modify the leftOperand of the InListOperatorNode, but when we generate the InListOperatorNode's expression (rather than the probe predicate's expression), we "recover" the InListOperatorNode's leftOperand from its BinaryOperatorNode's parent. Did I follow your explanation correctly? If so, it seems like it might be (slightly) cleaner if the logic were centralized in the InListOperatorNode itself, perhaps by defining an extra field in the node to hold the "before remapping" version of the leftOperand so that we can use that value rather than the remapped leftOperand during generateExpression. I'm just trying to avoid having code where BinaryOperatorNode corrects for a problem that is occuring in a child node; I always worry when I see class A having code that corrects the behavior of class B.
          Hide
          A B added a comment - - edited

          > Did I follow your explanation correctly?

          Sorry, I think I messed up the explanation slightly. The order of events is:

          – Preprocess InListOperatorNode, which has some left operand OP_0, and convert it
          into a "probe predicate", which is a BinaryRelationalOperatorNode. That probe
          predicate does two things: 1) assumes its left operand from InListOperatorNode,
          i.e. the probe predicate's left operand is OP_0, as well; and 2) stores a pointer
          to the InListOperatorNode so that, if needed, the probe predicate can "revert" back
          to the InListOperatorNode at generation time. Upon completion of the preprocess
          phase for InListOperatorNode, the new probe predicate and the original InListOperatorNode
          both have the same left operand (OP_0)

          – Preprocess the subquery node, which involves flattening it. As part of flattening
          the probe predicate's left operand (not the InListOperatorNode's left operand)
          changes to OP_1; this is because it (the probe predicate) now represents the
          IN operation for the query tree, not the InListOperatorNode. So the InListOperatorNode
          remains the same after flattening--i.e. its left operand is still OP_0. That
          means the probe predicate and the InListOperatorNode have now different
          left operands.

          – Optimization completes, we decide that the probe predicate is not useful.

          – During code generation, we see that the probe predicate is not useful so we "revert"
          back to the original InListOperatorNode by calling "generate" on that node (which we
          stored inside the probe predicate (BinaryRelationalOperatorNode) at the beginning,
          per #2 of the first bullet above). But since the InListOperatorNode still has the old
          operand OP_0, it generates the wrong column reference.

          > If so, it seems like it might be (slightly) cleaner if the logic were
          > centralized in the InListOperatorNode itself,

          I don't think this is possible-or at least, not straightforwardbecause once the probe predicate comes into the picture, it (the probe predicate) becomes the focus of all method calls related to the IN list-so the only way to keep the InListOperatorNode up to date would be to add logic in BinaryRelationalOperatorNode that sends all method calls relating to leftOperand on down to the InListOperatorNode, as well. Note that doing so would require that the logic go into BinaryRelationalOperatorNode, not into InListOperatorNode.

          That was the way I was leaning when I first started, but a) that would require more logic in more places in BinaryRelationalOperatorNode to propagate operations down to the InListOperatorNode, and b) it seemed like the odds of missing some leftOperand operation could be non-neglible for very compilcated queries (just a theory, I didn't actually investigate this further). That said, it seemed like doing a single "setLeftOperand()" call at generation time was the preferable mechanism.

          I am of course open to change if you think this there is a better approach?

          Thanks for the feedback!

          Show
          A B added a comment - - edited > Did I follow your explanation correctly? Sorry, I think I messed up the explanation slightly. The order of events is: – Preprocess InListOperatorNode, which has some left operand OP_0, and convert it into a "probe predicate", which is a BinaryRelationalOperatorNode. That probe predicate does two things: 1) assumes its left operand from InListOperatorNode, i.e. the probe predicate's left operand is OP_0, as well; and 2) stores a pointer to the InListOperatorNode so that, if needed, the probe predicate can "revert" back to the InListOperatorNode at generation time. Upon completion of the preprocess phase for InListOperatorNode, the new probe predicate and the original InListOperatorNode both have the same left operand (OP_0) – Preprocess the subquery node, which involves flattening it. As part of flattening the probe predicate 's left operand ( not the InListOperatorNode's left operand) changes to OP_1; this is because it (the probe predicate) now represents the IN operation for the query tree, not the InListOperatorNode. So the InListOperatorNode remains the same after flattening--i.e. its left operand is still OP_0. That means the probe predicate and the InListOperatorNode have now different left operands. – Optimization completes, we decide that the probe predicate is not useful. – During code generation, we see that the probe predicate is not useful so we "revert" back to the original InListOperatorNode by calling "generate" on that node (which we stored inside the probe predicate (BinaryRelationalOperatorNode) at the beginning, per #2 of the first bullet above). But since the InListOperatorNode still has the old operand OP_0, it generates the wrong column reference. > If so, it seems like it might be (slightly) cleaner if the logic were > centralized in the InListOperatorNode itself, I don't think this is possible- or at least, not straightforward because once the probe predicate comes into the picture, it (the probe predicate) becomes the focus of all method calls related to the IN list -so the only way to keep the InListOperatorNode up to date would be to add logic in BinaryRelationalOperatorNode that sends all method calls relating to leftOperand on down to the InListOperatorNode, as well. Note that doing so would require that the logic go into BinaryRelationalOperatorNode, not into InListOperatorNode. That was the way I was leaning when I first started, but a) that would require more logic in more places in BinaryRelationalOperatorNode to propagate operations down to the InListOperatorNode, and b) it seemed like the odds of missing some leftOperand operation could be non-neglible for very compilcated queries (just a theory, I didn't actually investigate this further). That said, it seemed like doing a single "setLeftOperand()" call at generation time was the preferable mechanism. I am of course open to change if you think this there is a better approach? Thanks for the feedback!
          Hide
          Bryan Pendleton added a comment -

          The flow of execution makes more sense now, thanks for filling it in more completely.

          So when we revert from the probe predicate back to the original in-list-op, the problem
          is that the original in-list-op reflects the unflattened query, and since we've decided
          to flatten the query, it refers to a result set that no longer exists?

          Then I think when we perform the flattening, the probe predicate should ensure
          that it not only re-maps its own column references, but also calls the in-list that
          it's stashed away, and gives the in-list a chance to re-map its column references.

          That is: perform flatten remapping for both predicates, so that whichever one we
          eventually decide to use, it correctly matches the overall query tree.

          I guess that is basically the approach you were describing when you said

          "the only way to keep the InListOperatorNode up to date would be to add logic in
          BinaryRelationalOperatorNode that sends all method calls relating to
          leftOperand on down to the InListOperatorNode, as well."

          It seems like the "probe predicate" is this nifty shape-shifting operator that
          says: "I'd like to be a multi-probe index operator, but if it turns out I can't be,
          then I'll switch back and be an IN list instead. Until we decide, I'll retain enough
          information that I can be either operator." Given that perspective, to me
          it seems valid that the code has to be routinely passing method calls to both
          variants of the operator, since in a sense it is both variants of the operator,
          simultaneously, until at some crucial instant the optimizer determines that
          it can finalize the decision as to which variant to use.

          Show
          Bryan Pendleton added a comment - The flow of execution makes more sense now, thanks for filling it in more completely. So when we revert from the probe predicate back to the original in-list-op, the problem is that the original in-list-op reflects the unflattened query, and since we've decided to flatten the query, it refers to a result set that no longer exists? Then I think when we perform the flattening, the probe predicate should ensure that it not only re-maps its own column references, but also calls the in-list that it's stashed away, and gives the in-list a chance to re-map its column references. That is: perform flatten remapping for both predicates, so that whichever one we eventually decide to use, it correctly matches the overall query tree. I guess that is basically the approach you were describing when you said "the only way to keep the InListOperatorNode up to date would be to add logic in BinaryRelationalOperatorNode that sends all method calls relating to leftOperand on down to the InListOperatorNode, as well." It seems like the "probe predicate" is this nifty shape-shifting operator that says: "I'd like to be a multi-probe index operator, but if it turns out I can't be, then I'll switch back and be an IN list instead. Until we decide, I'll retain enough information that I can be either operator." Given that perspective, to me it seems valid that the code has to be routinely passing method calls to both variants of the operator, since in a sense it is both variants of the operator, simultaneously, until at some crucial instant the optimizer determines that it can finalize the decision as to which variant to use.
          Hide
          A B added a comment -

          Hi Bryan,

          Thank you for your continued feedback.

          I'm attaching two files. The first is d3253_v2_incomplete.patch, which was my first attempt at addressing the issue in the "other" way, i.e.:

          > add logic in BinaryRelationalOperatorNode that sends all method calls
          > relating to leftOperand on down to the InListOperatorNode, as well.

          I tried the obvious places where this had to be done and the error reported for this issue did indeed disappear. However, running more complicated queries with that approach leads to other execution-time NullPointerExceptions, plus a few changed query plans. When I applied the patch and ran derbyall there were four test failures, all of which suffered from at least one execution-time NPE:

          derbyall/derbyall.fail:lang/inbetween.sql
          derbyall/derbyall.fail:lang/predicatePushdown.sql
          derbyall/derbyall.fail:lang/predicatesIntoViews.sql
          derbyall/derbyall.fail:lang/subquery.sql

          I looked briefly into this to try to figure out what the problem was, but I admit that I gave up pretty quickly. In order to be sure that this approach works, I think we'd have to 1) look through all of the code to find any place that retrieved the left operand of a BinaryRelationalOperatorNode, 2) check to see if that place made any changes to the left operand in any way (direct or indirect), and 3) if changes were made, add logic to propagate those changes down to left operand of the InListOperatorNode that is tucked away inside the BinaryRelationalOperatorNode (if there is one). Does that sound correct to you? Or am I making this too complicated?

          The fact that just hitting what I thought to be the "basics" causes failures leads me back to my original assessment of this approach:

          > a) that would require more logic in more places in BinaryRelationalOperatorNode
          > to propagate operations down to the InListOperatorNode, and b) it seems like
          > the odds of missing some leftOperand operation could be non-neglible for very
          > complicated queries.

          You mentioned:

          > it seems valid that the code has to be routinely passing method calls to both
          > variants of the operator, since in a sense it is both variants of the operator,
          > simultaneously, until at some crucial instant the optimizer determines that
          > it can finalize the decision as to which variant to use.

          Is this in some way saying that instead of "do the work once and apply the result to the correct place", we'd be opting for "do the work twice and then throw one of the results away"? Given the fact the doing the work a "second" time affects more code and is more error-prone, I wonder if that is really the best approach? Regardless of which "version" of the operator we use, there should only be one correct form for the left operand when it comes time to generate code--and that should be exactly same between the two "versions". So if we have the operand in its correct form, it seems reasonable to just push it to the right place for code generation...

          I fully admit that I may be biased here and that my bias may be keeping from seeing a silly mistake (or omission) in my attempt at this approach. So I've attached my changes to this issue (d3253_v2_incomplete.patch); please feel free to look at it and to clean it up if there's anything obvious I missed. My apologies in advance if that proves to be the case.

          I'm also attaching a second file, d3253_v3.patch, which is a slight variation of the _v1 patch. With this approach we still use the "setLeftOperand(this.leftOperand)" approach seen in _v1, but instead of doing it once at code generation time, we do it as part of the BinaryRelationalOperatorNode.getInListOp() method. That is to say, any time any caller tries to access the InListOperatorNode that sits beneath a "probe predicate", we ensure that the InListOperatorNode has a valid, up-to-date left operand-which comes from the probe predicate itself-before returning it to the caller. This solves the issue, and passes all regression tests.

          I feel like _v3 is the best of the three approaches. It has more changes than _v1 but unlike _v1, it ensures that any code which accesses the underlying InListOperatorNode will see a left operand that has the correct information. It also seems a tad more maintainable than _v2 since the latter requires that any future code which directly or indirectly changes a BinaryRelationalOperatorNode's left operand must also account for the possibility that such a node is a probe predicate with an underlying InListOperatorNode.

          I'd like to proceed with the _v3 patch as my proposed solution. If it is still generally felt that _v2 is the best overall approach, then perhaps I can go ahead with the _v3 approach for now and anyone who is so inclined can implement the _v2 approach as a longer term solution. Does that seem like an acceptable proposal, or do you think that _v3 is in itself an incorrect or un-commitable fix?

          Show
          A B added a comment - Hi Bryan, Thank you for your continued feedback. I'm attaching two files. The first is d3253_v2_incomplete.patch, which was my first attempt at addressing the issue in the "other" way, i.e.: > add logic in BinaryRelationalOperatorNode that sends all method calls > relating to leftOperand on down to the InListOperatorNode, as well. I tried the obvious places where this had to be done and the error reported for this issue did indeed disappear. However, running more complicated queries with that approach leads to other execution-time NullPointerExceptions, plus a few changed query plans. When I applied the patch and ran derbyall there were four test failures, all of which suffered from at least one execution-time NPE: derbyall/derbyall.fail:lang/inbetween.sql derbyall/derbyall.fail:lang/predicatePushdown.sql derbyall/derbyall.fail:lang/predicatesIntoViews.sql derbyall/derbyall.fail:lang/subquery.sql I looked briefly into this to try to figure out what the problem was, but I admit that I gave up pretty quickly. In order to be sure that this approach works, I think we'd have to 1) look through all of the code to find any place that retrieved the left operand of a BinaryRelationalOperatorNode, 2) check to see if that place made any changes to the left operand in any way (direct or indirect), and 3) if changes were made, add logic to propagate those changes down to left operand of the InListOperatorNode that is tucked away inside the BinaryRelationalOperatorNode (if there is one). Does that sound correct to you? Or am I making this too complicated? The fact that just hitting what I thought to be the "basics" causes failures leads me back to my original assessment of this approach: > a) that would require more logic in more places in BinaryRelationalOperatorNode > to propagate operations down to the InListOperatorNode, and b) it seems like > the odds of missing some leftOperand operation could be non-neglible for very > complicated queries. You mentioned: > it seems valid that the code has to be routinely passing method calls to both > variants of the operator, since in a sense it is both variants of the operator, > simultaneously, until at some crucial instant the optimizer determines that > it can finalize the decision as to which variant to use. Is this in some way saying that instead of "do the work once and apply the result to the correct place", we'd be opting for "do the work twice and then throw one of the results away"? Given the fact the doing the work a "second" time affects more code and is more error-prone, I wonder if that is really the best approach? Regardless of which "version" of the operator we use, there should only be one correct form for the left operand when it comes time to generate code--and that should be exactly same between the two "versions". So if we have the operand in its correct form, it seems reasonable to just push it to the right place for code generation... I fully admit that I may be biased here and that my bias may be keeping from seeing a silly mistake (or omission) in my attempt at this approach. So I've attached my changes to this issue (d3253_v2_incomplete.patch); please feel free to look at it and to clean it up if there's anything obvious I missed. My apologies in advance if that proves to be the case. I'm also attaching a second file, d3253_v3.patch, which is a slight variation of the _v1 patch. With this approach we still use the "setLeftOperand(this.leftOperand)" approach seen in _v1, but instead of doing it once at code generation time, we do it as part of the BinaryRelationalOperatorNode.getInListOp() method. That is to say, any time any caller tries to access the InListOperatorNode that sits beneath a "probe predicate", we ensure that the InListOperatorNode has a valid, up-to-date left operand- which comes from the probe predicate itself -before returning it to the caller. This solves the issue, and passes all regression tests. I feel like _v3 is the best of the three approaches. It has more changes than _v1 but unlike _v1, it ensures that any code which accesses the underlying InListOperatorNode will see a left operand that has the correct information. It also seems a tad more maintainable than _v2 since the latter requires that any future code which directly or indirectly changes a BinaryRelationalOperatorNode's left operand must also account for the possibility that such a node is a probe predicate with an underlying InListOperatorNode. I'd like to proceed with the _v3 patch as my proposed solution. If it is still generally felt that _v2 is the best overall approach, then perhaps I can go ahead with the _v3 approach for now and anyone who is so inclined can implement the _v2 approach as a longer term solution. Does that seem like an acceptable proposal, or do you think that _v3 is in itself an incorrect or un-commitable fix?
          Hide
          Bryan Pendleton added a comment -

          I think the _v3 patch is a good one; I like the approach of the patch. Thank you
          for adding the comments, they are quite clear. I think that the manipulation
          of the leftOperand field is well encapsulated with this approach.

          I downloaded and reviewed the _v3 patch against the current trunk.
          In my environment, the new regression test fails as expected without the
          code changes from the patch, and passes with the code changes applied.

          Thank you again for taking the time to explore the various alternatives
          in detail, and for explaining how the processing works, it has been quite
          illuminating and I've learned a lot from observing your work.

          +1 to commit from me!

          Show
          Bryan Pendleton added a comment - I think the _v3 patch is a good one; I like the approach of the patch. Thank you for adding the comments, they are quite clear. I think that the manipulation of the leftOperand field is well encapsulated with this approach. I downloaded and reviewed the _v3 patch against the current trunk. In my environment, the new regression test fails as expected without the code changes from the patch, and passes with the code changes applied. Thank you again for taking the time to explore the various alternatives in detail, and for explaining how the processing works, it has been quite illuminating and I've learned a lot from observing your work. +1 to commit from me!
          Hide
          A B added a comment -

          Committed d3253_v3.patch with svn #605616:

          URL: http://svn.apache.org/viewvc?rev=605616&view=rev

          Many thanks (again) to Bryan for the review and discussion--I appreciate your time! I ran the svn merge command to port this back to 10.3 and it merged with no errors:

          svn merge -r 605615:605616 https://svn.apache.org/repos/asf/db/derby/code/trunk

          I'll kick off the 10.3 regression tests today. If they pass and if no issues arise in trunk for the new couple of days, I plan to commit to 10.3 by the end of the week...

          Show
          A B added a comment - Committed d3253_v3.patch with svn #605616: URL: http://svn.apache.org/viewvc?rev=605616&view=rev Many thanks (again) to Bryan for the review and discussion--I appreciate your time! I ran the svn merge command to port this back to 10.3 and it merged with no errors: svn merge -r 605615:605616 https://svn.apache.org/repos/asf/db/derby/code/trunk I'll kick off the 10.3 regression tests today. If they pass and if no issues arise in trunk for the new couple of days, I plan to commit to 10.3 by the end of the week...
          Hide
          A B added a comment - - edited

          Update Fix Version to reflect that this has now been ported back to 10.3, as of:

          URL: http://svn.apache.org/viewvc?rev=606277&view=rev

          Show
          A B added a comment - - edited Update Fix Version to reflect that this has now been ported back to 10.3, as of: URL: http://svn.apache.org/viewvc?rev=606277&view=rev
          Hide
          Dyre Tjeldvoll added a comment -

          Stan, do you think we can close this now?

          Show
          Dyre Tjeldvoll added a comment - Stan, do you think we can close this now?
          Hide
          Stan Bradbury added a comment -

          Catching up on Closing my reported issues. Thanks to Dyre for the workflow reminder today.

          Show
          Stan Bradbury added a comment - Catching up on Closing my reported issues. Thanks to Dyre for the workflow reminder today.

            People

            • Assignee:
              A B
              Reporter:
              Stan Bradbury
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development