Derby
  1. Derby
  2. DERBY-2282

Incorrect "transitive closure" logic leads to inconsistent behavior for binary comparison predicates.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 10.0.2.0, 10.0.2.1, 10.0.2.2, 10.1.1.0, 10.1.2.1, 10.1.3.1, 10.1.3.2, 10.1.3.3, 10.2.1.6, 10.2.2.0, 10.2.2.1, 10.3.1.4
    • Fix Version/s: 10.6.1.0
    • Component/s: SQL
    • Labels:
      None
    • Urgency:
      Normal
    • Issue & fix info:
      Known fix, Repro attached

      Description

      The logic that handles "transive closure" for search predicates is in the "searchClauseTransitiveClosure()" method of impl/sql/compile/PredicateList.java. That method contains the following logic:

      else if (operator instanceof BinaryComparisonOperatorNode)
      {
      BinaryComparisonOperatorNode bcon = (BinaryComparisonOperatorNode) operator;
      ValueNode left = bcon.getLeftOperand();
      ValueNode right = bcon.getRightOperand();

      // RESOLVE: Consider using variant type of the expression, instead of
      // ConstantNode or ParameterNode in the future.
      if (left instanceof ColumnReference &&
      (right instanceof ConstantNode || right instanceof ParameterNode))

      { searchClauses.addElement(predicate); }

      else if (right instanceof ConstantNode && left instanceof ColumnReference)

      { // put the ColumnReference on the left to simplify things bcon.swapOperands(); searchClauses.addElement(predicate); }

      continue;
      }

      Notice that the inner "else-if" condition is wrong. It's supposed to be checking to see if the right node is a ColumnReference and the left node is a Constant, but that's not what it does-instead, it does a check that is really a sub-condition of the "if" condition-i.e. whenever the "else if" condition is true the "if" condition will be true and thus we won't ever execute the "else if" branch.

      I confirmed this by looking at the code coverage results for 10.2:

      http://people.apache.org/~fuzzylogic/codecoverage/428586/_files/2f4.html#2d

      The lines in question are never executed.

      What this means is that a query which specifies constants on the left side of a comparison predicate will behave differently than a query which specifies constants on the right side of the same comparison. As an example:

      create table t1 (i int);
      create table t2 (j int);

      insert into t1 values 1, 5, 7, 11, 13, 17, 19;
      insert into t2 values 23, 29, 31, 37, 43, 47, 53;
      insert into t1 select 23 * i from t1 where i < 19;
      insert into t2 select 23 * j from t2 where j < 55;

      – Following will show two qualifiers for T2 and three for T1
      – because transitive closure adds two new qualifiers, "t2.j >= 23"
      – and "t1.i <= 30" to the list.
      select * from t1, t2 where t1.i = t2.j and t1.i >= 23 and t2.j <= 30;

      – But if we put the constants on the left-hand side, we don't
      – detect the transitive closure and thus we have a single qualifier
      – for T2 and only two qualifiers for T1.
      select * from t1, t2 where t1.i = t2.j and 23 <= t1.i and 30 >= t2.j;

      The above two queries should in theory show the same query plan--but if we execute the above statements while logging query plans, we'll see a difference (as explained in the sql comments above).

      I did a quick scan of the various branches and found that this incorrect logic appears in every branch back to 10.0 (hence the massive "Affects Versions" list). That said, the result of this bug isn't an error nor is it wrong results, so I'm just marking it "Minor".

      The fix looks to be pretty straightforward....

      1. test.diff
        5 kB
        Knut Anders Hatlen
      2. d2282-1a.stat
        0.5 kB
        Knut Anders Hatlen
      3. d2282-1a.diff
        12 kB
        Knut Anders Hatlen
      4. d2282-2a.stat
        0.2 kB
        Knut Anders Hatlen
      5. d2282-2a.diff
        4 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Hide
          Rick Hillegas added a comment -

          Triaged for 10.5.2: assigned normal urgency, noted that a repro is attached, flagged as wrong query result.

          Show
          Rick Hillegas added a comment - Triaged for 10.5.2: assigned normal urgency, noted that a repro is attached, flagged as wrong query result.
          Hide
          Knut Anders Hatlen added a comment -

          Removed wrong query result flag since the bug doesn't cause wrong results, it only produces a suboptimal plan. Also flagged "known fix" since the bug description explains how to fix the broken logic.

          Show
          Knut Anders Hatlen added a comment - Removed wrong query result flag since the bug doesn't cause wrong results, it only produces a suboptimal plan. Also flagged "known fix" since the bug description explains how to fix the broken logic.
          Hide
          Knut Anders Hatlen added a comment -

          I'm assuming that the straightforward fix mentioned in the bug description is to swap left and right in the broken else if branch. But if we do that the second query starts returning an empty result (it's supposed to return one row). I believe this is because BinaryOperatorNode.swapOperands() only swaps the operands and doesn't actually turn the operator around. This means 23 <= t1.i is rewritten to t1.i <= 23, which is not an equivalent expression.

          Show
          Knut Anders Hatlen added a comment - I'm assuming that the straightforward fix mentioned in the bug description is to swap left and right in the broken else if branch. But if we do that the second query starts returning an empty result (it's supposed to return one row). I believe this is because BinaryOperatorNode.swapOperands() only swaps the operands and doesn't actually turn the operator around. This means 23 <= t1.i is rewritten to t1.i <= 23, which is not an equivalent expression.
          Hide
          Knut Anders Hatlen added a comment -

          The attachement test.diff contains a JUnit test that I used to test it. Right now, it fails because the second query's plan doesn't have the expected operators. If left and right are swapped in the broken if clause, it fails because the query returns wrong results.

          Show
          Knut Anders Hatlen added a comment - The attachement test.diff contains a JUnit test that I used to test it. Right now, it fails because the second query's plan doesn't have the expected operators. If left and right are swapped in the broken if clause, it fails because the query returns wrong results.
          Hide
          Knut Anders Hatlen added a comment -

          Here's a patch that addresses the issue by changing the else-if condition as suggested, and by swapping the operands and changing the operator type if necessary.

          The method BinaryOperatorNode.swapOperands(), which only works if the operator is symmetric, is no longer used, and it is therefore removed. Instead, a new method called getSwappedEquivalent() (in the lack of a better name, other suggestions are welcome) was added to BinaryComparisonOperatorNode. This method is modelled after getNegation() and creates a new node with the left and right operands swapped, and it changes the operator type to preserve the meaning of the expression (< is changed to >, = is kept as it is, >= is changed to <=, etc).

          All the regression tests ran cleanly.

          Show
          Knut Anders Hatlen added a comment - Here's a patch that addresses the issue by changing the else-if condition as suggested, and by swapping the operands and changing the operator type if necessary. The method BinaryOperatorNode.swapOperands(), which only works if the operator is symmetric, is no longer used, and it is therefore removed. Instead, a new method called getSwappedEquivalent() (in the lack of a better name, other suggestions are welcome) was added to BinaryComparisonOperatorNode. This method is modelled after getNegation() and creates a new node with the left and right operands swapped, and it changes the operator type to preserve the meaning of the expression (< is changed to >, = is kept as it is, >= is changed to <=, etc). All the regression tests ran cleanly.
          Hide
          Knut Anders Hatlen added a comment -

          The check for constants on the right side of the operator check for both ConstantNode and ParameterNode, whereas the check for constants on the left side check for ConstantNode only. Should the else-if condition also check for ParameterNode for completeness?

          Show
          Knut Anders Hatlen added a comment - The check for constants on the right side of the operator check for both ConstantNode and ParameterNode, whereas the check for constants on the left side check for ConstantNode only. Should the else-if condition also check for ParameterNode for completeness?
          Hide
          Bryan Pendleton added a comment -

          (< is changed to >, = is kept as it is, >= is changed to <=, etc)

          getInverseNode()? getReverseNode()?

          getEquivalentNodeWithSwappedOperands() seems far too long

          I think that getSwappedEquivalent is a pretty good name, actually.

          Show
          Bryan Pendleton added a comment - (< is changed to >, = is kept as it is, >= is changed to <=, etc) getInverseNode()? getReverseNode()? getEquivalentNodeWithSwappedOperands() seems far too long I think that getSwappedEquivalent is a pretty good name, actually.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks for the suggestions, Bryan. getReverseNode() sounds like a good alternative. I'm not sure about getInverseNode(), since "invert" is an overloaded term (it is sometimes even used as a synonym for "negate", which is the opposite of what we're doing here). I think I'll stick to getSwappedEquivalent() for now, since it highlights both that we swap the operands and that the returned node should be equivalent to the node on which it is being called. But I'm sure there must be a technical term for this transformation...

          Show
          Knut Anders Hatlen added a comment - Thanks for the suggestions, Bryan. getReverseNode() sounds like a good alternative. I'm not sure about getInverseNode(), since "invert" is an overloaded term (it is sometimes even used as a synonym for "negate", which is the opposite of what we're doing here). I think I'll stick to getSwappedEquivalent() for now, since it highlights both that we swap the operands and that the returned node should be equivalent to the node on which it is being called. But I'm sure there must be a technical term for this transformation...
          Hide
          Bryan Pendleton added a comment -

          Is this a duplicate of DERBY-813?

          Show
          Bryan Pendleton added a comment - Is this a duplicate of DERBY-813 ?
          Hide
          Knut Anders Hatlen added a comment -

          Yes, I believe it is. Since DERBY-813 is a sub-task of a closed issue, it's probably best to close the sub-task and continue the work on this top-level issue.

          Show
          Knut Anders Hatlen added a comment - Yes, I believe it is. Since DERBY-813 is a sub-task of a closed issue, it's probably best to close the sub-task and continue the work on this top-level issue.
          Hide
          Knut Anders Hatlen added a comment -

          I committed the 1a patch to trunk with revision 887156.

          Regarding my previous comment about checking for ParameterNode as well, I see that it's also mentioned by Satheesh in DERBY-813. I will prepare a follow-up patch which handles ParameterNode and adds a test for that case.

          Show
          Knut Anders Hatlen added a comment - I committed the 1a patch to trunk with revision 887156. Regarding my previous comment about checking for ParameterNode as well, I see that it's also mentioned by Satheesh in DERBY-813 . I will prepare a follow-up patch which handles ParameterNode and adds a test for that case.
          Hide
          Dag H. Wanvik added a comment -

          +1, would be nice to handle the ParameterNode case as well Patch looks good.

          Show
          Dag H. Wanvik added a comment - +1, would be nice to handle the ParameterNode case as well Patch looks good.
          Hide
          Knut Anders Hatlen added a comment -

          Unsetting "Patch Available" since the first patch was committed.

          Show
          Knut Anders Hatlen added a comment - Unsetting "Patch Available" since the first patch was committed.
          Hide
          Knut Anders Hatlen added a comment -

          Attaching a new patch (2a) which enables rewriting of predicates with parameters on the left-hand side. The patch also adds a test case to verify that the rewriting is actually done.

          I have only run PredicateTest on the patch, so I'm not setting Patch Available yet.

          Show
          Knut Anders Hatlen added a comment - Attaching a new patch (2a) which enables rewriting of predicates with parameters on the left-hand side. The patch also adds a test case to verify that the rewriting is actually done. I have only run PredicateTest on the patch, so I'm not setting Patch Available yet.
          Hide
          Knut Anders Hatlen added a comment -

          All the regression tests ran cleanly. Setting patch available.

          Show
          Knut Anders Hatlen added a comment - All the regression tests ran cleanly. Setting patch available.
          Hide
          Knut Anders Hatlen added a comment -

          Committed revision 890370.
          Resolving the issue.

          Show
          Knut Anders Hatlen added a comment - Committed revision 890370. Resolving the issue.

            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              A B
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development