Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1018

SortJoinTransposeRule not firing due to getMaxRowCount(RelSubset) returning null

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: None
    • Labels:
      None

      Description

      getMaxRowCount(RelSubset) defaults to getMaxRowCount(RelNode) and returns null, so if the Join rel has its outer child rel as a RelSubset, the SortJoinTransposeRule will fail the checkInputForCollationAndLimit check and will not proceed to push down a limit through join.
      Before fix of CALCITE-995 and CALCITE-987, getMaxRowCount(RelSubset) would return positive infinity, which would cause similar situation but an opposite effect as firing the rule infinitely when the join's outer child is a RelSubset.
      Neither of the above situation was reflected in the test cases for SortJoinTransposeRule in Calcite, since the RelSubset condition was not covered by RelOptRule test which uses HepPlanner running with just one or no more than a couple of rules together.
      Basically we need a more accurate way to get max row count for RelSubset (similarly for HexVertex as well), otherwise checkInputForCollationAndLimit would either always fail or always succeed for a Limit over a Join over a RelSubset. But I assume, to be real accurate, we'd have to introduce a similar mechanism to one that computes bestCost in RelSubset, which I doubt would be worth it.
      Another way, which might seem a little ugly, is to add something like "isSortPushedThrough()" in Sort rel, similar to "isSemiJoinDone()" in Join, in order to avoid the rule being fired infinitely.
      In my Phoenix project, I applied a temporary fix, but it proved to work:

      public class PhoenixRelMdMaxRowCount {
          public static final RelMetadataProvider SOURCE =
                  ReflectiveRelMetadataProvider.reflectiveSource(
                      BuiltInMethod.MAX_ROW_COUNT.method, new PhoenixRelMdMaxRowCount());
      
          private PhoenixRelMdMaxRowCount() {
          }
      
          public Double getMaxRowCount(RelSubset rel) {
              for (RelNode node : rel.getRels()) {
                  if (node instanceof Sort) {
                      Sort sort = (Sort) node;
                      if (sort.fetch != null) {
                          return (double) RexLiteral.intValue(sort.fetch);
                      }
                  }
              }
              
              return Double.POSITIVE_INFINITY;
          }
      }
      
      1. CALCITE-1018.01.patch
        1 kB
        Pengcheng Xiong
      2. CALCITE-1018.repro.diff
        5 kB
        Maryann Xue

        Issue Links

          Activity

          Hide
          julianhyde Julian Hyde added a comment -

          Resolved in release 1.6.0 (2016-01-22).

          Show
          julianhyde Julian Hyde added a comment - Resolved in release 1.6.0 (2016-01-22).
          Show
          maryannxue Maryann Xue added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/84b55ef .
          Hide
          maryannxue Maryann Xue added a comment - - edited

          Thanks a lot, Julian Hyde, for pointing me to CALCITE-794! I'll go ahead check in the temporary fix with a comment. The most important part of this check-in is the test case, though. Think we should add more test cases in JdbcTest for RelOptRule check, esp. those depending on metadata of RelSubset.

          Show
          maryannxue Maryann Xue added a comment - - edited Thanks a lot, Julian Hyde , for pointing me to CALCITE-794 ! I'll go ahead check in the temporary fix with a comment. The most important part of this check-in is the test case, though. Think we should add more test cases in JdbcTest for RelOptRule check, esp. those depending on metadata of RelSubset.
          Hide
          julianhyde Julian Hyde added a comment -

          The cyclic issue is CALCITE-794. It is a real problem because it is not practical eliminate all cycles from Volcano RelNode graphs, and the solution is quite tricky: introduce an extra parameter to every metadata method so that we can detect cycles. You can see my work in a branch attached to that case.

          This bug needs a short term fix and a proper fix. I think we should put in the short-term fix as Maryann Xue suggests, then fix CALCITE-794. The latter could just about make it into a December release, but dependent projects would need to upgrade code for the new API.

          Show
          julianhyde Julian Hyde added a comment - The cyclic issue is CALCITE-794 . It is a real problem because it is not practical eliminate all cycles from Volcano RelNode graphs, and the solution is quite tricky: introduce an extra parameter to every metadata method so that we can detect cycles. You can see my work in a branch attached to that case. This bug needs a short term fix and a proper fix. I think we should put in the short-term fix as Maryann Xue suggests, then fix CALCITE-794 . The latter could just about make it into a December release, but dependent projects would need to upgrade code for the new API.
          Hide
          maryannxue Maryann Xue added a comment -

          I have finally come up with a test case in JdbcTest that easily reproduces this issue and demonstrates the few other things I have stated above. Apply CALCITE-1019.patch together with this attached file, you'll see:
          1) The rule doesn't fire.
          2) After adding CALCITE-1018.01.patch, the test will fail due to infinite loop.
          3) Apply the temporary fix, the test case will pass. Again, I am not suggesting using this temporary fix. It rather just serves as a demonstration of the problem.
          This file also removes the original test case for SortJoinTransposeRule, the comment for which was wrong by saying that the rule did not take effect because "Limit and sort are different Enumerable operators, thus it does not apply". The old test case couldn't see the effect of SortJoinTransposeRule because EnumerableLimit didn't have a rowCount that reflects "limit", and even if it did the extra sorting cost of emp table might be bigger than the gain the limit could bring.

          Show
          maryannxue Maryann Xue added a comment - I have finally come up with a test case in JdbcTest that easily reproduces this issue and demonstrates the few other things I have stated above. Apply CALCITE-1019 .patch together with this attached file, you'll see: 1) The rule doesn't fire. 2) After adding CALCITE-1018 .01.patch, the test will fail due to infinite loop. 3) Apply the temporary fix, the test case will pass. Again, I am not suggesting using this temporary fix. It rather just serves as a demonstration of the problem. This file also removes the original test case for SortJoinTransposeRule, the comment for which was wrong by saying that the rule did not take effect because "Limit and sort are different Enumerable operators, thus it does not apply". The old test case couldn't see the effect of SortJoinTransposeRule because EnumerableLimit didn't have a rowCount that reflects "limit", and even if it did the extra sorting cost of emp table might be bigger than the gain the limit could bring.
          Hide
          maryannxue Maryann Xue added a comment -

          CALCITE-1019 shadows this issue, and once it is resolved it will be possible to reproduce this issue in JdbcTest.

          Show
          maryannxue Maryann Xue added a comment - CALCITE-1019 shadows this issue, and once it is resolved it will be possible to reproduce this issue in JdbcTest.
          Hide
          maryannxue Maryann Xue added a comment -

          Pengcheng Xiong, Julian Hyde, please take a look at this comment about RelMdColumnUniqueness:
          https://issues.apache.org/jira/browse/CALCITE-938?focusedCommentId=14974828&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14974828

          Basically they have the same problem with loop reference in RelSubset, which can be produced by rules like ProjectRemoveRule. So calling RelMetadataQuery.getMaxRowCount() on each of RelSubset's rels would cause infinite loop in certain cases.

          I ran the below query with your suggested patch and got StackOverflow:

          select * from "scott".emp e right join (
            select * from "scott".dept) d using (deptno) where dname = '1'
          
          Show
          maryannxue Maryann Xue added a comment - Pengcheng Xiong , Julian Hyde , please take a look at this comment about RelMdColumnUniqueness: https://issues.apache.org/jira/browse/CALCITE-938?focusedCommentId=14974828&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14974828 Basically they have the same problem with loop reference in RelSubset, which can be produced by rules like ProjectRemoveRule. So calling RelMetadataQuery.getMaxRowCount() on each of RelSubset's rels would cause infinite loop in certain cases. I ran the below query with your suggested patch and got StackOverflow: select * from "scott" .emp e right join ( select * from "scott" .dept) d using (deptno) where dname = '1'
          Hide
          pxiong Pengcheng Xiong added a comment -

          Following Julian Hyde's suggestion, I guess the solution should be similar to "Intersect". I put a short patch here without test cases......

          Show
          pxiong Pengcheng Xiong added a comment - Following Julian Hyde 's suggestion, I guess the solution should be similar to "Intersect". I put a short patch here without test cases......
          Hide
          maryannxue Maryann Xue added a comment -

          I am now trying to reproduce the problem in JdbcTest, but somehow found this might be shallowed by another mistake in checkInputForCollationAndLimit():

              final boolean alreadySorted = RelCollations.equal(
                  ImmutableList.of(collation), inputCollation);
          

          I don't think this is right when collation is empty.

          Show
          maryannxue Maryann Xue added a comment - I am now trying to reproduce the problem in JdbcTest, but somehow found this might be shallowed by another mistake in checkInputForCollationAndLimit(): final boolean alreadySorted = RelCollations.equal( ImmutableList.of(collation), inputCollation); I don't think this is right when collation is empty.
          Hide
          maryannxue Maryann Xue added a comment -

          Yes, theoretically, this is what it should do. But getting maxRowCount for each of its rels would lead to infinite loops, coz RelSubset has loop reference.

          Show
          maryannxue Maryann Xue added a comment - Yes, theoretically, this is what it should do. But getting maxRowCount for each of its rels would lead to infinite loops, coz RelSubset has loop reference.
          Hide
          julianhyde Julian Hyde added a comment -

          The standard provider should evaluate getMaxRowCount(RelSubset) as the least (not null) maxRowCount of any node in that subset. I don't think you will need to explicitly look for Sort.

          Show
          julianhyde Julian Hyde added a comment - The standard provider should evaluate getMaxRowCount(RelSubset) as the least (not null) maxRowCount of any node in that subset. I don't think you will need to explicitly look for Sort.
          Hide
          maryannxue Maryann Xue added a comment -
          Show
          maryannxue Maryann Xue added a comment - Any thoughts, Julian Hyde , Jesus Camacho Rodriguez , Pengcheng Xiong ?

            People

            • Assignee:
              maryannxue Maryann Xue
              Reporter:
              maryannxue Maryann Xue
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development