Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: None
    • Labels:
      None
    1. CALCITE-889_fix.patch
      5 kB
      Maryann Xue
    2. CALCITE-889.01.patch
      7 kB
      Pengcheng Xiong
    3. CALCITE-889.02.patch
      15 kB
      Pengcheng Xiong

      Activity

      Hide
      pxiong Pengcheng Xiong added a comment -

      Julian Hyde or Jesus Camacho Rodriguez, could u take a look? Thanks.

      Show
      pxiong Pengcheng Xiong added a comment - Julian Hyde or Jesus Camacho Rodriguez , could u take a look? Thanks.
      Hide
      jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited

      Pengcheng Xiong, some comments on the patch.

      • Please, check the style of the patch, as checkstyle does not pass (using tabs, incorrect order of imports, etc). If you compile with mvn, it will give you info about exact problems.
      • Concerning SortUnionTransposeRule, you are not adding all the union inputs to the list e.g. if Sort is not added, you skip that input.
      • Further, I'm not convinced about bailing out if the condition in line 75 is not met; although it might not be a valid construct in SQL, a Sort below the Union is a valid plan construct. In order to bail out properly, I think we could check if the input to the Union is a Sort operator. If it is, we would replace it with a copy of the top Sort operator. If it is not, we introduce the copy of the top Sort operator between the Union and its input.
      • In Line 87, you should not create a LogicalUnion, but instead use Union.copy (or a factory).

      Btw, for reviewing the patch, I think it is easier if you create a GitHub pull request.

      Thanks

      Show
      jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited Pengcheng Xiong , some comments on the patch. Please, check the style of the patch, as checkstyle does not pass (using tabs, incorrect order of imports, etc). If you compile with mvn, it will give you info about exact problems. Concerning SortUnionTransposeRule, you are not adding all the union inputs to the list e.g. if Sort is not added, you skip that input. Further, I'm not convinced about bailing out if the condition in line 75 is not met; although it might not be a valid construct in SQL, a Sort below the Union is a valid plan construct. In order to bail out properly, I think we could check if the input to the Union is a Sort operator. If it is, we would replace it with a copy of the top Sort operator. If it is not, we introduce the copy of the top Sort operator between the Union and its input. In Line 87, you should not create a LogicalUnion, but instead use Union.copy (or a factory). Btw, for reviewing the patch, I think it is easier if you create a GitHub pull request. Thanks
      Hide
      jnadeau Jacques Nadeau added a comment -

      I haven't looked at the patch here. However, I'm not quite clear how this a valid transformation in logical land. What reason is there to assume a union maintains collation?

      Show
      jnadeau Jacques Nadeau added a comment - I haven't looked at the patch here. However, I'm not quite clear how this a valid transformation in logical land. What reason is there to assume a union maintains collation?
      Hide
      jcamachorodriguez Jesus Camacho Rodriguez added a comment -

      Jacques Nadeau, it is not purely transposing: to remain correct, top Sort is preserved in the plan too.

      Show
      jcamachorodriguez Jesus Camacho Rodriguez added a comment - Jacques Nadeau , it is not purely transposing: to remain correct, top Sort is preserved in the plan too.
      Hide
      pxiong Pengcheng Xiong added a comment -

      Jesus Camacho Rodriguez, thanks for your constructive comments. I agree with most of them and I will modify accordingly. Just one correctness issue that we may need discussion. So you suggested "If it is, we would replace it with a copy of the top Sort operator." Let's say we have table A, B with columns (col0 int, col1 int). A's rows are ((1,2),(2,1)) B's row is (1,3). And the original plan is like this. The output should be (1,2)

      Sort by col1, limit 1
         Union
              Sort by col0, limit 1  
                   Scan A
              Sort by col1, limit 1
                   Scan B
      

      If we replace it with a copy of the top Sort operator

      Sort by col1, limit 1
         Union
              Sort by col1, limit 1  
                   Scan A
              Sort by col1, limit 1
                   Scan B
      

      The output should be (2,1). Could you double check if my understanding is wrong? Thanks.

      Show
      pxiong Pengcheng Xiong added a comment - Jesus Camacho Rodriguez , thanks for your constructive comments. I agree with most of them and I will modify accordingly. Just one correctness issue that we may need discussion. So you suggested "If it is, we would replace it with a copy of the top Sort operator." Let's say we have table A, B with columns (col0 int, col1 int). A's rows are ((1,2),(2,1)) B's row is (1,3). And the original plan is like this. The output should be (1,2) Sort by col1, limit 1 Union Sort by col0, limit 1 Scan A Sort by col1, limit 1 Scan B If we replace it with a copy of the top Sort operator Sort by col1, limit 1 Union Sort by col1, limit 1 Scan A Sort by col1, limit 1 Scan B The output should be (2,1). Could you double check if my understanding is wrong? Thanks.
      Hide
      jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited

      You are right, if sorting columns (1) are not the same, or (2) those in the top sort are not a subset of those in the input sort, it cannot be updated; otherwise, at least the limit in the input sort could be updated (if smaller), as this could be beneficial.

      Show
      jcamachorodriguez Jesus Camacho Rodriguez added a comment - - edited You are right, if sorting columns (1) are not the same, or (2) those in the top sort are not a subset of those in the input sort, it cannot be updated; otherwise, at least the limit in the input sort could be updated (if smaller), as this could be beneficial.
      Hide
      julianhyde Julian Hyde added a comment -

      I don't think it's possible for the rule to "replace" sorts on the inputs. In volcano, you just can't see what types the inputs are. So you'd just push down the sorts.

      If an input ends up as a sort on top of a sort (with or without limit) then a (to be written) SortMergeRule could look after it.

      Show
      julianhyde Julian Hyde added a comment - I don't think it's possible for the rule to "replace" sorts on the inputs. In volcano, you just can't see what types the inputs are. So you'd just push down the sorts. If an input ends up as a sort on top of a sort (with or without limit) then a (to be written) SortMergeRule could look after it.
      Hide
      jnadeau Jacques Nadeau added a comment -

      I always forget that Calcite logical includes Limit as part of Sort (grr). This would be much more comprehensible as LimitUnionTranspose or similar (if Limit was an operation in Calcite's logical algebra).

      Show
      jnadeau Jacques Nadeau added a comment - I always forget that Calcite logical includes Limit as part of Sort (grr). This would be much more comprehensible as LimitUnionTranspose or similar (if Limit was an operation in Calcite's logical algebra).
      Hide
      julianhyde Julian Hyde added a comment - - edited

      Further comments:

      • Replace "SortProjectTransposeRule" comment.
      • It is not valid to apply the rule if Union.all is false.
      • It is not valid to apply the rule if Sort.offset is not null.
      • If is valid to apply the rule if Sort.fetch is null, but pointless, given that union does not in general propagate sort order. Therefore I would apply the rule only if Sort.fetch is not null.
      • We want this to be a general-purpose rule that can run in the VolcanoPlanner was well as HepPlanner, therefore the rule must not cast its inputs to HepRelVertex.
      • In principle, this rule could be one of the "standard rules" that are always enabled. Add it to CalcitePrepareImpl.DEFAULT_RULES and check that all tests give the same result or improved result.
      Show
      julianhyde Julian Hyde added a comment - - edited Further comments: Replace "SortProjectTransposeRule" comment. It is not valid to apply the rule if Union.all is false. It is not valid to apply the rule if Sort.offset is not null. If is valid to apply the rule if Sort.fetch is null, but pointless, given that union does not in general propagate sort order. Therefore I would apply the rule only if Sort.fetch is not null. We want this to be a general-purpose rule that can run in the VolcanoPlanner was well as HepPlanner, therefore the rule must not cast its inputs to HepRelVertex. In principle, this rule could be one of the "standard rules" that are always enabled. Add it to CalcitePrepareImpl.DEFAULT_RULES and check that all tests give the same result or improved result.
      Hide
      julianhyde Julian Hyde added a comment -

      I would like to know the currentRel’s type (whether it is Sort or not) which is wrapped in HepRelVertex. That is the reason why I do a cast. I know this will only work for HepPlanner. Could you please let me know what should I do to extend this to work for VolcanoPlanner? If there is any example rules that apply to VolcanoPlanner, could you point me there and let me take a look?

      I don't think you need to look for a Sort explicitly. You should look for an input where you don't need to push down. What Jesus Camacho Rodriguez did in SortJoinTransposeRule CALCITE-892 was to look for inputs whose estimated row count is less than the limit. Such inputs have probably had the rule applied already. I think you could do the same.

      Show
      julianhyde Julian Hyde added a comment - I would like to know the currentRel’s type (whether it is Sort or not) which is wrapped in HepRelVertex. That is the reason why I do a cast. I know this will only work for HepPlanner. Could you please let me know what should I do to extend this to work for VolcanoPlanner? If there is any example rules that apply to VolcanoPlanner, could you point me there and let me take a look? I don't think you need to look for a Sort explicitly. You should look for an input where you don't need to push down. What Jesus Camacho Rodriguez did in SortJoinTransposeRule CALCITE-892 was to look for inputs whose estimated row count is less than the limit. Such inputs have probably had the rule applied already. I think you could do the same.
      Hide
      pxiong Pengcheng Xiong added a comment -

      Julian Hyde, thanks for your comments. I have addressed them by leveraging the function Jesus Camacho Rodriguez did in SortJoinTransposeRule. Also thank Jacques Nadeau for help with formatting issue in Eclipse!

      Show
      pxiong Pengcheng Xiong added a comment - Julian Hyde , thanks for your comments. I have addressed them by leveraging the function Jesus Camacho Rodriguez did in SortJoinTransposeRule. Also thank Jacques Nadeau for help with formatting issue in Eclipse!
      Hide
      jcamachorodriguez Jesus Camacho Rodriguez added a comment -

      Pengcheng Xiong, patch looks good now, thanks! I still have a couple of comments:

      • Please, move the following condition to the matches method of the rule:
            // It is not valid to apply the rule if
            // Union.all is false;
            // or Sort.offset is not null
            // or Sort.fetch is null.
            if (!union.all || sort.offset != null || sort.fetch == null) {
              return;
            }
        

        This should improve performance when many rules are being applied, as the rule won't fire up.

      • Could we provide a public constructor whose input parameters are the classes to match e.g. as in SortJoinTransposeRule? Then the INSTANCE final static object will receive as parameters the concrete instances. This should avoid that in the future we need to revisit the rule if we want it to match different subclasses.
      Show
      jcamachorodriguez Jesus Camacho Rodriguez added a comment - Pengcheng Xiong , patch looks good now, thanks! I still have a couple of comments: Please, move the following condition to the matches method of the rule: // It is not valid to apply the rule if // Union.all is false; // or Sort.offset is not null // or Sort.fetch is null. if (!union.all || sort.offset != null || sort.fetch == null) { return; } This should improve performance when many rules are being applied, as the rule won't fire up. Could we provide a public constructor whose input parameters are the classes to match e.g. as in SortJoinTransposeRule? Then the INSTANCE final static object will receive as parameters the concrete instances. This should avoid that in the future we need to revisit the rule if we want it to match different subclasses.
      Hide
      julianhyde Julian Hyde added a comment -

      I've merged this into my branch https://github.com/julianhyde/incubator-calcite/tree/master along with some other changes and am running the test suite. I am seeing intermittent JVM crashes when I run in JDK 1.8 so I am investigating.

      Show
      julianhyde Julian Hyde added a comment - I've merged this into my branch https://github.com/julianhyde/incubator-calcite/tree/master along with some other changes and am running the test suite. I am seeing intermittent JVM crashes when I run in JDK 1.8 so I am investigating.
      Show
      julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/82ac7b2d . Thanks for the patch, Pengcheng Xiong !
      Hide
      julianhyde Julian Hyde added a comment -

      By the way, Jesus Camacho Rodriguez's remarks are still valid. If you create a follow-up pull request it would be much appreciated.

      Show
      julianhyde Julian Hyde added a comment - By the way, Jesus Camacho Rodriguez 's remarks are still valid. If you create a follow-up pull request it would be much appreciated.
      Hide
      maryannxue Maryann Xue added a comment -

      If is valid to apply the rule if Sort.fetch is null, but pointless, given that union does not in general propagate sort order. Therefore I would apply the rule only if Sort.fetch is not null.

      Julian Hyde, given that there could a physical Union operator that does merge for its inputs (I already have one for Phoenix now), think it would still make sense to apply this rule if sort.fetch == null, right?

      Show
      maryannxue Maryann Xue added a comment - If is valid to apply the rule if Sort.fetch is null, but pointless, given that union does not in general propagate sort order. Therefore I would apply the rule only if Sort.fetch is not null. Julian Hyde , given that there could a physical Union operator that does merge for its inputs (I already have one for Phoenix now), think it would still make sense to apply this rule if sort.fetch == null, right?
      Hide
      julianhyde Julian Hyde added a comment -

      Maryann Xue, It would make sense to apply the rule in that case, yes.

      You will still need to be careful if the Sort has an offset as well as a fetch. I'm not sure what would be the right thing to do in that case.

      Show
      julianhyde Julian Hyde added a comment - Maryann Xue , It would make sense to apply the rule in that case, yes. You will still need to be careful if the Sort has an offset as well as a fetch. I'm not sure what would be the right thing to do in that case.
      Hide
      maryannxue Maryann Xue added a comment -

      Phoenix right now does not support offset, so having "offset!=null" should not fire this rule for Phoenix either. So I suggest having another instance of this rule which turns on the match if sort.fetch == null && sort.offset == null ?

      Show
      maryannxue Maryann Xue added a comment - Phoenix right now does not support offset, so having "offset!=null" should not fire this rule for Phoenix either. So I suggest having another instance of this rule which turns on the match if sort.fetch == null && sort.offset == null ?
      Hide
      julianhyde Julian Hyde added a comment -

      I suggest that you contribute a patch to the rule that adds parameters 'Class<? extends Sort> sortClass, Class<? extends Union> unionClass, boolean matchNullFetch, RelBuilderFactory relBuilderFactory, String description' to the constructor. (But keeps the current constructor, deprecated.)

      The last 2 parameters are encouraged for all rules since CALCITE-828.

      Show
      julianhyde Julian Hyde added a comment - I suggest that you contribute a patch to the rule that adds parameters 'Class<? extends Sort> sortClass, Class<? extends Union> unionClass, boolean matchNullFetch, RelBuilderFactory relBuilderFactory, String description' to the constructor. (But keeps the current constructor, deprecated.) The last 2 parameters are encouraged for all rules since CALCITE-828 .
      Hide
      maryannxue Maryann Xue added a comment -

      Hi, Julian Hyde I'm attaching the extension patch. The old constructor had been declared as private, so I thought there was no need to keep that one, and I just used it for creating default instances with an additional parameter.

      Show
      maryannxue Maryann Xue added a comment - Hi, Julian Hyde I'm attaching the extension patch. The old constructor had been declared as private, so I thought there was no need to keep that one, and I just used it for creating default instances with an additional parameter.
      Hide
      jcamachorodriguez Jesus Camacho Rodriguez added a comment -

      Maryann Xue, could you create a follow-up JIRA case and move this check

      if (!union.all || sort.offset != null || (!matchNullFetch && sort.fetch == null))
      

      to the matches method? I will check the patch in. Thanks

      Show
      jcamachorodriguez Jesus Camacho Rodriguez added a comment - Maryann Xue , could you create a follow-up JIRA case and move this check if (!union.all || sort.offset != null || (!matchNullFetch && sort.fetch == null)) to the matches method? I will check the patch in. Thanks
      Hide
      maryannxue Maryann Xue added a comment -

      Thank you, Jesus Camacho Rodriguez! New entry created as CALCITE-939.

      Show
      maryannxue Maryann Xue added a comment - Thank you, Jesus Camacho Rodriguez ! New entry created as CALCITE-939 .
      Hide
      jcamachorodriguez Jesus Camacho Rodriguez added a comment -

      Resolved in release 1.5.0 (2015-11-10)

      Show
      jcamachorodriguez Jesus Camacho Rodriguez added a comment - Resolved in release 1.5.0 (2015-11-10)

        People

        • Assignee:
          julianhyde Julian Hyde
          Reporter:
          pxiong Pengcheng Xiong
        • Votes:
          0 Vote for this issue
          Watchers:
          5 Start watching this issue

          Dates

          • Created:
            Updated:
            Resolved:

            Development