Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.11.0
    • Fix Version/s: 1.13.0
    • Component/s: core
    • Labels:

      Description

      [ONE ROW | ALL ROWS] PER MATCH: Choosing Summaries or Details for Each Match

      You will sometimes want summary data about the matches and other times need details. You can do that with the following SQL:

      • ONE ROW PER MATCH
        Each match produces one summary row. This is the default.
      • ALL ROWS PER MATCH
        A match spanning multiple rows will produce one output row for each row in the match.
        The output is explained in "Row Pattern Output".

      The MATCH_RECOGNIZE clause may find a match with zero rows. For an empty match, ONE ROW PER MATCH returns a summary row: the PARTITION BY columns take the values from the row where the empty match occurs, and the measure columns are evaluated over an empty set of rows.

      ALL ROWS PER MATCH has three suboptions:

      • ALL ROWS PER MATCH SHOW EMPTY MATCHES
      • ALL ROWS PER MATCH OMIT EMPTY MATCHES
      • ALL ROWS PER MATCH WITH UNMATCHED ROWS

        Activity

        Hide
        ransom Zhiqiang He added a comment -
        Show
        ransom Zhiqiang He added a comment - https://github.com/apache/calcite/pull/452 please review it. thanks.
        Hide
        julianhyde Julian Hyde added a comment -

        Review comments:

        • Why does Match.allRows have type Boolean not boolean?
        • Rather than SqlLiteral.toValue().equals use SqlLiteral.getValue() == because value is an enum
        • Typo: "isALlRows"
        • Typo: "onw row or all row per match" should be "One row or all rows per match"
        • We have a lot of very similar parser tests. Maybe combine with some existing ones, or use a more concise (minimal) example?
        Show
        julianhyde Julian Hyde added a comment - Review comments: Why does Match.allRows have type Boolean not boolean? Rather than SqlLiteral.toValue().equals use SqlLiteral.getValue() == because value is an enum Typo: "isALlRows" Typo: "onw row or all row per match" should be "One row or all rows per match" We have a lot of very similar parser tests. Maybe combine with some existing ones, or use a more concise (minimal) example?
        Hide
        ransom Zhiqiang He added a comment -

        Julian Hyde
        2,3,4: fixed, please review it. thanks.
        1: The boolean type default value is false, it means "ONE ROW PER MATCH", it will changed many test case. and these test case reltosql result is not equals input sql.
        5: every test case in match recoginze is test different case. It can not be removed. and I think some test case can be merged. I will merged it in next merge request.

        Show
        ransom Zhiqiang He added a comment - Julian Hyde 2,3,4: fixed, please review it. thanks. 1: The boolean type default value is false, it means "ONE ROW PER MATCH", it will changed many test case. and these test case reltosql result is not equals input sql. 5: every test case in match recoginze is test different case. It can not be removed. and I think some test case can be merged. I will merged it in next merge request.
        Hide
        julianhyde Julian Hyde added a comment - - edited

        Thanks for fixing 2, 3, 4.

        Regarding 1. I still think it should be boolean, not Boolean.

        The goal of a RelNode should be to make it easy to write rules. Which includes canonization: if two expressions are equivalent, they should be structurally the same. It is less important whether the generated SQL looks the same as the input.

        Also, we don't want people who write rules to have to know whether ONE ROW PER MATCH or ALL ROWS PER MATCH is the default. That should be sorted out at SQL-to-Rel time.

        (Did you know that people can write "COUNT(ALL x)" which means the same as "COUNT(x)" and is the opposite of "COUNT(DISTINCT x)"? But of course we don't record whether they wrote "ALL". AggregateCall.distinct is a boolean, not a Boolean.)

        Regarding 5. Merging some test cases next request would be great. Thanks. It's not urgent. I just want to keep some balance between number of tests (maintenance burden) and coverage. We can reduce the number of tests quite a bit, I think, without much drop in coverage.

        Show
        julianhyde Julian Hyde added a comment - - edited Thanks for fixing 2, 3, 4. Regarding 1. I still think it should be boolean, not Boolean. The goal of a RelNode should be to make it easy to write rules. Which includes canonization: if two expressions are equivalent, they should be structurally the same. It is less important whether the generated SQL looks the same as the input. Also, we don't want people who write rules to have to know whether ONE ROW PER MATCH or ALL ROWS PER MATCH is the default. That should be sorted out at SQL-to-Rel time. (Did you know that people can write "COUNT(ALL x)" which means the same as "COUNT(x)" and is the opposite of "COUNT(DISTINCT x)"? But of course we don't record whether they wrote "ALL". AggregateCall.distinct is a boolean, not a Boolean.) Regarding 5. Merging some test cases next request would be great. Thanks. It's not urgent. I just want to keep some balance between number of tests (maintenance burden) and coverage. We can reduce the number of tests quite a bit, I think, without much drop in coverage.
        Hide
        ransom Zhiqiang He added a comment -

        Julian Hyde Boolean is already changed to boolean. please review it. thanks.

        Show
        ransom Zhiqiang He added a comment - Julian Hyde Boolean is already changed to boolean. please review it. thanks.
        Hide
        julianhyde Julian Hyde added a comment -

        Finished review, testing now.

        I renamed your enum AfterMatchOption to RowsPerMatchOption. Your name just didn't make sense.

        I see you haven't added any tests to SqlToRelConverterTest yet. Next commit, can you please add a test there. Then you'll notice that the Match.explainTerms method is way out of date.

        Show
        julianhyde Julian Hyde added a comment - Finished review, testing now. I renamed your enum AfterMatchOption to RowsPerMatchOption . Your name just didn't make sense. I see you haven't added any tests to SqlToRelConverterTest yet. Next commit, can you please add a test there. Then you'll notice that the Match.explainTerms method is way out of date.
        Hide
        julianhyde Julian Hyde added a comment -
        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/e117c10c ; thanks for the PR, Zhiqiang He !
        Hide
        jcamachorodriguez Jesus Camacho Rodriguez added a comment -

        Resolved in release 1.13.0 (2017-06-26).

        Show
        jcamachorodriguez Jesus Camacho Rodriguez added a comment - Resolved in release 1.13.0 (2017-06-26).

          People

          • Assignee:
            ransom Zhiqiang He
            Reporter:
            ransom Zhiqiang He
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development