AFTER MATCH SKIP: Defining Where to Restart the Matching Process After a Match Is Found
The AFTER MATCH SKIP clause determines the point to resume row pattern matching after a non-empty match was found. The default for the clause is AFTER MATCH SKIP PAST LAST ROW. The options are as follows:
- AFTER MATCH SKIP TO NEXT ROW
Resume pattern matching at the row after the first row of the current match.
- AFTER MATCH SKIP PAST LAST ROW
Resume pattern matching at the next row after the last row of the current match.
- AFTER MATCH SKIP TO FIRST pattern_variable
Resume pattern matching at the first row that is mapped to the pattern variable.
- AFTER MATCH SKIP TO LAST pattern_variable
Resume pattern matching at the last row that is mapped to the pattern variable.
- AFTER MATCH SKIP TO pattern_variable
The same as AFTER MATCH SKIP TO LAST pattern_variable.
When using AFTER MATCH SKIP TO FIRST or AFTER MATCH SKIP TO [LAST], it is possible that no row is mapped to the pattern_variable. For example:
AFTER MATCH SKIP TO A PATTERN (X A* X),
The pattern variable A in the example might have no rows mapped to A. If there is no row mapped to A, then there is no row to skip to, so a runtime exception is generated. Another problem condition is that AFTER MATCH SKIP may try to resume pattern matching at the same row that the last match started. For example:
AFTER MATCH SKIP TO X PATTERN (X Y+ Z),
In this example, AFTER MATCH SKIP TO X tries to resume pattern matching at the same row where the previous match was found. This would result in an infinite loop, so a runtime exception is generated for this scenario.
Note that the AFTER MATCH SKIP syntax only determines the point to resume scanning for a match after a non-empty match. When an empty match is found, one row is skipped (as if SKIP TO NEXT ROW had been specified). Thus an empty match never causes one of these exceptions. A query that gets one of these exceptions should be rewritten, as, for example, in the following:
AFTER MATCH SKIP TO A PATTERN (X (A | B) Y)
This will cause a run-time error when a row is mapped to B, because no row was mapped to A. If the intent is to skip to either A or B, the following will work:
AFTER MATCH SKIP TO C PATTERN (X (A | B) Y) SUBSET C = (A, B)
In the revised example, no runtime error is possible, whether A or B is matched.
As another example:
AFTER MATCH SKIP TO FIRST A PATTERN (A* X)
This example gets an exception after the first match, either for skipping to the first row of the match (if A* matches) or for skipping to a nonexistent row (if A* does not match). In this example, SKIP TO NEXT ROW is a better choice.
When using ALL ROWS PER MATCH together with skip options other than AFTER MATCH SKIP PAST LAST ROW, it is possible for consecutive matches to overlap, in which case a row R of the row pattern input table might occur in more than one match. In that case, the row pattern output table will have one row for each match in which the row participates. If a row of the row pattern input table participates in multiple matches, the MATCH_NUMBER function can be used to distinguish among the matches. When a row participates in more than one match, its classifier can be different in each match.
- relates to
CALCITE-1773 Add Test sql validator test for Pattern skip syntax in MATCH_RECOGNIZE