• Sub-task
    • Status: Closed
    • Major
    • Resolution: Resolved
    • 1.11.0
    • 1.12.0
    • core


      MATCH_RECOGNIZE syntax like this:
      Only pattern and define is supported in first step.

      PATTERN: Defining the Row Pattern to Be Matched

      The PATTERN keyword specifies the pattern to be recognized in the ordered sequence of rows in a partition. Each variable name in a pattern corresponds to a Boolean condition, which is specified later using the DEFINE component of the syntax.

      The PATTERN clause is used to specify a regular expression. It is outside the scope of this material to explain regular expression concepts and details. If you are not familiar with regular expressions, you are encouraged to familiarize yourself with the topic using other sources.

      The regular expression in a PATTERN clause is enclosed in parentheses. PATTERN may use the following operators:

      • Concatenation

      Concatenation is used to list two or more items in a pattern to be matched in that order. Items are concatenated when there is no operator sign between two successive items. For example: PATTERN (A B C).

      • Quantifiers

      Quantifiers define the number of iterations accepted for a match. Quantifiers are postfix operators with the following choices:

        • * — 0 or more iterations
        • + — 1 or more iterations
        • ? — 0 or 1 iterations
        • {n}

          — n iterations (n > 0)

        • {n,} — n or more iterations (n >= 0)
          ** {n,m} — between n and m (inclusive) iterations (0 <= n <= m, 0 < m)
          ** {,m} — between 0 and m (inclusive) iterations (m > 0)
          ** reluctant quantifiers — indicated by an additional question mark following a quantifier (*?, +?, ??, {n,}


          { n, m }



          ?). See "Reluctant Versus Greedy Quantifier" for the difference between reluctant and non-reluctant quantifiers.

      The following are examples of using quantifier operators:

        • A* matches 0 or more iterations of A
        • A {3,6}

          matches 3 to 6 iterations of A

        • A {,4}

          matches 0 to 4 iterations of A

      • Alternation
        Alternation matches a single regular expression from a list of several possible regular expressions. The alternation list is created by placing a vertical bar (|) between each regular expression. Alternatives are preferred in the order they are specified. As an example, PATTERN (A | B | C) attempts to match A first. If A is not matched, it attempts to match B. If B is not matched, it attempts to match C.
      • Grouping
        Grouping treats a portion of the regular expression as a single unit, enabling you to apply regular expression operators such as quantifiers to that group. Grouping is created with parentheses. As an example, PATTERN ((A B) {3}

        C) attempts to match the group (A B) three times and then seeks one occurrence of C.

      • PERMUTE
        See "How to Express All Permutations" for more information.
      • Exclusion
        Parts of the pattern to be excluded from the output of ALL ROWS PER MATCH are enclosed between {- and -}. See "How to Exclude Portions of the Pattern from the Output".
      • Anchors
        Anchors work in terms of positions rather than rows. They match a position either at the start or end of a partition.
        • ^ matches the position before the first row in the partition.
        • $ matches the position after the last row in the partition.
          As an example, PATTERN (^A+$) will match only if all rows in a partition satisfy the condition for A. The resulting match spans the entire partition.
      • Empty pattern (), matches an empty set of rows
        This section contains the following topics:
        Reluctant Versus Greedy Quantifier
        Operator Precedence

        Reluctant Versus Greedy Quantifier

        Pattern quantifiers are referred to as greedy; they will attempt to match as many instances of the regular expression on which they are applied as possible. The exception is pattern quantifiers that have a question mark ? as a suffix, and those are referred to as reluctant. They will attempt to match as few instances as possible of the regular expression on which they are applied.

      The difference between greedy and reluctant quantifiers appended to a single pattern variable is illustrated as follows: A* tries to map as many rows as possible to A, whereas A*? tries to map as few rows as possible to A. For example:

      PATTERN (X Y* Z)

      The pattern consists of three variable names, X, Y, and Z, with Y quantified with *. This means a pattern match will be recognized and reported when the following condition is met by consecutive incoming input rows:

        • A row satisfies the condition that defines variable X followed by zero or more rows that satisfy the condition that defines the variable Y followed by a row that satisfies the condition that defines the variable Z.

      During the pattern matching process, after a row was mapped to X and 0 or more rows were mapped to Y, if the following row can be mapped to both variables Y and Z (which satisfies the defining condition of both Y and Z), then, because the quantifier * for Y is greedy, the row is preferentially mapped to Y instead of Z. Due to this greedy property, Y gets preference over Z and a greater number of rows to Y are mapped. If the pattern expression was PATTERN (X Y*? Z), which uses a reluctant quantifier *? over Y, then Z gets preference over Y.

      Operator Precedence

      The precedence of the elements in a regular expression, in decreasing order, is as follows:

      • row_pattern_primary
        These elements include primary pattern variables (pattern variables not created with the SUBSET clause described in "SUBSET: Defining Union Row Pattern Variables"), anchors, PERMUTE, parenthetic expressions, exclusion syntax, and empty pattern
      • Quantifier
        A row_pattern_primary may have zero or one quantifier.
      • Concatenation
      • Alternation
        Precedence of alternation is illustrated by PATTERN(A B | C D), which is equivalent to PATTERN ((A B) | (C D)). It is not, however, equivalent to PATTERN (A (B | C) D).
        Precedence of quantifiers is illustrated by PATTERN (A B ), which is equivalent to PATTERN (A (B)). It is not, however, PATTERN ((A B)*).
        A quantifier may not immediately follow another quantifier. For example, PATTERN(A**) is prohibited.
        It is permitted for a primary pattern variable to occur more than once in a pattern, for example, PATTERN (X Y X).

      *DEFINE*: Defining Primary Pattern Variables

      DEFINE is a mandatory clause, used to specify the conditions that define primary pattern variables. In the example:

      DEFINE UP AS UP.Price > PREV(UP.Price),
      DOWN AS DOWN.Price < PREV(DOWN.Price)

      UP is defined by the condition UP.Price > PREV (UP.Price), and DOWN is defined by the condition DOWN.Price < PREV (DOWN.Price). (PREV is a row pattern navigation operation which evaluates an expression in the previous row; see "Row Pattern Navigation Operations" regarding the complete set of row pattern navigation operations.)

      A pattern variable does not require a definition; if there is no definition, any row can be mapped to the pattern variable.

      A union row pattern variable (see discussion of SUBSET in "SUBSET: Defining Union Row Pattern Variables") cannot be defined by DEFINE, but can be referenced in the definition of a pattern variable.

      The definition of a pattern variable can reference another pattern variable, which is illustrated in Example 20-6.

      Example 20-6 Defining Pattern Variables

      SELECT *
           PARTITION BY Symbol
           FROM Ticker
           MATCH_RECOGNIZE (
           PARTITION BY Symbol
           ORDER BY tstamp
           MEASURES FIRST (A.tstamp) AS A_Firstday,
                    LAST (D.tstamp) AS D_Lastday,
                    AVG (B.Price) AS B_Avgprice,
                    AVG (D.Price) AS D_Avgprice
           PATTERN (A B+ C+ D)
           SUBSET BC = (B,C)
           DEFINE A AS Price > 100,
                  B AS B.Price > A.Price,
                  C AS C.Price < AVG (B.Price),
                  D AS D.Price > MAX (BC.Price)
      ) M

      In this example:
      The definition of A implicitly references the universal row pattern variable (because of the unqualified column reference Price).
      The definition of B references the pattern variable A.
      The definition of C references the pattern variable B.
      The definition of D references the union row pattern variable BC.
      The conditions are evaluated on successive rows of a partition in a trial match, with the current row being tentatively mapped to a pattern variable as permitted by the pattern. To be successfully mapped, the condition must evaluate to true.
      In the previous example:

      A AS Price > 100

      Price refers to the Price in the current row, because the last row mapped to any primary row pattern variable is the current row, which is tentatively mapped to A. Alternatively, in this example, using A.Price would have led to the same results.

      B AS B.Price > A.Price

      B.Price refers to the Price in the current row (because B is being defined), whereas A.Price refers to the last row mapped to A. In view of the pattern, the only row mapped to A is the first row to be mapped.

      C AS C.Price < AVG(B.Price)

      Here C.Price refers to the Price in the current row, because C is being defined. The aggregate AVG (that is, insert Price) is computed as the average of all rows that are already mapped to B.

      D AS D.Price > MAX(BC.Price)

      The pattern variable D is similar to pattern variable C, though it illustrates the use of a union row pattern variable in the Boolean condition. In this case, MAX(BC.Price) returns the maximum price value of the rows matched to variable B or variable C. The semantics of Boolean conditions are discussed in more detail in "Expressions in MEASURES and DEFINE".

      MATCH_RECOGNIZE syntax :

      table_reference ::=
        {only (query_table_expression) | query_table_expression }[flashback_query_clause]
         [pivot_clause|unpivot_clause|row_pattern_recognition_clause] [t_alias]
      row_pattern_recognition_clause ::=
         PATTERN (row_pattern)
         DEFINE row_pattern_definition_list
      row_pattern ::=
        | row_pattern "|" row_pattern_term
      row_pattern_term ::=
        | row_pattern_term row_pattern_factor
      row_pattern_factor ::=
         row_pattern_primary [row_pattern_quantifier]
      row_pattern_quantifier ::=
         |"{"[unsigned_integer ],[unsigned_integer]"}"[?]
         |"{"unsigned_integer "}"
      row_pattern_primary ::=
         |"{-" row_pattern"-}"
         | row_pattern_permute
      row_pattern_permute ::=
         PERMUTE (row_pattern [, row_pattern] ...)
      row_pattern_subset_clause ::=
         SUBSET row_pattern_subset_item [, row_pattern_subset_item] ...
      row_pattern_subset_item ::=
         variable_name = (variable_name[ , variable_name]...)
      row_pattern_definition_list ::=
         row_pattern_definition[, row_pattern_definition]...
      row_pattern_definition ::=
         variable_name AS condition


        Issue Links



              ransom Zhiqiang He
              ransom Zhiqiang He
              0 Vote for this issue
              2 Start watching this issue