Uploaded image for project: 'Apache Apex Malhar'
  1. Apache Apex Malhar
  2. APEXMALHAR-2085

Implement Windowed Operators

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 3.5.0
    • None
    • None

    Description

      As per our recent several discussions in the community. A group of Windowed Operators that delivers the window semantic follows the google Data Flow model(https://cloud.google.com/dataflow/) is very important.
      The operators should be designed and implemented in a way for
      High-level API
      Beam translation
      Easy to use with other popular operator

      Operator Hierarchy

      Hierarchy of the operators,
      The windowed operators should cover all possible transformations that require window, and batch processing is also considered as special window called global window

                         +-------------------+
             +---------> |  WindowedOperator | <--------+
             |           +--------+----------+          |
             |                    ^      ^--------------------------------+
             |                    |                     |                 |
             |                    |                     |                 |
      +------+--------+    +------+------+      +-------+-----+    +------+-----+
      |CombineOperator|    |GroupOperator|      |KeyedOperator|    |JoinOperator|
      +---------------+    +-------------+      +------+------+    +-----+------+
                                         +---------^   ^                 ^
                                         |             |                 |
                                +--------+---+   +-----+----+       +----+----+
                                |KeyedCombine|   |KeyedGroup|       | CoGroup |
                                +------------+   +----------+       +---------+
      

      Combine operation includes all operations that combine all tuples in one window into one or small number of tuples, Group operation group all tuples in one window, Join and CoGroup are used to join and group tuples from different inputs.

      Components
      • Window Component
        It includes configuration, window state that should be checkpointed, etc. It should support NonMergibleWindow(fixed or slide) MergibleWindow(Session)
      • Trigger
        It should support early trigger, late trigger with customizable trigger behaviour
      • Other related components:
        • Watermark generator, can be plugged into input source to generate watermark
        • Tuple schema support:
          It should handle either predefined tuple type or give a declarative API to describe the user defined tuple class

      Most component API should be reused in High-Level API

      This is the umbrella ticket, separate tickets would be created for different components and operators respectively

      Attachments

        1. Windowed Operator.pdf
          68 kB
          David Yan

        Issue Links

          Activity

            People

              davidyan David Yan
              hsy541 Siyuan Hua
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: