Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-6047 Add support for Retraction in Table API / SQL
  3. FLINK-6216

DataStream unbounded groupby aggregate with early firing



    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Table SQL / API
    • Labels:


      Groupby aggregate results in a replace table. For infinite groupby aggregate, we need a mechanism to define when the data should be emitted (early-fired). This task is aimed to implement the initial version of unbounded groupby aggregate, where we update and emit aggregate value per each arrived record. In the future, we will implement the mechanism and interface to let user define the frequency/period of early-firing the unbounded groupby aggregation results.

      The limit space of backend state is one of major obstacles for supporting unbounded groupby aggregate in practical. Due to this reason, we suggest two common (and very useful) use-cases of this unbounded groupby aggregate:
      1. The range of grouping key is limit. In this case, a new arrival record will either insert to state as new record or replace the existing record in the backend state. The data in the backend state will not be evicted if the resource is properly provisioned by the user, such that we can provision the correctness on aggregation results.
      2. When the grouping key is unlimited, we will not be able ensure the 100% correctness of "unbounded groupby aggregate". In this case, we will reply on the TTL mechanism of the RocksDB backend state to evicted old data such that we can provision the correct results in a certain time range.


          Issue Links



              • Assignee:
                ShaoxuanWang Shaoxuan Wang
                ShaoxuanWang Shaoxuan Wang
              • Votes:
                0 Vote for this issue
                3 Start watching this issue


                • Created: