Details
-
Sub-task
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
-
None
Description
To have a useful enumerable implementation for streaming windowing semantic in Calcite, and also make Emit syntax useful, we will need to introduce watermark into Calcite. More specifically, the following should be discussed:
1. How is watermark generated? What changes are required on source/table scan?
2. How does watermark propagate through the whole plan? How are watermarks from multiple inputs aggregated together?
3. Where watermark is used to control the time to emit data?
4. How are event timestamp combined per key (typical in aggregation/join).
From Julian:
0. Can watermarks be handled using algebraic rewrites? I.e. can we use the usual operators but restart them, giving them just the rows that changed, when watermarks arrive.
This is consistent with Calcite's general approach, which is to solve things at a high level using algebra rather than dive into the details of a particular kind of engine.