Details
-
Sub-task
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
Description
The goal of this issue is to add support for inner equi-join on proc time streams to the SQL interface.
Queries similar to the following should be supported:
SELECT o.rowtime , o.productId, o.orderId, s.rowtime AS shipTime
FROM Orders AS o
JOIN Shipments AS s
ON o.orderId = s.orderId
AND o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR;
The following restrictions should initially apply:
- The join hint only support inner join
- The ON clause should include equi-join condition
- The time-condition o.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR only can use rowtime that is a system attribute, the time condition only support bounded time range like o.rowtime BETWEEN s.rowtime - INTERVAL '1' HOUR AND s.rowtime + INTERVAL '1' HOUR, not support unbounded like o.rowtime < s.rowtime , and should include both two stream's rowtime attribute, o.rowtime between rowtime () and rowtime () + 1 should also not be supported.
An row-time streams join will not be able to handle late data, because this would mean in insert a row into a sorted order shift all other computations. This would be too expensive to maintain. Therefore, we will throw an error if a user tries to use an row-time stream join with late data handling.
This issue includes:
- Design of the DataStream operator to deal with stream join
- Translation from Calcite's RelNode representation (LogicalJoin).
Attachments
Issue Links
- links to