Details
- Bug
- Status: Closed
- Major
- Resolution: Duplicate
- 1.15.0
- None
- None
Description
The intermediate result of the current test case temporalJoinITCase#testEventTimeMultiTemporalJoin is wrong:
+I, 5,RMB,40,2020-08-16T00:03,null,null,null,null
+I, 2,US Dollar,1,2020-08-15T00:02,102,2020-08-15T00:00:02,102,2020-08-15T00:00:02
+I, 3,RMB,40,2020-08-15T00:03,702,2020-08-15T00:00:04,702,2020-08-15T00:00:04
-U, 2,US Dollar,1,2020-08-16T00:03,106,2020-08-16T00:02,106,2020-08-16T00:02
...
It is wrong because the retraction "-U, 2,US Dollar,1,2020-08-16T00:03..." carries a different 'order_time' value than the row "+I, 2,US Dollar,1,2020-08-15T00:02...". After the join there is no upsert key, so the downstream operator can only retract by matching the complete row, which fails in this case.
The root cause is that the CDC source carries a metadata column (e.g., the operation time in the binlog, or the operation type), which makes a delete or update_before message differ from the previous version of the row. After an operation such as a join that is not on the primary key of the CDC source, the output no longer has an upsert key, so the downstream operator cannot retract correctly.
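The failure mode can be reproduced outside of Flink with a small simulation. The sketch below is not Flink code: the row layout (order_id, currency, amount, order_time), the shortened rows, and both helper functions are assumptions made for illustration. It contrasts retracting by the complete row with retracting by a key.

```python
from collections import Counter

# Illustrative rows only: (order_id, currency, amount, order_time).
# The -U message carries a different order_time than the +I it should retract.
changelog = [
    ("+I", (2, "US Dollar", 1, "2020-08-15T00:02")),
    ("-U", (2, "US Dollar", 1, "2020-08-16T00:03")),
]


def retract_by_full_row(changelog):
    """No upsert key: a retraction only matches an identical stored row."""
    state, unmatched = Counter(), []
    for op, row in changelog:
        if op in ("+I", "+U"):
            state[row] += 1
        elif state[row] > 0:
            state[row] -= 1
        else:
            # Nothing in state equals this row, so it cannot be retracted.
            unmatched.append((op, row))
    return +state, unmatched  # unary plus drops rows whose count fell to zero


def retract_by_key(changelog, key_index=0):
    """With an upsert key: a retraction matches on the key column alone."""
    state = {}
    for op, row in changelog:
        key = row[key_index]
        if op in ("+I", "+U"):
            state[key] = row
        else:
            state.pop(key, None)
    return state


full_state, unmatched = retract_by_full_row(changelog)
keyed_state = retract_by_key(changelog)
print(full_state)   # the stale +I row is still present
print(unmatched)    # the -U message found nothing to retract
print(keyed_state)  # empty: the key-based retraction succeeded
```

With full-row matching the stale row stays in state and the -U message is left unmatched; with a key, the same changelog is applied cleanly.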
This is obscure to users. We should at least find a way to report the error to users (at compile time), or find another solution that eliminates the problem completely.
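As a rough idea of what a compile-time report could look like, here is a minimal sketch. It is purely hypothetical: `SinkInput`, its fields, and `check_retract_safety` are invented for illustration and do not correspond to any existing planner API. It only encodes the condition described above, namely retractions flowing into an operator with no upsert key while a non-deterministic column is present.

```python
from dataclasses import dataclass


@dataclass
class SinkInput:
    # Hypothetical summary of what a planner might know about an operator's input.
    produces_retractions: bool
    upsert_key: tuple = ()
    nondeterministic_columns: tuple = ()


def check_retract_safety(info):
    """Raise at 'compile time' if retraction by complete row cannot be trusted."""
    if (
        info.produces_retractions
        and not info.upsert_key
        and info.nondeterministic_columns
    ):
        cols = ", ".join(info.nondeterministic_columns)
        raise ValueError(
            "Input has no upsert key and contains non-deterministic "
            f"column(s) [{cols}]; retraction by complete row may fail."
        )


# Passes: an upsert key is available, so retraction does not depend on op_ts.
check_retract_safety(SinkInput(True, ("order_id",), ("op_ts",)))

# Fails: no upsert key and a metadata column that differs between messages.
try:
    check_retract_safety(SinkInput(True, (), ("op_ts",)))
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

The column name `op_ts` stands in for a metadata column such as the binlog operation time.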
Issue Links
- blocks: FLINK-24666 Add job level "table.exec.state-stale.error-handling" option and apply to related stateful stream operators (Open)
- is duplicated by: FLINK-27849 Harden correctness for non-deterministic updates present in the changelog pipeline (Open)