  1. Flink
  2. FLINK-28242

CDC source with meta columns may cause error result on downstream stateful operators

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.15.0
    • Fix Version/s: None
    • Component/s: Table SQL / Runtime
    • Labels: None

    Description

      The intermediate result of the current test case temporalJoinITCase#testEventTimeMultiTemporalJoin is wrong:

      
      +I,    5,RMB,40,2020-08-16T00:03,null,null,null,null
      +I,    2,US Dollar,1,2020-08-15T00:02,102,2020-08-15T00:00:02,102,2020-08-15T00:00:02
      +I,    3,RMB,40,2020-08-15T00:03,702,2020-08-15T00:00:04,702,2020-08-15T00:00:04
      -U,   2,US Dollar,1,2020-08-16T00:03,106,2020-08-16T00:02,106,2020-08-16T00:02
      
      ...
      
      

      This happens because the retraction "-U,   2,US Dollar,1,2020-08-16T00:03..." carries a different 'order_time' value than the row it is meant to retract, "+I,    2,US Dollar,1,2020-08-15T00:02...". After the join there is no upsert key, so the downstream operator can only retract by matching the complete row, and that match fails in this case.
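A minimal sketch of the failure, assuming the downstream operator keeps its state as a multiset of complete rows (a simplification for illustration, not Flink's actual state layout). The two rows are copied from the output above; `apply_changelog` is a hypothetical helper.

```python
from collections import Counter


def apply_changelog(changes):
    """Materialize a changelog without an upsert key.

    State is a multiset of complete rows, so a retraction only succeeds
    when every column matches a stored row.
    """
    state = Counter()
    unmatched = []
    for kind, row in changes:
        if kind in ("+I", "+U"):
            state[row] += 1
        elif kind in ("-U", "-D"):
            if state[row] > 0:
                state[row] -= 1
                if state[row] == 0:
                    del state[row]
            else:
                # No stored row equals the retraction: nothing can be removed.
                unmatched.append((kind, row))
    return state, unmatched


changes = [
    ("+I", (2, "US Dollar", 1, "2020-08-15T00:02",
            102, "2020-08-15T00:00:02", 102, "2020-08-15T00:00:02")),
    ("-U", (2, "US Dollar", 1, "2020-08-16T00:03",
            106, "2020-08-16T00:02", 106, "2020-08-16T00:02")),
]

state, unmatched = apply_changelog(changes)
print("rows left in state:", sum(state.values()))
print("unmatched retractions:", len(unmatched))
```

In this sketch the "-U" row finds no equal row in state, so the stale "+I" row stays materialized and the retraction is reported as unmatched.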

      The root cause is that a CDC source can carry metadata columns (e.g., the operation time from the binlog, or the operation type), which make a delete or update_before message differ from the previously emitted version of the row. After an operation such as a join that is not on the primary key of the CDC source, the output no longer has an upsert key, so downstream operators cannot retract correctly.
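To illustrate the root cause, here is a small simulation. It assumes a made-up binlog format and a metadata column `op_ts` that is stamped with the time of the current operation; `cdc_events` and `retract_by_key` are hypothetical helpers, not connector APIs.

```python
def cdc_events(binlog):
    """Turn hypothetical binlog entries into changelog rows (id, value, op_ts).

    The op_ts metadata column reflects the operation that produced the
    message, so an update_before row does not repeat the op_ts of the
    row version it replaces.
    """
    current = {}
    out = []
    for op, pk, value, op_ts in binlog:
        if op == "INSERT":
            out.append(("+I", (pk, value, op_ts)))
        elif op == "UPDATE":
            out.append(("-U", (pk, current[pk], op_ts)))
            out.append(("+U", (pk, value, op_ts)))
        current[pk] = value
    return out


def retract_by_key(events):
    """Materialize by primary key (first column); metadata is not compared."""
    state = {}
    for kind, row in events:
        if kind in ("+I", "+U"):
            state[row[0]] = row
        else:
            state.pop(row[0], None)
    return state


events = cdc_events([("INSERT", 1, "a", 100), ("UPDATE", 1, "b", 200)])
inserted = events[0][1]
update_before = events[1][1]

same_physical = inserted[:2] == update_before[:2]  # id and value agree
same_full_row = inserted == update_before          # op_ts differs

print(same_physical, same_full_row)
print(retract_by_key(events))
```

With the primary key available as an upsert key, the mismatch in `op_ts` is harmless. Once the key is lost, only the full-row comparison remains, and that comparison fails.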

      This is obscure to users. We should at least find a way to report the error to users at compile time, or find another solution that eliminates the problem completely.
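One possible shape for such a compile-time report, purely as a sketch: `ChangelogInput` and `check_retractable` are hypothetical names, and the three flags are assumed to be derivable during planning. This is not how the Flink planner is implemented.

```python
from dataclasses import dataclass


@dataclass
class ChangelogInput:
    """Hypothetical summary of a stateful operator's input."""
    has_retractions: bool
    has_upsert_key: bool
    has_metadata_columns: bool


def check_retractable(inp):
    """Reject plans where retractions can only be matched by complete row
    while metadata columns may make that row differ between versions."""
    if (inp.has_retractions
            and inp.has_metadata_columns
            and not inp.has_upsert_key):
        raise ValueError(
            "input has retractions and metadata columns but no upsert key; "
            "full-row retraction may not match"
        )


# A join on the source primary key keeps the upsert key: accepted.
check_retractable(ChangelogInput(True, True, True))

# A join on a non-key column loses the upsert key: rejected.
try:
    check_retractable(ChangelogInput(True, False, True))
    rejected = False
except ValueError:
    rejected = True
print("rejected:", rejected)
```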

            People

              Assignee: Unassigned
              Reporter: lincoln lee