Flink / FLINK-27849 Harden correctness for non-deterministic updates present in the changelog pipeline / FLINK-28566

Add materialization support to eliminate the non-determinism introduced by the lookup join node

Details

    Description

      Many users lookup join an update stream against an external table. Because the external table is looked up at
      processing time, the result is non-deterministic, which can lead to exceptions or data errors.
      To minimize these, when the input stream contains updates and the lookup key does not contain the primary key
      of the external table, Flink adds materialization of the updates by default. The external table is then looked up
      only when an insert or update_after message arrives. When a delete or update_before message arrives, the operator
      directly queries the latest version of the locally materialized data and sends it to the downstream operator.
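
      The behaviour described above can be illustrated with a small simulation. This is only a sketch under
      stated assumptions, not Flink's operator: `Kind`, `MaterializedLookupSketch` and `process` are hypothetical
      names, rows are reduced to a single string key, the external table is a plain function, and clearing the
      state on delete is an assumption that the description does not spell out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Changelog row kinds, named after the message types in the description. */
enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

/** Hypothetical stand-in for a materializing lookup join; not Flink's implementation. */
class MaterializedLookupSketch {
    // Last lookup result per input key: the "locally materialized data".
    private final Map<String, String> materialized = new HashMap<>();
    // Stand-in for the processing-time lookup against the external table.
    private final Function<String, String> externalLookup;

    MaterializedLookupSketch(Function<String, String> externalLookup) {
        this.externalLookup = externalLookup;
    }

    /** Returns the joined output row for one input message, rendered as text. */
    String process(Kind kind, String key) {
        String joined;
        if (kind == Kind.INSERT || kind == Kind.UPDATE_AFTER) {
            // Only accumulate messages reach the external table.
            joined = externalLookup.apply(key);
            materialized.put(key, joined);
        } else {
            // Retract messages replay what was emitted before, so the
            // retraction matches the earlier output even if the table changed.
            joined = materialized.get(key);
            if (kind == Kind.DELETE) {
                materialized.remove(key); // assumption: state is dropped on delete
            }
        }
        return kind + "(" + key + ", " + joined + ")";
    }
}
```

      If the external row for a key changes from `a` to `b` between an insert and the following update_before,
      the sketch still emits `a` for the update_before, so downstream sees a retraction that matches the row it
      received earlier.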

      To do so, we introduce a new option 'table.exec.lookup-join.upsert-materialize' and reuse the `UpsertMaterialize`. By default, the materialize operator is added when an update stream looks up an external table and the lookup key does not contain the table's primary key (this includes the case where no primary key is defined). You can also choose no materialization (NONE) or forced materialization (FORCE), which always enables materialization unless the input is insert-only.
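
      The three modes can be summarised as a decision function. This is a sketch of the rule as worded in this
      description, not the planner code: `MaterializeMode`, `MaterializeDecisionSketch` and `needsMaterialization`
      are hypothetical names, and `AUTO` is a placeholder for the default value, which the description does not name.

```java
import java.util.Set;

/** NONE and FORCE come from the description; AUTO is a placeholder for the unnamed default. */
enum MaterializeMode { AUTO, NONE, FORCE }

/** Hypothetical helper that restates the rule from the description. */
class MaterializeDecisionSketch {

    static boolean needsMaterialization(
            MaterializeMode mode,
            boolean inputInsertOnly,
            Set<String> lookupKeys,
            Set<String> primaryKey) {
        if (mode == MaterializeMode.NONE) {
            return false; // user opted out
        }
        if (inputInsertOnly) {
            return false; // no updates to materialize, even under FORCE
        }
        if (mode == MaterializeMode.FORCE) {
            return true; // always materialize an update stream
        }
        // Default: materialize unless the lookup key contains the primary key.
        // An empty primary key means "no primary key defined", which counts
        // as not contained (containsAll alone would return true here).
        boolean containsPrimaryKey =
                !primaryKey.isEmpty() && lookupKeys.containsAll(primaryKey);
        return !containsPrimaryKey;
    }
}
```

      The explicit empty-set check matters because `containsAll` returns true for an empty argument, which would
      otherwise treat a table without a primary key as if its key were fully covered.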

            People

              Unassigned
              lincoln lee (lincoln.86xy)