  1. Apache Hudi
  2. HUDI-6188

Unify the logic of intra-partition upsert and cross-partition upsert in the Flink state index.


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • None
    • None
    • Component/s: index
    • 1

    Description

      Currently, for an upsert within a single partition, Hudi compares records by the field named in the precombine.field parameter and keeps the record with the largest value.

      This is widely used to handle out-of-order data: setting precombine.field to the event time keeps the record with the largest event time.
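The intra-partition behavior described above can be illustrated with a minimal sketch. This is a toy model in Python, not Hudi code: the `Record` type, the `upsert` helper, and the tie-breaking rule (the later arrival wins when values are equal) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str         # record key
    event_time: int  # plays the role of the precombine.field value
    value: str


def upsert(table: dict, incoming: Record) -> None:
    """Keep, per key, the record with the largest event_time."""
    current = table.get(incoming.key)
    # Assumption: on a tie, the later arrival replaces the stored record.
    if current is None or incoming.event_time >= current.event_time:
        table[incoming.key] = incoming


table = {}
upsert(table, Record("id1", event_time=10, value="new"))
upsert(table, Record("id1", event_time=5, value="late-arriving old"))
print(table["id1"].value)  # new
```

The second record arrives later but carries a smaller event time, so it is discarded and the table keeps the record with event time 10.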

      However, with the FLINK_STATE index type, the precombine.field parameter does not fully take effect when an upsert moves a record to a different partition (a cross-partition upsert).

      In the cross-partition case, the current logic keeps the record that arrives later, even if its event time is smaller.

      It may be necessary to unify the logic of intra-partition and cross-partition upserts so that the behavior is easier for users to understand and rely on.
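The difference between the two paths, and what a unified rule might look like, can be sketched as follows. This is a toy model in Python rather than Hudi or Flink code: the `Record` type, the `store` dictionary (standing in for index state plus table), the `unified` flag, and the tie-breaking rule are all hypothetical. The `unified=True` branch is only one possible way to reconcile the two paths, not a proposed patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    partition: str
    event_time: int  # stands in for the precombine.field value


def upsert(store: dict, incoming: Record, unified: bool) -> None:
    """store maps record key -> latest Record across all partitions."""
    current = store.get(incoming.key)
    if current is None:
        store[incoming.key] = incoming
        return
    same_partition = current.partition == incoming.partition
    if same_partition or unified:
        # Precombine comparison: keep the larger event_time.
        # Assumption: on a tie, the later arrival wins.
        if incoming.event_time >= current.event_time:
            store[incoming.key] = incoming
    else:
        # Behavior described in this issue: on a partition change,
        # the later arrival wins regardless of event_time.
        store[incoming.key] = incoming


def run(unified: bool) -> Record:
    store = {}
    upsert(store, Record("id1", "2023-05-02", event_time=20), unified)
    # Out-of-order record for the same key, in a different partition.
    upsert(store, Record("id1", "2023-05-01", event_time=10), unified)
    return store["id1"]


current_result = run(unified=False)
unified_result = run(unified=True)
print(current_result.partition, current_result.event_time)  # 2023-05-01 10
print(unified_result.partition, unified_result.event_time)  # 2023-05-02 20
```

With `unified=False` the stale record (event time 10) overwrites the newer one because it arrived later in a different partition. With `unified=True` the same comparison is applied in both cases and the record with event time 20 survives.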

            People

              brucekellan Ying Lin
