  1. Apache Hudi
  2. HUDI-5260

Spark SQL INSERT INTO with strict insert mode and no preCombineField should not overwrite existing records

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • None
    • 0.12.2
    • None

    Description

      A Spark SQL insert overwrites the whole record if a record with the same primary key already exists in a Hudi table that has no preCombineField specified and strict insert mode is enabled.

      To Reproduce

      Steps to reproduce the behavior:

      create table hudi_cow_nonpcf_tbl (
        uuid int,
        name string,
        price double
      ) using hudi;

      set hoodie.sql.insert.mode=strict;

      First insert:
        insert into hudi_cow_nonpcf_tbl select 1, 'a1', 20;

      select * from hudi_cow_nonpcf_tbl;

      Returns:
        1    a1    20.0

      Another insert with the same key, different values:
        insert into hudi_cow_nonpcf_tbl select 1, 'a2', 30;

      select * from hudi_cow_nonpcf_tbl;

      Returns:
        1    a2    30.0

      Expected behavior

      The behavior differs when a preCombineField is specified: in that case Hudi throws an error.
      I would expect the second insert to fail as well if a record with the same key already exists when no preCombineField is specified and strict insert mode is enabled.
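
To make the expected semantics concrete, here is a minimal in-memory sketch in Python. It is not Hudi code and does not reflect how Hudi implements strict insert mode; `strict_insert` and `DuplicateKeyError` are hypothetical names used only for illustration, and the table is modeled as a plain dict keyed by `uuid`.

```python
class DuplicateKeyError(Exception):
    """Raised when a strict insert hits an existing primary key (hypothetical)."""


def strict_insert(table, record, key_field="uuid"):
    """Insert `record` into `table`, a dict keyed by primary key.

    Models the behavior the report expects from strict insert mode:
    reject the write instead of overwriting an existing record.
    """
    key = record[key_field]
    if key in table:
        raise DuplicateKeyError(f"record with {key_field}={key} already exists")
    table[key] = record


table = {}

# First insert: the key is new, so the record is stored.
strict_insert(table, {"uuid": 1, "name": "a1", "price": 20.0})

# Second insert with the same key: expected to fail, leaving the row untouched.
try:
    strict_insert(table, {"uuid": 1, "name": "a2", "price": 30.0})
    second_insert_failed = False
except DuplicateKeyError:
    second_insert_failed = True

print(second_insert_failed)
print(table[1])
```

In this model the second insert is rejected and the stored record still holds `a1` and `20.0`, which is the outcome the report asks for regardless of whether a preCombineField is set.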

      https://github.com/apache/hudi/issues/7266

            People

              Assignee: kazdy
              Reporter: kazdy