Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Description
With strict insert mode enabled, a Spark SQL insert overwrites the whole record if a record with the same primary key already exists in a Hudi table that has no preCombineField specified.
To Reproduce
Steps to reproduce the behavior:
create table hudi_cow_nonpcf_tbl (
  uuid int,
  name string,
  price double
) using hudi;
set hoodie.sql.insert.mode=strict;
-- first insert
insert into hudi_cow_nonpcf_tbl select 1, 'a1', 20;
select * from hudi_cow_nonpcf_tbl;
-- returns
-- 1 a1 20.0
-- another insert with the same key, different values:
insert into hudi_cow_nonpcf_tbl select 1, 'a2', 30;
select * from hudi_cow_nonpcf_tbl;
-- returns
-- 1 a2 30.0
Expected behavior
The behavior differs when a preCombineField is specified: in that case Hudi throws an error on the second insert.
I would expect the second insert to fail whenever a record with the same key already exists and strict insert mode is enabled, even when no preCombineField is specified.
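For comparison, the contrasting case described above can be sketched roughly as follows. This is an illustrative sketch, not part of the original report: the table name hudi_cow_pcf_tbl, the ts column, and its values are assumptions, and the exact error text is not reproduced here.

```sql
-- Hypothetical counterpart table that declares a primary key and a preCombineField.
create table hudi_cow_pcf_tbl (
  uuid int,
  name string,
  price double,
  ts bigint          -- assumed ordering column, used only for illustration
) using hudi
tblproperties (
  primaryKey = 'uuid',
  preCombineField = 'ts'
);

set hoodie.sql.insert.mode=strict;

-- first insert
insert into hudi_cow_pcf_tbl select 1, 'a1', 20, 1000;

-- second insert with the same key: per the report, Hudi rejects this
-- with an error instead of overwriting the existing record
insert into hudi_cow_pcf_tbl select 1, 'a2', 30, 1001;
```

The request in this issue is for hudi_cow_nonpcf_tbl to behave the same way on its second insert.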