Apache Hudi / HUDI-5857

Snapshot query result is wrong after applying insert overwrite to an existing table with simple bucket index


Details

    Description

      The snapshot query result is wrong after applying insert overwrite to an existing table that uses the simple bucket index.

      The bug can be reproduced with the following steps.

      1. create a mor table with bucket index
        create table test_hudi_zj0221(
          id int,
          name string,
          price double,
          ts long,
          dt string) using hudi
        partitioned by (dt)
        options(
          type = 'mor',
          primaryKey = 'id',
          preCombineField = 'ts',
          'hoodie.index.type' = 'BUCKET',
          'hoodie.bucket.index.num.buckets' = '8');
      2. insert data, one row per statement
        insert into test_hudi_zj0221
        select 8 as id, 'hudi3' as name, 30 as price, 3000 as ts, '2021-05-05' as dt;

        insert into test_hudi_zj0221
        select 9 as id, 'hudi3' as name, 30 as price, 3000 as ts, '2021-05-05' as dt;

        insert into test_hudi_zj0221
        select 10 as id, 'hudi3' as name, 30 as price, 3000 as ts, '2021-05-05' as dt;

        insert into test_hudi_zj0221
        select 11 as id, 'hudi3' as name, 30 as price, 3000 as ts, '2021-05-05' as dt;

        insert into test_hudi_zj0221
        select 12 as id, 'hudi3' as name, 30 as price, 3000 as ts, '2021-05-05' as dt;

        insert into test_hudi_zj0221
        select 13 as id, 'hudi3' as name, 30 as price, 3000 as ts, '2021-05-05' as dt;

        insert into test_hudi_zj0221
        select 14 as id, 'hudi3' as name, 30 as price, 3000 as ts, '2021-05-05' as dt;

        insert into test_hudi_zj0221
        select 15 as id, 'hudi3' as name, 30 as price, 3000 as ts, '2021-05-05' as dt;
      3. after finding something wrong in the data, use insert overwrite to overwrite the partition
        insert overwrite table test_hudi_zj0221 partition(dt = '2021-05-05')
        select 2222, 'a2', 30, 3000;
      4. run a snapshot query on the table
        select * from test_hudi_zj0221 where dt = '2021-05-05';
        -- or
        select * from test_hudi_zj0221;
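
      The report does not quote the incorrect output, so the check below is only a sketch of how the result could be verified. The expectations in the comments are an assumption based on the usual insert overwrite semantics (the target partition is replaced by the newly written rows), not output taken from this issue.

```sql
-- Sanity check to run after the insert overwrite step.
-- Assumption: insert overwrite replaces every row in partition dt = '2021-05-05'.
select count(*) as row_cnt
from test_hudi_zj0221
where dt = '2021-05-05';
-- expected under correct semantics: 1

select id, name, price, ts, dt
from test_hudi_zj0221
where dt = '2021-05-05'
order by id;
-- expected under correct semantics: only the row with id = 2222;
-- ids 8 through 15 from the earlier inserts should no longer appear
```

      If the snapshot query returns anything other than the single overwritten row, that matches the wrong result described above.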

            People

              jingzhang Jing Zhang