Hive / HIVE-24840

Materialized View incremental rebuild produces wrong result set after compaction

    Description

      create table t1(a int, b varchar(128), c float) stored as orc TBLPROPERTIES ('transactional'='true');
      insert into t1(a,b, c) values (1, 'one', 1.1), (2, 'two', 2.2), (NULL, NULL, NULL);
      
      create materialized view mat1 stored as orc TBLPROPERTIES ('transactional'='true') as 
                  select a,b,c from t1 where a > 0 or a is null;
      
      delete from t1 where a = 1;
      
      alter table t1 compact 'major';
      
      -- Wait until the compaction has finished.
      alter materialized view mat1 rebuild;
      

      Expected result of query

      select * from mat1;
      
      2 two 2
      NULL NULL NULL
      

      but if incremental rebuild is enabled, the result still contains the deleted row:

      1 one 1
      2 two 2
      NULL NULL NULL
      

      Cause: Incremental rebuild checks whether any source table of the materialized view has had a delete or update transaction since the last rebuild by querying the COMPLETED_TXN_COMPONENTS table in the metastore. However, when a major compaction is performed on a source table, the records related to that table are deleted from COMPLETED_TXN_COMPONENTS. The delete on t1 is therefore no longer visible to the check, and the deleted row remains in mat1 after the rebuild.
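
The cause described above can be illustrated with a small toy model. This is only a sketch and not Hive's actual code: the class `ToyMetastore`, its methods, and the row fields are hypothetical names invented for illustration. The only assumptions taken from the description are that the check reads COMPLETED_TXN_COMPONENTS and that a major compaction removes the rows for the compacted table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletedTxnComponent:
    """Toy stand-in for one row of COMPLETED_TXN_COMPONENTS (hypothetical fields)."""
    txn_id: int
    table: str
    is_update_delete: bool


class ToyMetastore:
    """Hypothetical, heavily simplified model of the metastore bookkeeping."""

    def __init__(self):
        self.completed_txn_components = []
        self._next_txn_id = 1

    def commit(self, table, is_update_delete):
        # Record a committed transaction that touched `table`.
        txn_id = self._next_txn_id
        self._next_txn_id += 1
        self.completed_txn_components.append(
            CompletedTxnComponent(txn_id, table, is_update_delete)
        )
        return txn_id

    def major_compact(self, table):
        # As stated in the description: rows related to the compacted
        # table are deleted from COMPLETED_TXN_COMPONENTS.
        self.completed_txn_components = [
            row for row in self.completed_txn_components if row.table != table
        ]

    def has_update_delete_since(self, table, last_rebuild_txn_id):
        # The check the incremental rebuild relies on (simplified).
        return any(
            row.table == table
            and row.is_update_delete
            and row.txn_id > last_rebuild_txn_id
            for row in self.completed_txn_components
        )


metastore = ToyMetastore()
last_rebuild = metastore.commit("t1", is_update_delete=False)  # insert into t1; mat1 is created afterwards
metastore.commit("t1", is_update_delete=True)                  # delete from t1 where a = 1

seen_before = metastore.has_update_delete_since("t1", last_rebuild)
metastore.major_compact("t1")                                  # alter table t1 compact 'major'
seen_after = metastore.has_update_delete_since("t1", last_rebuild)

print(seen_before, seen_after)  # prints: True False
```

In this model the delete is detected before the compaction and missed after it, which matches the reported behavior: the rebuild proceeds as if t1 had seen no deletes.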

            People

              kkasa Krisztian Kasa

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h 50m