Hive / HIVE-12661

StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.2.1
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels: None

      Description

      PROBLEM:
      Hive statistics are gathered automatically and correctly until 'analyze table [tablename] compute statistics for columns' is run. After that, the statistics are not updated automatically until the command is run again, so queries answered from statistics return stale results. Repro:

      set hive.stats.autogather=true; 
      set hive.stats.atomic=false ; 
      set hive.stats.collect.rawdatasize=true ; 
      set hive.stats.collect.scancols=false ; 
      set hive.stats.collect.tablekeys=false ; 
      set hive.stats.fetch.column.stats=true; 
      set hive.stats.fetch.partition.stats=true ; 
      set hive.stats.reliable=false ; 
      set hive.compute.query.using.stats=true; 
      
      CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( 'orc.compress'='NONE') ; 
      
      insert into calendar values (2010), (2011), (2012); 
      select * from calendar; 
      +----------------+--+ 
      | calendar.year | 
      +----------------+--+ 
      | 2010 | 
      | 2011 | 
      | 2012 | 
      +----------------+--+ 
      
      select max(year) from calendar; 
      | 2012 | 
      
      insert into calendar values (2013); 
      select * from calendar; 
      +----------------+--+ 
      | calendar.year | 
      +----------------+--+ 
      | 2010 | 
      | 2011 | 
      | 2012 | 
      | 2013 | 
      +----------------+--+ 
      
      select max(year) from calendar; 
      | 2013 | 
      
      insert into calendar values (2014); 
      select max(year) from calendar; 
      | 2014 |
      
      analyze table calendar compute statistics for columns;
      
      insert into calendar values (2015);
      select max(year) from calendar;
      | 2014 |
      
      insert into calendar values (2016), (2017), (2018);
      select max(year) from calendar;
      | 2014  |
      
      analyze table calendar compute statistics for columns;
      select max(year) from calendar;
      | 2018  |
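
      A minimal diagnostic sketch, assuming the repro session above is still open and that the stale max(year) is being answered from stored statistics (hive.compute.query.using.stats=true) rather than from the data. It inspects the table parameters, then reruns the query with the statistics shortcut disabled so the result can be compared with the value shown above. No expected output is given here because it depends on the Hive version in use.

      ```sql
      -- Show table metadata; COLUMN_STATS_ACCURATE appears under the
      -- table parameters when it has been set.
      DESCRIBE FORMATTED calendar;

      -- Assumption: disabling this setting makes Hive scan the data
      -- instead of answering the aggregate from metastore statistics.
      set hive.compute.query.using.stats=false;
      select max(year) from calendar;

      -- Restore the value used in the repro.
      set hive.compute.query.using.stats=true;
      ```

      If the second query returns the most recently inserted year while the original one does not, that points at stale statistics being treated as accurate, not at missing data.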
      

        Attachments

        1. HIVE-12661.final.patch
          1.63 MB
          Pengcheng Xiong
        2. HIVE-12661.12.patch
          1.62 MB
          Pengcheng Xiong
        3. HIVE-12661.11.patch
          1.58 MB
          Pengcheng Xiong
        4. HIVE-12661.10.patch
          1.55 MB
          Pengcheng Xiong
        5. HIVE-12661.09.patch
          1.55 MB
          Pengcheng Xiong
        6. HIVE-12661.08.patch
          1.54 MB
          Pengcheng Xiong
        7. HIVE-12661.07.patch
          1.56 MB
          Pengcheng Xiong
        8. HIVE-12661.06.patch
          1.38 MB
          Pengcheng Xiong
        9. HIVE-12661.05.patch
          1.36 MB
          Pengcheng Xiong
        10. HIVE-12661.04.patch
          1.11 MB
          Pengcheng Xiong
        11. HIVE-12661.03.patch
          1.08 MB
          Pengcheng Xiong
        12. HIVE-12661.02.patch
          92 kB
          Pengcheng Xiong
        13. HIVE-12661.01.patch
          49 kB
          Pengcheng Xiong

              People

              • Assignee: Pengcheng Xiong (pxiong)
              • Reporter: Pengcheng Xiong (pxiong)