  1. Hive
  2. HIVE-11160 Auto-gather column stats
  3. HIVE-18108

When basic stats are missing, row count estimation depends on the size of the selected columns


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      When basic stats are not available (especially the row count), the estimated row count changes with the columns being selected:

      set hive.stats.autogather=false;
      create table t (a integer, b string);
      
      insert into t values (1,'asd1');
      insert into t values (2,'asd2');
      insert into t values (3,'asd3');
      insert into t values (4,'asd4');
      insert into t values (5,'asd5');
      
      explain select a,count(1) from t group by a;
      -- estimated to read 8 rows from table t
      explain select b,count(1) from t group by b;
      -- estimated: 1 row
      explain select a,b,count(1) from t group by a,b;
      -- estimated: 1 row
      

      The row count estimate should not depend on the set of columns actually selected.
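
The numbers in the repro above are consistent with an estimate of the form "data size divided by the assumed width of the selected columns". The sketch below only illustrates that arithmetic; it is not Hive's code. The per-type widths (4 bytes for an integer, 100 bytes for a string), the 35-byte data size (five text rows of 7 bytes each), and all function and constant names are assumptions made for illustration, not values taken from this ticket.

```python
# Hypothetical illustration: names and constants are assumptions, not Hive internals.
ASSUMED_TYPE_WIDTH_BYTES = {"integer": 4, "string": 100}
SCHEMA = {"a": "integer", "b": "string"}  # table t from the repro
ASSUMED_DATA_SIZE_BYTES = 35  # assumption: five text rows of 7 bytes each


def estimate_rows_from_selected(data_size_bytes, selected_columns):
    """Estimate rows using only the assumed width of the selected columns."""
    row_width = sum(ASSUMED_TYPE_WIDTH_BYTES[SCHEMA[c]] for c in selected_columns)
    # Never report fewer than one row.
    return max(1, data_size_bytes // row_width)


def estimate_rows_from_schema(data_size_bytes):
    """Estimate rows using the assumed width of every column in the table."""
    row_width = sum(ASSUMED_TYPE_WIDTH_BYTES[t] for t in SCHEMA.values())
    return max(1, data_size_bytes // row_width)


if __name__ == "__main__":
    for cols in (["a"], ["b"], ["a", "b"]):
        print(cols, estimate_rows_from_selected(ASSUMED_DATA_SIZE_BYTES, cols))
    print("full schema", estimate_rows_from_schema(ASSUMED_DATA_SIZE_BYTES))
```

Under these assumptions the first function yields 8, 1 and 1 for the three queries, which matches the estimates reported above, while the second yields the same value whichever columns a query selects. Whether the actual fix takes this shape is not stated in the ticket.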

      Attachments

        1. HIVE-18108.03.patch
          1.16 MB
          Zoltan Haindrich
        2. HIVE-18108.02.patch
          1.15 MB
          Zoltan Haindrich
        3. HIVE-18108.02wip02.patch
          24 kB
          Zoltan Haindrich
        4. HIVE-18108.01.patch
          15 kB
          Zoltan Haindrich

            People

              Assignee: Zoltan Haindrich (kgyrtkirk)
              Reporter: Zoltan Haindrich (kgyrtkirk)
              Votes: 0
              Watchers: 1
