Hive / HIVE-17465

Statistics: Drill-down filters don't reduce row-counts progressively

Details

    Description

      explain select count(d_date_sk) from date_dim where d_year=2001 ;
      explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 9;
      explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 and d_dom = 21;
      

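      All three queries end up with the same row-count estimate after the filter (Num rows: 363), even though each query adds a further equality predicate. Only the estimated data size changes (4356, 5808, 7260).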

                  Map Operator Tree:
                      TableScan
                        alias: date_dim
                        filterExpr: (d_year = 2001) (type: boolean)
                        Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: (d_year = 2001) (type: boolean)
                          Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
       
              Map 1 
                  Map Operator Tree:
                      TableScan
                        alias: date_dim
                        filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
                        Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
                          Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
              Map 1 
                  Map Operator Tree:
                      TableScan
                        alias: date_dim
                        filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
                        Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
                          Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
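
      For comparison, here is a minimal sketch of the progressive reduction one might expect from the three plans above. It is not Hive's estimator. It assumes the filtered columns are independent and uniformly distributed, so each equality predicate divides the running estimate by that column's number of distinct values (NDV). The function name and the NDV values are hypothetical: 201 for d_year is picked only because it reproduces the 363 rows reported above, and 12 and 31 are the calendar counts of months and days, not values read from the column statistics.

```python
def estimate_rows(total_rows, ndvs):
    """Return the row estimate after each successive ANDed equality predicate.

    Assumes independent, uniformly distributed columns, so every predicate
    scales the running estimate by 1 / NDV of its column.
    """
    rows = float(total_rows)
    estimates = []
    for ndv in ndvs:
        rows /= ndv
        # Never estimate fewer than one row for a non-empty input.
        estimates.append(max(1, round(rows)))
    return estimates


TOTAL_ROWS = 73049   # Num rows from the TableScan statistics above

# Hypothetical NDVs, labeled as assumptions (not taken from the issue):
NDV_D_YEAR = 201     # chosen so that the first estimate matches 363
NDV_D_MOY = 12       # months in a year
NDV_D_DOM = 31       # maximum days in a month

estimates = estimate_rows(TOTAL_ROWS, [NDV_D_YEAR, NDV_D_MOY, NDV_D_DOM])
for label, rows in zip(["d_year", "+ d_moy", "+ d_dom"], estimates):
    print(f"{label:8s} -> {rows} rows")
```

      Under these assumptions the estimate shrinks with every added predicate instead of staying at 363 for all three queries.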
      

      Attachments

        1. HIVE-17465.1.patch
          11 kB
          Vineet Garg
        2. HIVE-17465.2.patch
          1.20 MB
          Vineet Garg
        3. HIVE-17465.3.patch
          1.06 MB
          Vineet Garg
        4. HIVE-17465.4.patch
          173 kB
          Vineet Garg
        5. HIVE-17465.5.patch
          150 kB
          Vineet Garg
        6. HIVE-17465.6.patch
          150 kB
          Vineet Garg
        7. HIVE-17465.7.patch
          121 kB
          Vineet Garg

            People

              vgarg Vineet Garg
              gopalv Gopal Vijayaraghavan
              Votes: 0
              Watchers: 5
