Hive / HIVE-17465

Statistics: Drill-down filters don't reduce row-counts progressively


Details

    Description

      explain select count(d_date_sk) from date_dim where d_year = 2001;
      explain select count(d_date_sk) from date_dim where d_year = 2001 and d_moy = 9;
      explain select count(d_date_sk) from date_dim where d_year = 2001 and d_moy = 9 and d_dom = 21;
      

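      All three queries end up with the same row-count estimate after the filter (363 rows), even though each additional predicate should narrow the estimate further. Only the data size changes between the plans.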

                  Map Operator Tree:
                      TableScan
                        alias: date_dim
                        filterExpr: (d_year = 2001) (type: boolean)
                        Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: (d_year = 2001) (type: boolean)
                          Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
       
              Map 1 
                  Map Operator Tree:
                      TableScan
                        alias: date_dim
                        filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
                        Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
                          Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
              Map 1 
                  Map Operator Tree:
                      TableScan
                        alias: date_dim
                        filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
                        Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
                        Filter Operator
                          predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
                          Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
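
      For comparison, here is a minimal sketch of what progressively shrinking estimates could look like. It uses the textbook independence assumption (each equality predicate contributes a selectivity of 1/NDV), which is not necessarily what Hive or the attached patches implement. The table row count comes from the plans above; the NDV values and the helper name estimate_rows are assumptions chosen for illustration only.

```python
def estimate_rows(num_rows, ndvs):
    """Estimate rows surviving a conjunction of equality predicates.

    Assumes the columns are independent and uniformly distributed, so each
    predicate `col = const` has selectivity 1 / NDV(col).
    """
    selectivity = 1.0
    for ndv in ndvs:
        selectivity *= 1.0 / ndv
    # Clamp to at least one row so the estimate never collapses to zero.
    return max(1, round(num_rows * selectivity))


TABLE_ROWS = 73049  # "Num rows" reported by the TableScan in the plans above

# Hypothetical distinct-value counts; not taken from the issue.
NDV = {"d_year": 201, "d_moy": 12, "d_dom": 31}

for cols in (["d_year"], ["d_year", "d_moy"], ["d_year", "d_moy", "d_dom"]):
    estimate = estimate_rows(TABLE_ROWS, [NDV[c] for c in cols])
    print(" and ".join(cols), "->", estimate)
```

      Under these assumed NDVs the estimate drops with every added predicate instead of staying at 363, which is the behaviour the issue title asks for.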
      

      Attachments

        1. HIVE-17465.7.patch
          121 kB
          Vineet Garg
        2. HIVE-17465.6.patch
          150 kB
          Vineet Garg
        3. HIVE-17465.5.patch
          150 kB
          Vineet Garg
        4. HIVE-17465.4.patch
          173 kB
          Vineet Garg
        5. HIVE-17465.3.patch
          1.06 MB
          Vineet Garg
        6. HIVE-17465.2.patch
          1.20 MB
          Vineet Garg
        7. HIVE-17465.1.patch
          11 kB
          Vineet Garg

          People

            Vineet Garg (vgarg)
            Gopal Vijayaraghavan (gopalv)
            Votes: 0
            Watchers: 6