Hive / HIVE-18823

Vectorization: introduce qtest for SUM (IF/WHEN) with vectorization for ORC

Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Hive
    • Labels: None

    Description

      HIVE-16110 introduced some issues when using SUM aggregations over IF or CASE WHEN expressions.
      As far as I can see, there are no separate qtests validating this case; vectorized_case is quite close, but not the same.

      The test case would be:

      set hive.vectorized.execution.enabled=false;
      drop table if exists vectorization_sum_if_when_a;
      drop table if exists vectorization_sum_if_when_b;
      create table vectorization_sum_if_when_a (x int) stored as orc;
      insert into table vectorization_sum_if_when_a values (0), (1), (0), (NULL), (NULL), (NULL), (NULL), (NULL), (NULL), (NULL);
      create table vectorization_sum_if_when_b (x int) stored as orc;
      insert into table vectorization_sum_if_when_b select least(t1.x + t2.x + t3.x + t4.x, 1) from vectorization_sum_if_when_a t1, vectorization_sum_if_when_a t2, vectorization_sum_if_when_a t3, vectorization_sum_if_when_a t4;
      select count(*), x from vectorization_sum_if_when_b group by x;
      
      select sum(IF(x is null, 1, 0)), count(1) from vectorization_sum_if_when_b;
      select sum(IF(x=1, 1, 0)), count(1) from vectorization_sum_if_when_b;
      select sum((case WHEN x = 1 THEN 1 else 0 end)) from vectorization_sum_if_when_b;
      select sum((case WHEN x = 1 THEN 1 else 0 end)), sum((case WHEN x = 1 THEN 1 when x is null then 0 else 0 end)) from vectorization_sum_if_when_b;
      
      set hive.vectorized.execution.enabled=true;
      select sum(IF(x is null, 1, 0)), count(1) from vectorization_sum_if_when_b;
      select sum(IF(x=1, 1, 0)), count(1) from vectorization_sum_if_when_b;
      select sum((case WHEN x = 1 THEN 1 else 0 end)) from vectorization_sum_if_when_b;
      select sum((case WHEN x = 1 THEN 1 else 0 end)), sum((case WHEN x = 1 THEN 1 when x is null then 0 else 0 end)) from vectorization_sum_if_when_b;
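
The values these queries should return, with vectorization on or off, can be cross-checked outside Hive. The following is a minimal Python sketch, not part of the patch: it models the 4-way cross join and the least(...) expression, assuming that NULL propagates through the addition and that least() returns NULL when any argument is NULL (the documented behaviour since Hive 2.0.0). The names TABLE_A, build_table_b and expected_results are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Rows inserted into vectorization_sum_if_when_a; None stands in for SQL NULL.
TABLE_A = [0, 1, 0] + [None] * 7


def build_table_b(rows):
    """Model the 4-way cross join and least(t1.x + t2.x + t3.x + t4.x, 1)."""
    result = []
    for combo in product(rows, repeat=4):
        if None in combo:
            # Assumed: a NULL operand makes both the addition and least() NULL.
            result.append(None)
        else:
            result.append(min(sum(combo), 1))
    return result


def expected_results(rows=TABLE_A):
    """Compute the aggregates the test queries are expected to return."""
    b = build_table_b(rows)
    return {
        "count": len(b),
        # sum(IF(x is null, 1, 0))
        "sum_if_x_is_null": sum(1 if x is None else 0 for x in b),
        # sum(IF(x=1, 1, 0)) and both CASE WHEN variants: NULL = 1 is not true,
        # so NULL rows fall through to the 0 branch.
        "sum_if_x_eq_1": sum(1 if x == 1 else 0 for x in b),
        # select count(*), x ... group by x
        "group_by_x": {key: b.count(key) for key in (None, 0, 1)},
    }


if __name__ == "__main__":
    print(expected_results())
```

Under these assumptions the table holds 10000 rows: 9919 NULLs, 16 zeros and 65 ones, so both the IF(x=1, ...) and the CASE WHEN sums should come out as 65 in vectorized and non-vectorized mode alike.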
      

      Attachments

        1. HIVE-18823.01.patch
          9 kB
          László Bodor

          People

            Assignee: László Bodor (abstractdog)
            Reporter: László Bodor (abstractdog)
            Votes: 0
            Watchers: 6