Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-12435

SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 2.0.0
    • 2.0.0
    • Vectorization
    • None

    Description

      Run the following query:

      create table count_case_groupby (key string, bool boolean) STORED AS orc;
      insert into table count_case_groupby values ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL);
      

      The table contains the following:

      key1	true
      key2	false
      key3	NULL
      key4	false
      key5	NULL
      

      The below query returns:

      SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
      key1	1
      key2	1
      key3	1
      key4	1
      key5	1
      

      while it expects the following results:

      key1	1
      key2	1
      key3	0
      key4	1
      key5	0
      

      The query works with hive ver 1.2. Also it works when a table is not orc format.
      Also even if it's an orc table, when vectorization is disabled, the query works.

      Attachments

        1. HIVE-12435.01.patch
          13 kB
          Matt McCline
        2. HIVE-12435.02.patch
          18 kB
          Matt McCline
        3. HIVE-12435.03.patch
          18 kB
          Matt McCline
        4. HIVE-12435.04.patch
          18 kB
          Matt McCline

        Issue Links

          Activity

            People

              mmccline Matt McCline
              taksaito Takahiko Saito
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: