Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-2178

Impala returns incorrect value when group by multiple CHAR column

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • Impala 2.1, Impala 2.2
    • Impala 2.2.7, Impala 2.3.0
    • None

    Description

      When group by contains multiple CHAR columns which have size <=2, Impala returns incorrect values for some columns.
      Step to reproduce:
      1. create a data file and put it into hdfs

      10067481,A,0000,1A,3001,T,9,74.01
      10067481,C,0020,0A,1001, ,4,834.01
      10067488,I,0000,0A,1001,P,1,234.01
      10067488, ,0002,0B,1001,S,5,34.01
      9006748,S,0000,0A,1001, ,8,234.01
      10067488,F,0100,2A,1000,P,1,45234.01
      10067481,C,0020,0A,1001, ,0,2034.01
      10067483,I,0000,0A,1001,P,1,234.01
      10067483, ,0000,0A,1001,P,3,34.01
      9006748,I,0000,0A,1001,P,6,24.01
      10067483,G,0000,0A,1001,P,5,1234.01
      10067484,I,0000,0A,1001,P,6,24.01
      9006748,I,0000,0A,1001,P,4,74.01
      10067484,S,0400,0A,1001,P,8,234.01
      10067489,A,0400,1A,3001,T,9,74.01
      9006748,T,0040,0A,1001,P,2,1134.01
      10067489,C,0020,0A,1001, ,4,834.01
      10067483,I,0000,0A,1005, ,4,4.01
      10067483,G,0000,0A,1005, ,9,24.00
      10067484,T,0300,0A,1001,P,2,1134.01
      10067484,I,0300,0A,1005,P,4,74.01
      10067489,C,0020,0A,1001, ,0,2034.01
      10067488,F,0100,2A,1001,P,6,278.1
      

      2. create table that contains multiple char columns by using this data file.

      create external table test_groupby (
        col_id  int,
        col_char1  CHAR(1),
        col_char4 CHAR(4),
        col_char2       CHAR(2),
        col2_char4 CHAR(4),
        col2_char1  CHAR(1),
        col_decimal1   DECIMAL(1,0),
        col_decimal2  DECIMAL(14,2)
        )
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      location '/test-warehouse/test_groupby'
      

      3. run a group by query. the "col_id" in resultset doesn't exist in original data. however, the row count and rest columns contain right value.

      Query: select col_id,
      col_char2,
      col2_char1,
      Sum(col_decimal1)     sum1,
      Sum(col_decimal2)     sum2
      FROM  test_groupby_txt
      GROUP  BY col_id,
      col_char1,
      col_char2,
      col2_char1
      +-----------------+---------+------------------+-------------+----------------+
      | col_id            | col_char2 | col2_char1 | sum1 | sum2|
      +-----------------+---------+------------------+-------------+----------------+
      | 10067521        | 0A      |                  | 16          | 11472.08       |
      | 9006657         | 0A      | P                | 20          | 196.04         |
      | 10067521        | 1A      | T                | 36          | 296.04         |
      | 9006657         | 0A      | P                | 8           | 234.01         |
      | 10067521        | 0A      | P                | 16          | 468.02         |
      | 10067521        | 0A      | P                | 4           | 2268.02        |
      | 10067522        | 0B      | S                | 10          | 68.02          |
      | 10067521        | 2A      | P                | 14          | 91024.22       |
      | 9006657         | 0A      |                  | 8           | 234.01         |
      | 10067521        | 0A      | P                | 6           | 68.02          |
      | 9006657         | 0A      | P                | 4           | 2268.02        |
      | 10067521        | 0A      | P                | 19          | 2492.02        |
      | 10067521        | 0A      |                  | 4           | 4.01           |
      | 10067521        | 0A      | P                | 28          | 1136.09        |
      | 10067521        | 0A      |                  | 9           | 24.00          |
      +-----------------+---------+------------------+-------------+----------------+
      

      Attachments

        Activity

          People

            skye Skye Wanderman-Milne
            jyu@cloudera.com Juan Yu
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: