IMPALA-901

Incorrect result with group by query with null value in group by data

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 1.3
    • Fix Version/s: Impala 1.3.1
    • Component/s: None
    • Labels: None

      Description

      I've tried this on master, ed4cb660b7a60d9b9248df525c477bab4d218c4b, and on a nightly c5 cluster.

      This problem seems to be data-dependent: changing a single value in the underlying table can make the results correct (or incorrect). Changing the column type from tinyint to int can also make the results correct, but I'm not sure whether the reverse is true.

      The first select query below returns incorrect results (the row with the null is missing); the related queries that follow return correct results.

      [nightly-2.ent.cloudera.com:21000] > create table foo (col_1 int, col_2 tinyint);
      Query: create table foo (col_1 int, col_2 tinyint)
      
      Returned 0 row(s) in 0.17s
      
      
      [nightly-2.ent.cloudera.com:21000] > insert into foo values (0, -59), (0, null), (0, -4);
      Query: insert into foo values (0, -59), (0, null), (0, -4)
      Inserted 3 rows in 1.28s
      
      
      [nightly-2.ent.cloudera.com:21000] > select col_1, col_2 from foo group by 1, 2;
      Query: select col_1, col_2 from foo group by 1, 2
      +-------+-------+
      | col_1 | col_2 |
      +-------+-------+
      | 0     | -4    |
      | 0     | -59   |
      +-------+-------+
      Returned 2 row(s) in 0.05s
      

      Changing the first value by 1 (-59 to -60) gives the correct result:

      [nightly-2.ent.cloudera.com:21000] > drop table foo;
      Query: drop table foo
      
      
      [nightly-2.ent.cloudera.com:21000] > create table foo (col_1 int, col_2 tinyint);
      Query: create table foo (col_1 int, col_2 tinyint)
      
      Returned 0 row(s) in 0.35s
      
      
      [nightly-2.ent.cloudera.com:21000] > insert into foo values (0, -60), (0, null), (0, -4);
      Query: insert into foo values (0, -60), (0, null), (0, -4)
      Inserted 3 rows in 1.28s
      
      
      [nightly-2.ent.cloudera.com:21000] > select col_1, col_2 from foo group by 1, 2;
      Query: select col_1, col_2 from foo group by 1, 2
      +-------+-------+
      | col_1 | col_2 |
      +-------+-------+
      | 0     | -4    |
      | 0     | -60   |
      | 0     | NULL  |
      +-------+-------+
      Returned 3 row(s) in 0.07s
      

      Changing the data type of col_2 from tinyint to int also gives the correct result:

      [nightly-2.ent.cloudera.com:21000] > drop table foo;
      Query: drop table foo
      
      
      [nightly-2.ent.cloudera.com:21000] > create table foo (col_1 int, col_2 int);
      Query: create table foo (col_1 int, col_2 int)
      
      Returned 0 row(s) in 0.27s
      
      
      [nightly-2.ent.cloudera.com:21000] > insert into foo values (0, -59), (0, null), (0, -4);
      Query: insert into foo values (0, -59), (0, null), (0, -4)
      Inserted 3 rows in 1.60s
      
      
      [nightly-2.ent.cloudera.com:21000] > select col_1, col_2 from foo group by 1, 2;
      Query: select col_1, col_2 from foo group by 1, 2
      +-------+-------+
      | col_1 | col_2 |
      +-------+-------+
      | 0     | -4    |
      | 0     | NULL  |
      | 0     | -59   |
      +-------+-------+
      Returned 3 row(s) in 0.05s
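
For reference, SQL grouping treats NULL as its own group, so all three data sets above should return three rows. The following is a minimal sketch that cross-checks that expectation with Python's built-in sqlite3 module. SQLite is only a stand-in reference engine here (it accepts the tinyint type name but stores it as a plain integer), so this shows the expected output and does not reproduce the Impala bug. The helper name grouped_rows and the CASES list are hypothetical, introduced only for this illustration.

```python
import sqlite3


# Hypothetical helper (not part of Impala): runs the report's query shape
# against an in-memory SQLite database used purely as a reference engine.
def grouped_rows(col_2_type, rows):
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"create table foo (col_1 int, col_2 {col_2_type})")
        conn.executemany("insert into foo values (?, ?)", rows)
        # Explicit column names; same grouping as "group by 1, 2" in the report.
        cur = conn.execute("select col_1, col_2 from foo group by col_1, col_2")
        return set(cur.fetchall())
    finally:
        conn.close()


# The three data sets from the report: (col_2 type, inserted rows).
CASES = [
    ("tinyint", [(0, -59), (0, None), (0, -4)]),
    ("tinyint", [(0, -60), (0, None), (0, -4)]),
    ("int", [(0, -59), (0, None), (0, -4)]),
]

for col_2_type, rows in CASES:
    result = grouped_rows(col_2_type, rows)
    # Each distinct (col_1, col_2) pair, including the NULL one, forms a group.
    print(col_2_type, len(result), (0, None) in result)
```

In each case the NULL row (returned by sqlite3 as None) is present in the grouped output, which is the behavior the first Impala query above fails to show.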
      

    People

    • Assignee: Henry Robinson (henryr)
    • Reporter: casey (caseyc)