Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-3938

Semantically incorrect query referencing nested collections is accepted and run (with arbitrary results).

    XMLWordPrintableJSON

    Details

      Description

      In some situations, Impala accepts a semantically incorrect query that references nested collections. The returned results are arbitrary.

      Example and repro
      Setup:
      In Hive create and populate this table:

      create table dummy (x int);
      insert into dummy values(10);
      
      create table test_table(id int, locations array<map<string,string>>) stored as parquet;
      
      insert into table test_table select 10001, array( map("Location1.City","Bratislava","Location1.Country","SK","Location1.LAT","41"), map("Location2.City","Kosice","Location2.Country","SK","Location2.LAT","42") ) from dummy limit 1;
      
      insert into table test_table select 10002, array( map("Location1.City","Wien","Location1.Country","AT","Location1.LAT","40"), map("Location2.City","Graz","Location2.Country","AT","Location2.LAT","40") ) from dummy limit 1;
      

      Impala will incorrectly execute this invalid query:

      select t.id, l.pos, l.key, l.value
      from test_table t join t.locations l
      order by id, l.pos;
      

      The above query is semantically not correct because "l.key" and "l.value" are fields in the nested map in "l" but the map itself is not referenced in the FROM clause. The query should fail semantic analysis.

      The correct query references the nested map in the FROM clause:

      select t.id, l.pos, m.key, m.value
      from test_table t join t.locations l join l.item m
      order by id, l.pos;
      +-------+-----+-------------------+------------+
      | id    | pos | key               | value      |
      +-------+-----+-------------------+------------+
      | 10001 | 0   | Location1.Country | SK         |
      | 10001 | 0   | Location1.City    | Bratislava |
      | 10001 | 0   | Location1.LAT     | 41         |
      | 10001 | 1   | Location2.LAT     | 42         |
      | 10001 | 1   | Location2.Country | SK         |
      | 10001 | 1   | Location2.City    | Kosice     |
      | 10002 | 0   | Location1.Country | AT         |
      | 10002 | 0   | Location1.City    | Wien       |
      | 10002 | 0   | Location1.LAT     | 40         |
      | 10002 | 1   | Location2.LAT     | 40         |
      | 10002 | 1   | Location2.Country | AT         |
      | 10002 | 1   | Location2.City    | Graz       |
      +-------+-----+-------------------+------------+
      

        Attachments

          Activity

            People

            • Assignee:
              cchanning Chris Channing
              Reporter:
              alex.behm Alexander Behm
            • Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: