Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-847

AnalysisException on query with join+group by using same expression twice+agg(distinct)

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • Impala 1.3
    • Impala 1.3
    • None
    • None

    Description

      I only tried this on master.

      The first query below seems to be triggering some sort of edge case bug. If one of the GROUP BYs is removed or the DISTINCT is removed, then the query runs.

      [localhost:21000] > select t1.int_col AS int_col_1, t1.smallint_col AS smallint_col_2, t1.int_col AS int_col_3, SUM(DISTINCT 1) AS int_col_4 FROM alltypes AS t1 INNER JOIN alltypestiny AS t2 ON t2.smallint_col = t1.smallint_col GROUP BY t1.int_col, t1.smallint_col, t1.int_col;
      Query: select t1.int_col AS int_col_1, t1.smallint_col AS smallint_col_2, t1.int_col AS int_col_3, SUM(DISTINCT 1) AS int_col_4 FROM alltypes AS t1 INNER JOIN alltypestiny AS t2 ON t2.smallint_col = t1.smallint_col GROUP BY t1.int_col, t1.smallint_col, t1.int_col
      ERROR: AnalysisException: null
      CAUSED BY: IndexOutOfBoundsException: Index: 3, Size: 3
      
      
      [localhost:21000] > select t1.int_col AS int_col_1, t1.smallint_col AS smallint_col_2, SUM(DISTINCT 1) AS int_col_4 FROM alltypes AS t1 INNER JOIN alltypestiny AS t2 ON t2.smallint_col = t1.smallint_col GROUP BY t1.int_col, t1.smallint_col;
      Query: select t1.int_col AS int_col_1, t1.smallint_col AS smallint_col_2, SUM(DISTINCT 1) AS int_col_4 FROM alltypes AS t1 INNER JOIN alltypestiny AS t2 ON t2.smallint_col = t1.smallint_col GROUP BY t1.int_col, t1.smallint_col
      +-----------+----------------+-----------+
      | int_col_1 | smallint_col_2 | int_col_4 |
      +-----------+----------------+-----------+
      | 0         | 0              | 1         |
      | 1         | 1              | 1         |
      +-----------+----------------+-----------+
      Returned 2 row(s) in 0.04s
      
      
      [localhost:21000] > select t1.int_col AS int_col_1, t1.smallint_col AS smallint_col_2, t1.int_col AS int_col_3 FROM alltypes AS t1 INNER JOIN alltypestiny AS t2 ON t2.smallint_col = t1.smallint_col GROUP BY t1.int_col, t1.smallint_col, t1.int_col;
      Query: select t1.int_col AS int_col_1, t1.smallint_col AS smallint_col_2, t1.int_col AS int_col_3 FROM alltypes AS t1 INNER JOIN alltypestiny AS t2 ON t2.smallint_col = t1.smallint_col GROUP BY t1.int_col, t1.smallint_col, t1.int_col
      +-----------+----------------+-----------+
      | int_col_1 | smallint_col_2 | int_col_3 |
      +-----------+----------------+-----------+
      | 0         | 0              | 0         |
      | 1         | 1              | 1         |
      +-----------+----------------+-----------+
      Returned 2 row(s) in 0.04s
      
      
      [localhost:21000] > select t1.int_col AS int_col_1, t1.smallint_col AS smallint_col_2, t1.int_col AS int_col_3, SUM(1) AS int_col_4 FROM alltypes AS t1 INNER JOIN alltypestiny AS t2 ON t2.smallint_col = t1.smallint_col GROUP BY t1.int_col, t1.smallint_col, t1.int_col;
      Query: select t1.int_col AS int_col_1, t1.smallint_col AS smallint_col_2, t1.int_col AS int_col_3, SUM(1) AS int_col_4 FROM alltypes AS t1 INNER JOIN alltypestiny AS t2 ON t2.smallint_col = t1.smallint_col GROUP BY t1.int_col, t1.smallint_col, t1.int_col
      +-----------+----------------+-----------+-----------+
      | int_col_1 | smallint_col_2 | int_col_3 | int_col_4 |
      +-----------+----------------+-----------+-----------+
      | 0         | 0              | 0         | 2920      |
      | 1         | 1              | 1         | 2920      |
      +-----------+----------------+-----------+-----------+
      Returned 2 row(s) in 0.06s
      

      Attachments

        Activity

          People

            alex.behm Alexander Behm
            caseyc casey
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: