IMPALA-817

Wrong expression may be used in aggregate query if there are multiple similar expressions


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 1.3
    • Fix Version/s: Impala 1.3
    • Component/s: None
    • Labels: None

    Description

      I only saw this in the master branch, not in 1.2.4 or 1.3.0.

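      In the plan below, the HAVING predicate is applied to int_col + int_col instead of int_col * int_col, and int_col * int_col is missing from the group-by lists of both aggregation nodes. This seems to happen with expressions that use the same columns and differ only by an operator.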

      [localhost:21000] > explain select int_col + int_col, int_col * int_col FROM alltypesagg GROUP BY int_col + int_col, int_col * int_col having (int_col * int_col) < 0 limit 10;
      Query: explain select int_col + int_col, int_col * int_col FROM alltypesagg GROUP BY int_col + int_col, int_col * int_col having (int_col * int_col) < 0 limit 10
      +----------------------------------------------------------+
      | Explain String                                           |
      +----------------------------------------------------------+
      | Estimated Per-Host Requirements: Memory=74.00MB VCores=2 |
      |                                                          |
      | 04:EXCHANGE [PARTITION=UNPARTITIONED]                    |
      | |  limit: 10                                             |
      | |                                                        |
      | 03:AGGREGATE [MERGE FINALIZE]                            |
      | |  group by: int_col + int_col                           |
      | |  having: int_col + int_col < 0                         |
      | |  limit: 10                                             |
      | |                                                        |
      | 02:EXCHANGE [PARTITION=HASH(int_col + int_col)]          |
      | |                                                        |
      | 01:AGGREGATE                                             |
      | |  group by: int_col + int_col                           |
      | |                                                        |
      | 00:SCAN HDFS [functional.alltypesagg]                    |
      |    partitions=10/10 size=743.67KB                        |
      +----------------------------------------------------------+
      Returned 17 row(s) in 0.01s
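
To illustrate why the substituted predicate matters, here is a minimal sketch that runs the same query shape with both predicates. It uses Python's sqlite3 module as a stand-in for Impala, and the table contents are made up for the example (they are not the real functional.alltypesagg data), so it only shows the expected semantics, not Impala's behavior.

```python
import sqlite3

# Hypothetical stand-in for functional.alltypesagg: one int_col column
# holding a few made-up values (not the real test data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alltypesagg (int_col INTEGER)")
conn.executemany(
    "INSERT INTO alltypesagg VALUES (?)",
    [(-2,), (-1,), (0,), (1,), (2,), (2,)],
)

# Same shape as the query in the report; ORDER BY is added only to make
# the output deterministic, and the HAVING predicate is filled in below.
QUERY = """
SELECT int_col + int_col, int_col * int_col
FROM alltypesagg
GROUP BY int_col + int_col, int_col * int_col
HAVING {predicate}
ORDER BY 1
"""

# Predicate as written in the report: a small integer times itself is
# never negative, so no group should pass.
intended = conn.execute(
    QUERY.format(predicate="(int_col * int_col) < 0")
).fetchall()

# Predicate as it appears in the plan: every negative int_col passes.
as_planned = conn.execute(
    QUERY.format(predicate="(int_col + int_col) < 0")
).fetchall()

print("intended:  ", intended)
print("as planned:", as_planned)
```

With this sample data the two predicates select different sets of groups, so a plan that evaluates int_col + int_col < 0 in place of (int_col * int_col) < 0 can return rows that the query as written should filter out.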
      


          People

            alex.behm Alexander Behm
            caseyc casey