Hive
  1. Hive
  2. HIVE-2597

Repeated key in GROUP BY is erroneously displayed when using DISTINCT

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels:
      None

      Description

      The following query was simplified for illustration purposes.

      This works correctly:

      select client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid
      

      The intent here is to produce two empty columns in between data.

      The following query does not work:

      select distinct client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid
      
      FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY ""
      

      The key is not repeated since the aliases were given. Seems like Hive is ignoring the aliases when the "distinct" keyword is specified.

      1. HIVE-2597.D8967.2.patch
        37 kB
        Phabricator
      2. HIVE-2597.D8967.1.patch
        32 kB
        Phabricator
      3. HIVE-2597.4.patch.txt
        44 kB
        Navis
      4. HIVE-2597.3.patch.txt
        15 kB
        Navis

        Issue Links

          Activity

          Hide
          Thejas M Nair added a comment -

          This has been fixed in 0.14 release. Please open new jira if you see any issues.

          Show
          Thejas M Nair added a comment - This has been fixed in 0.14 release. Please open new jira if you see any issues.
          Hide
          Ashutosh Chauhan added a comment -

          Committed to trunk. Thanks, Navis!

          Show
          Ashutosh Chauhan added a comment - Committed to trunk. Thanks, Navis!
          Hide
          Ashutosh Chauhan added a comment -

          +1

          Show
          Ashutosh Chauhan added a comment - +1
          Hide
          Navis added a comment -

          Ashutosh Chauhan Could you review this?

          Show
          Navis added a comment - Ashutosh Chauhan Could you review this?
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12652100/HIVE-2597.4.patch.txt

          ERROR: -1 due to 2 failed/errored test(s), 5655 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/570/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/570/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-570/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 2 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12652100

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652100/HIVE-2597.4.patch.txt ERROR: -1 due to 2 failed/errored test(s), 5655 tests executed Failed tests: org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/570/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/570/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-570/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed This message is automatically generated. ATTACHMENT ID: 12652100
          Hide
          Navis added a comment -

          Updated XML results

          Show
          Navis added a comment - Updated XML results
          Hide
          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12651945/HIVE-2597.3.patch.txt

          ERROR: -1 due to 17 failed/errored test(s), 5670 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_ppd_key_range
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_join1
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_join3
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_join4
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_join5
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_join6
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_join7
          org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8
          org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/568/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/568/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-568/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 17 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12651945

          Show
          Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12651945/HIVE-2597.3.patch.txt ERROR: -1 due to 17 failed/errored test(s), 5670 tests executed Failed tests: org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_ppd_key_range org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join7 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8 org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/568/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/568/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-568/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed This message is automatically generated. ATTACHMENT ID: 12651945
          Hide
          Ashutosh Chauhan added a comment -

          Patch is resulting in tons of test failures.

          Show
          Ashutosh Chauhan added a comment - Patch is resulting in tons of test failures.
          Hide
          Phabricator added a comment -

          ashutoshc has accepted the revision "HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT".

          +1 will commit if tests pass.

          REVISION DETAIL
          https://reviews.facebook.net/D8967

          BRANCH
          HIVE-2597

          ARCANIST PROJECT
          hive

          To: JIRA, ashutoshc, navis
          Cc: njain

          Show
          Phabricator added a comment - ashutoshc has accepted the revision " HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT". +1 will commit if tests pass. REVISION DETAIL https://reviews.facebook.net/D8967 BRANCH HIVE-2597 ARCANIST PROJECT hive To: JIRA, ashutoshc, navis Cc: njain
          Hide
          Phabricator added a comment -

          navis updated the revision "HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT".

          Rebased to trunk & Addressed comments

          Reviewers: JIRA

          REVISION DETAIL
          https://reviews.facebook.net/D8967

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D8967?vs=28755&id=29421#toc

          AFFECTED FILES
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
          ql/src/test/queries/clientpositive/groupby_constant.q
          ql/src/test/results/clientpositive/groupby_constant.q.out

          To: JIRA, navis
          Cc: njain

          Show
          Phabricator added a comment - navis updated the revision " HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT". Rebased to trunk & Addressed comments Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D8967 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D8967?vs=28755&id=29421#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/test/queries/clientpositive/groupby_constant.q ql/src/test/results/clientpositive/groupby_constant.q.out To: JIRA, navis Cc: njain
          Hide
          Phabricator added a comment -

          njain has commented on the revision "HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT".

          INLINE COMMENTS
          ql/src/test/queries/clientpositive/groupby_constant.q:1 can you add the test you had in the description
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:2749 spelling
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:2961 spelling
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:3643 spell

          REVISION DETAIL
          https://reviews.facebook.net/D8967

          To: JIRA, navis
          Cc: njain

          Show
          Phabricator added a comment - njain has commented on the revision " HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT". INLINE COMMENTS ql/src/test/queries/clientpositive/groupby_constant.q:1 can you add the test you had in the description ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:2749 spelling ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:2961 spelling ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:3643 spell REVISION DETAIL https://reviews.facebook.net/D8967 To: JIRA, navis Cc: njain
          Hide
          Namit Jain added a comment -

          comments

          Show
          Namit Jain added a comment - comments
          Hide
          Phabricator added a comment -

          navis requested code review of "HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT".

          Reviewers: JIRA

          HIVE-2597 Repeated key in GROUP BY is erroneously displayed when using DISTINCT

          The following query was simplified for illustration purposes.

          This works correctly:
          select client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid

          The intent here is to produce two empty columns in between data.

          The following query does not work:
          select distinct client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid

          FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY ""

          The key is not repeated since the aliases were given. Seems like Hive is ignoring the aliases when the "distinct" keyword is specified.

          TEST PLAN
          EMPTY

          REVISION DETAIL
          https://reviews.facebook.net/D8967

          AFFECTED FILES
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
          ql/src/test/queries/clientpositive/groupby_constant.q
          ql/src/test/results/clientpositive/groupby_constant.q.out

          MANAGE HERALD RULES
          https://reviews.facebook.net/herald/view/differential/

          WHY DID I GET THIS EMAIL?
          https://reviews.facebook.net/herald/transcript/21711/

          To: JIRA, navis

          Show
          Phabricator added a comment - navis requested code review of " HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT". Reviewers: JIRA HIVE-2597 Repeated key in GROUP BY is erroneously displayed when using DISTINCT The following query was simplified for illustration purposes. This works correctly: select client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid The intent here is to produce two empty columns in between data. The following query does not work: select distinct client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY "" The key is not repeated since the aliases were given. Seems like Hive is ignoring the aliases when the "distinct" keyword is specified. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D8967 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/test/queries/clientpositive/groupby_constant.q ql/src/test/results/clientpositive/groupby_constant.q.out MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/21711/ To: JIRA, navis
          Hide
          Alex Rovner added a comment -

          Anyone looking at this issue?

          Show
          Alex Rovner added a comment - Anyone looking at this issue?

            People

            • Assignee:
              Navis
              Reporter:
              Alex Rovner
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development