HIVE-2597

Repeated key in GROUP BY is erroneously displayed when using DISTINCT

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      The following query was simplified for illustration purposes.

      This works correctly:
      select client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid

      The intent here is to produce two empty columns in between data.

      The following query does not work:
      select distinct client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid

      FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY ""

      The key is not actually repeated, since each constant was given its own alias. It seems that Hive ignores the aliases when the "distinct" keyword is specified.
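
      To illustrate the suspected cause, here is a minimal sketch of a duplicate-key check. It is a toy model, not Hive's actual code: the function name find_repeated_keys and the key_by_alias flag are hypothetical. It assumes that DISTINCT is handled like a GROUP BY over the select expressions and that the check compares expression text only.

```python
# Toy model of a repeated-key check; NOT taken from Hive's SemanticAnalyzer.
# Assumption: DISTINCT is treated as a GROUP BY over the select expressions.

def find_repeated_keys(select_items, key_by_alias=False):
    """Return the expressions that would be flagged as repeated grouping keys.

    select_items: list of (expression_text, alias_or_None) pairs.
    key_by_alias: if True, use the alias (when present) as the identity
                  of a key instead of the raw expression text.
    """
    seen = set()
    repeated = []
    for expr, alias in select_items:
        key = alias if (key_by_alias and alias is not None) else expr
        if key in seen:
            repeated.append(expr)
        else:
            seen.add(key)
    return repeated


# Select list of the failing query:
#   select distinct client_tid, "" as myvalue1, "" as myvalue2 ...
items = [("client_tid", None), ('""', "myvalue1"), ('""', "myvalue2")]

# Comparing expression text only flags the second "" as a repeat.
print(find_repeated_keys(items))

# Taking aliases into account, no key is flagged.
print(find_repeated_keys(items, key_by_alias=True))
```

      Under these assumptions, the expression-only check reports the second "" as repeated, which matches the error above, while the alias-aware check accepts the select list.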

      1. HIVE-2597.D8967.1.patch
        32 kB
        Phabricator
      2. HIVE-2597.D8967.2.patch
        37 kB
        Phabricator

        Activity

        Ashutosh Chauhan added a comment -

        Patch is resulting in tons of test failures.

        Phabricator added a comment -

        ashutoshc has accepted the revision "HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT".

        +1 will commit if tests pass.

        REVISION DETAIL
        https://reviews.facebook.net/D8967

        BRANCH
        HIVE-2597

        ARCANIST PROJECT
        hive

        To: JIRA, ashutoshc, navis
        Cc: njain

        Phabricator added a comment -

        navis updated the revision "HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT".

        Rebased to trunk & Addressed comments

        Reviewers: JIRA

        REVISION DETAIL
        https://reviews.facebook.net/D8967

        CHANGE SINCE LAST DIFF
        https://reviews.facebook.net/D8967?vs=28755&id=29421#toc

        AFFECTED FILES
        ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
        ql/src/test/queries/clientpositive/groupby_constant.q
        ql/src/test/results/clientpositive/groupby_constant.q.out

        To: JIRA, navis
        Cc: njain

        Phabricator added a comment -

        njain has commented on the revision "HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT".

        INLINE COMMENTS
        ql/src/test/queries/clientpositive/groupby_constant.q:1 can you add the test you had in the description
        ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:2749 spelling
        ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:2961 spelling
        ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:3643 spell

        REVISION DETAIL
        https://reviews.facebook.net/D8967

        To: JIRA, navis
        Cc: njain

        Namit Jain added a comment -

        comments

        Phabricator added a comment -

        navis requested code review of "HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT".

        Reviewers: JIRA

        HIVE-2597 Repeated key in GROUP BY is erroneously displayed when using DISTINCT

        The following query was simplified for illustration purposes.

        This works correctly:
        select client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid

        The intent here is to produce two empty columns in between data.

        The following query does not work:
        select distinct client_tid, "" as myvalue1, "" as myvalue2 from clients cluster by client_tid

        FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY ""

        The key is not repeated since the aliases were given. Seems like Hive is ignoring the aliases when the "distinct" keyword is specified.

        TEST PLAN
        EMPTY

        REVISION DETAIL
        https://reviews.facebook.net/D8967

        AFFECTED FILES
        ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
        ql/src/test/queries/clientpositive/groupby_constant.q
        ql/src/test/results/clientpositive/groupby_constant.q.out

        To: JIRA, navis

        Alex Rovner added a comment -

        Anyone looking at this issue?


          People

          • Assignee: Navis
          • Reporter: Alex Rovner
          • Votes: 0
          • Watchers: 3
