Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-4564

Distinct along with order by is not working when table name is part of column name in order by clause

    XMLWordPrintableJSON

    Details

      Description

      I have following table named 'region'.

      hive> desc region;
      r_regionkey int
      r_name string
      r_comment string

      When we use <distinct, orderby> clause combination in table_name.column_name format, the query throws SemanticException.

      For example, the following query throws error.
      hive> select distinct region.r_name from region order by region.r_name;
      FAILED: SemanticException [Error 10004]: Line 1:51 Invalid table alias or column reference 'region': (possible column names are: _col0)

      The same query works fine if the same query is used without table name in order by clause.
      The following query works fine for region table.
      hive> select distinct region.r_name from region order by r_name;

      This is a common scenario in actual real world scenarios.

      For example, I want to find out what are all the cities my employees are from.
      >> SELECT DISTINCT CITY.NAME FROM EMPLOYEE, CITY WHERE EMPLOYEE.CID=CITY.CID ORDER BY CITY.NAME
      Here we are forced to use CITY.NAME as it may conflict with EMPLOYEE.NAME.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                chandra.gunturi chandra sekhar gunturi
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated: