Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-18375

Cannot ORDER by subquery fields unless they are selected

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.3.2
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels:
      None
    • Environment:

      Description

      Give these tables:

      CREATE TABLE employees (
          emp_no      INT,
          first_name  VARCHAR(14),
          last_name   VARCHAR(16)
      );
      insert into employees values
      (1, 'Gottlob', 'Frege'),
      (2, 'Bertrand', 'Russell'),
      (3, 'Ludwig', 'Wittgenstein');
      CREATE TABLE salaries (
          emp_no      INT,
          salary      INT,
          from_date   DATE,
          to_date     DATE
      );
      insert into salaries values
      (1, 10, '1900-01-01', '1900-01-31'),
      (1, 18, '1900-09-01', '1900-09-30'),
      (2, 15, '1940-03-01', '1950-01-01'),
      (3, 20, '1920-01-01', '1950-01-01');
      

      This query returns the names of the employees ordered by their peak salary:

      SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary`
      FROM `default`.`employees`
      INNER JOIN
       (SELECT `emp_no`, MAX(`salary`) `max_salary`
        FROM `default`.`salaries`
        WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
        GROUP BY `emp_no`) AS `t1`
      ON `employees`.`emp_no` = `t1`.`emp_no`
      ORDER BY `t1`.`max_salary` DESC;
      

      However, this should still work even if the max_salary is not part of the projection:

      SELECT `employees`.`last_name`, `employees`.`first_name`
      FROM `default`.`employees`
      INNER JOIN
       (SELECT `emp_no`, MAX(`salary`) `max_salary`
        FROM `default`.`salaries`
        WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
        GROUP BY `emp_no`) AS `t1`
      ON `employees`.`emp_no` = `t1`.`emp_no`
      ORDER BY `t1`.`max_salary` DESC;
      

      However, that fails with this error:

      Error while compiling statement: FAILED: SemanticException [Error 10004]: line 9:9 Invalid table alias or column reference 't1': (possible column names are: last_name, first_name)
      

      FWIW, this also fails:

      SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` AS `max_sal`
      FROM `default`.`employees`
      INNER JOIN
       (SELECT `emp_no`, MAX(`salary`) `max_salary`
        FROM `default`.`salaries`
        WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
        GROUP BY `emp_no`) AS `t1`
      ON `employees`.`emp_no` = `t1`.`emp_no`
      ORDER BY `t1`.`max_salary` DESC;
      

      But this succeeds:

      SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` AS `max_sal`
      FROM `default`.`employees`
      INNER JOIN
       (SELECT `emp_no`, MAX(`salary`) `max_salary`
        FROM `default`.`salaries`
        WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
        GROUP BY `emp_no`) AS `t1`
      ON `employees`.`emp_no` = `t1`.`emp_no`
      ORDER BY `max_sal` DESC;
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                pauljackson123 Paul Jackson
              • Votes:
                0 Vote for this issue
                Watchers:
                6 Start watching this issue

                Dates

                • Created:
                  Updated: