Details
-
Bug
-
Status: Open
-
Critical
-
Resolution: Unresolved
-
Impala 2.8.0
-
None
Description
When a query uses the UNION, EXCEPT, or INTERSECT operators, the ORDER BY clause must be specified at the end of the statement and the results of the combined queries are sorted. ORDER BY clauses are not allowed in individual branches unless the branch is enclosed by parentheses.
There are two bugs currently:
- An ORDER BY is allowed in a branch of a UNION that is not enclosed in parentheses
- The final ORDER BY of a UNION is attached to the nearest branch when it should be sorting the combined results of the UNION(s)
For example, this is not valid syntax but is allowed in Impala
select * from t1 order by 1 union all select * from t2
And for queries like this, the ORDER BY should order the unioned result, not just the nearest branch which is the current behavior.
select * from t1 union all select * from t2 order by 1
If one wants ordering within a branch, the query block must be enclosed by parentheses like such:
(select * from t1 order by 1) union all (select * from t2 order by 2)
Here is an example where incorrect results are returned.
Impala
[impalad:21000] > select r_regionkey, r_name from region union all select r_regionkey, r_name from region order by 1 limit 2; +-------------+-------------+ | r_regionkey | r_name | +-------------+-------------+ | 0 | AFRICA | | 1 | AMERICA | | 2 | ASIA | | 3 | EUROPE | | 4 | MIDDLE EAST | | 0 | AFRICA | | 1 | AMERICA | +-------------+-------------+ Fetched 7 row(s) in 0.12s
PostgreSQL
tpch=# select r_regionkey, r_name from region union all select r_regionkey, r_name from region order by 1 limit 2; r_regionkey | r_name -------------+--------------------------- 0 | AFRICA 0 | AFRICA (2 rows)
see also https://cloud.google.com/spanner/docs/query-syntax#syntax_5
Attachments
Attachments
Issue Links
- duplicates
-
IMPALA-1115 Impala should not ignore ORDER BY when using UNION
- Closed
-
IMPALA-1379 LIMIT should apply to the total result set after UNION
- Closed