Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
I've always wondered why UNION ALL has to be in subqueries in Hive.
After looking at it, the problems are:
- Hive parser:
  - UNION happens at the wrong place: (insert ... select ... union all select ...) is parsed as (insert select) union select.
  - There are many rewrite rules in the parser to force any query into a from-insert-select form, no doubt for historical reasons.
- Plan generation/semantic analysis assumes a top-level "TOK_QUERY" and not a top-level "TOK_UNION".
The rewrite rules don't work when we move the "UNION ALL" into the select statements. However, it's not hard to do that in code.
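
For illustration, the two query shapes discussed here can be sketched as below. This is only a sketch: it runs against SQLite through Python's sqlite3 module as a stand-in, because SQLite accepts both forms, and it says nothing about Hive's actual parser. The table names (t1, t2, dst) and the helper load_rows are hypothetical. The subquery form is the one Hive required at the time of this report; the top-level form is the one this issue asks for, where the insert should apply to the whole union rather than to the first select only.

```python
import sqlite3


def load_rows(use_subquery: bool) -> list:
    """Insert the union of t1 and t2 into dst and return dst's contents.

    Hypothetical helper; SQLite is used only to show the two SQL shapes.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE t1 (a INTEGER);
        CREATE TABLE t2 (a INTEGER);
        CREATE TABLE dst (a INTEGER);
        INSERT INTO t1 VALUES (1);
        INSERT INTO t1 VALUES (2);
        INSERT INTO t2 VALUES (3);
        """
    )
    if use_subquery:
        # Shape Hive required: UNION ALL wrapped in a subquery.
        conn.execute(
            "INSERT INTO dst "
            "SELECT u.a FROM (SELECT a FROM t1 UNION ALL SELECT a FROM t2) u"
        )
    else:
        # Shape requested here: top-level UNION ALL, read as
        # insert (select union all select), not (insert select) union select.
        conn.execute(
            "INSERT INTO dst SELECT a FROM t1 UNION ALL SELECT a FROM t2"
        )
    rows = [r[0] for r in conn.execute("SELECT a FROM dst ORDER BY a")]
    conn.close()
    return rows
```

With the intended grouping, both forms should load the same rows into dst, which is why the subquery requirement looks like a parser limitation rather than a semantic one.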