Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
None
-
ghx-label-2
Description
Bug report from here - https://community.cloudera.com/t5/Support-Questions/quot-union-all-quot-dropping-records-with-all-null-empty/m-p/303153#M221415
Repro:
create database if not exists as_adventure; use as_adventure; CREATE tABLE IF NOT EXISTS as_adventure.t1 ( productsubcategorykey INT, productline STRING); insert into t1 values (1,'l1'); insert into t1 values (2,'l1'); insert into t1 values (1,'l2'); insert into t1 values (3,'l3'); insert into t1 values (null,''); select * from t1; SELECT MIN(t_53.c_41) c_41, CAST(NULL AS DOUBLE) c_43, CAST(NULL AS BIGINT) c_44, t_53.c2 c2, t_53.c3s0 c3s0, t_53.c4 c4, t_53.c5s0 c5s0 FROM ( SELECT t.productsubcategorykey c_41, t.productline c2, t.productline c3s0, t.productsubcategorykey c4, t.productsubcategorykey c5s0 FROM as_adventure.t1 t WHERE true GROUP BY 2, 3, 4, 5 ) t_53 GROUP BY 4, 5, 6, 7 UNION ALL SELECT MIN(t_53.c_41) c_41, CAST(NULL AS DOUBLE) c_43, CAST(NULL AS BIGINT) c_44, t_53.c2 c2, t_53.c3s0 c3s0, t_53.c4 c4, t_53.c5s0 c5s0 FROM ( SELECT t.productsubcategorykey c_41, t.productline c2, t.productline c3s0, t.productsubcategorykey c4, t.productsubcategorykey c5s0 FROM as_adventure.t1 t WHERE true GROUP BY 2, 3, 4, 5 ) t_53 GROUP BY 4, 5, 6, 7
Somewhat similar to IMPALA-7957 in that the inferred predicates from the column equivalences get placed in a Select node. It's a bit different in that the NULLs that are filtered out from the predicates come from the base table.
+------------------------------------------------------------------------------------------------+ | Explain String | +------------------------------------------------------------------------------------------------+ | Max Per-Host Resource Reservation: Memory=136.02MB Threads=6 | | Per-Host Resource Estimates: Memory=576MB | | WARNING: The following tables are missing relevant table and/or column statistics. | | as_adventure.t1 | | | | PLAN-ROOT SINK | | | | | 13:EXCHANGE [UNPARTITIONED] | | | | | 00:UNION | | | row-size=52B cardinality=2 | | | | | |--08:SELECT | | | | predicates: t_53.c4 = t_53.c5s0, t_53.c2 = t_53.c3s0 | | | | row-size=36B cardinality=1 | | | | | | | 07:AGGREGATE [FINALIZE] | | | | output: min(t.productsubcategorykey) | | | | group by: t.productline, t.productline, t.productsubcategorykey, t.productsubcategorykey | | | | row-size=36B cardinality=2 | | | | | | | 12:AGGREGATE [FINALIZE] | | | | group by: t.productline, t.productsubcategorykey | | | | row-size=16B cardinality=2 | | | | | | | 11:EXCHANGE [HASH(t.productline,t.productsubcategorykey)] | | | | | | | 06:AGGREGATE [STREAMING] | | | | group by: t.productline, t.productsubcategorykey | | | | row-size=16B cardinality=2 | | | | | | | 05:SCAN HDFS [as_adventure.t1 t] | | | HDFS partitions=1/1 files=5 size=24B | | | row-size=16B cardinality=2 | | | | | 04:SELECT | | | predicates: t_53.c4 = t_53.c5s0, t_53.c2 = t_53.c3s0 | | | row-size=36B cardinality=1 | | | | | 03:AGGREGATE [FINALIZE] | | | output: min(t.productsubcategorykey) | | | group by: t.productline, t.productline, t.productsubcategorykey, t.productsubcategorykey | | | row-size=36B cardinality=2 | | | | | 10:AGGREGATE [FINALIZE] | | | group by: t.productline, t.productsubcategorykey | | | row-size=16B cardinality=2 | | | | | 09:EXCHANGE [HASH(t.productline,t.productsubcategorykey)] | | | | | 02:AGGREGATE [STREAMING] | | | group by: t.productline, t.productsubcategorykey | | | row-size=16B cardinality=2 | | | | | 01:SCAN HDFS [as_adventure.t1 t] | | HDFS partitions=1/1 files=5 size=24B | | row-size=16B cardinality=2 | +------------------------------------------------------------------------------------------------+
This query is a simplified version of the original one:
SELECT t_53.c4 c4, t_53.c5s0 c5s0 FROM ( SELECT t.productsubcategorykey c4, t.productsubcategorykey c5s0 FROM as_adventure.t1 t GROUP BY 1, 2 ) t_53 GROUP BY 1,2 UNION ALL SELECT t_53.c4 c4, t_53.c5s0 c5s0 FROM ( SELECT t.productsubcategorykey c4, t.productsubcategorykey c5s0 FROM as_adventure.t1 t GROUP BY 1, 2 ) t_53 GROUP BY 1, 2;
Attachments
Issue Links
- relates to
-
IMPALA-7957 UNION ALL query returns incorrect results
-
- Resolved
-
-
IMPALA-8386 Incorrect predicate in a left outer join query
-
- Resolved
-
-
IMPALA-9694 IllegalStateException when inlineView has AggregationNode and different alias on the same column
-
- Resolved
-