Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
3.0.0
None
Description
When hive.spark.use.ts.stats.for.mapjoin is true, Hive on Spark (HoS) does not check whether the big table branch has upstream UNION operators. This is wrong and can generate an incorrect plan. To reproduce:
set hive.auto.convert.join=true;
set hive.auto.convert.join.noconditionaltask.size=16;
set hive.spark.use.ts.stats.for.mapjoin=true;
create table a (c1 string, c2 int);
create table b (c3 string, c4 int);
create table c (c1 string, c2 int);
create table d (c3 string, c4 int);
create table e (c5 string, c6 int);
insert into table a values ("a1", 1), ("a2", 2), ("a3", 3), ("a4", 4), ("a5", 5), ("a6", 6), ("a7", 7);
insert into table b values ("b1", 1), ("b2", 2), ("b3", 3), ("b4", 4);
insert into table c values ("c1", 1), ("c2", 2), ("c3", 3), ("c4", 4), ("c5", 5), ("c6", 6), ("c7", 7);
insert into table d values ("d1", 1), ("d2", 2), ("d3", 3), ("d4", 4);
insert into table e values ("d1", 1), ("d2", 2);
explain
with t1 as (
  select a.c1 as c1, a.c2 as c2, b.c3 as c3 from a join b on a.c2 = b.c4
), t2 as (
  select c.c1 as c1, c.c2 as c2, d.c3 as c3 from c join d on c.c2 = d.c4
), t3 as (
  select * from t1 union all select * from t2
), t4 as (
  select t3.c1, t3.c3, t5.c5 from t3 join e as t5 on t3.c2 = t5.c6
)
select * from t4;
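For illustration only, the missing check could look roughly like the sketch below: walk upstream from the big table branch and report whether a UNION operator is reachable. The Op class and hasUpstreamUnion method are hypothetical stand-ins invented for this example; they are not Hive's operator classes and this is not the actual fix.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UpstreamUnionCheck {

    /** Hypothetical, simplified plan operator (NOT Hive's Operator class). */
    static final class Op {
        final String kind;                          // e.g. "TS", "JOIN", "UNION", "SEL"
        final List<Op> parents = new ArrayList<>(); // upstream operators

        Op(String kind, Op... parents) {
            this.kind = kind;
            for (Op p : parents) {
                this.parents.add(p);
            }
        }
    }

    /**
     * Returns true if the given operator, or any operator upstream of it,
     * is a UNION. Uses an iterative depth-first walk over parent links.
     */
    static boolean hasUpstreamUnion(Op start) {
        Deque<Op> stack = new ArrayDeque<>();
        Set<Op> seen = new HashSet<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            Op op = stack.pop();
            if (!seen.add(op)) {
                continue; // already visited via another path
            }
            if ("UNION".equals(op.kind)) {
                return true;
            }
            for (Op p : op.parents) {
                stack.push(p);
            }
        }
        return false;
    }
}
```

In the query above, the t3 side of the final join sits below a UNION of two joins, so a check of this kind on that branch would return true, while the branch that scans table e alone would return false.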
Attachments
Issue Links
- is related to HIVE-15489 Alternatively use table scan stats for HoS (Resolved)