Details
-
Bug
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
2.3.0, 3.1.1
-
None
-
None
-
None
Description
Queries with SUBSTR function in EXISTS clause have much more complicated execution plan in Hive-2.3 with enabled CBO then with disabled. A query below has 8 stages which submit 4 MR jobs in Hive-2.3 when CBO is enabled, while with disabled it has 4 stages and submits 1 MR job.
STEPS TO REPRODUCE:
CREATE TABLE i1122 (id STRING); INSERT INTO i1122 VALUES (1),(1001); EXPLAIN SELECT * FROM i1122 AS t1 WHERE EXISTS ( SELECT 1 FROM i1122 AS t2 WHERE t2.id = substr(t1.id,4) );
ACTUAL RESULT:
Explain plan in Hive-2.3 with disabled CBO(The same execution plan was in Hive-1.2 with enabled CBO):
hive> SET hive.cbo.enable=false;
hive> EXPLAIN
> SELECT *
> FROM i1122 AS t1
> WHERE EXISTS (
> SELECT 1
> FROM i1122 AS t2
> WHERE t2.id = substr(t1.id,4)
> );
OK
STAGE DEPENDENCIES:
Stage-4 is a root stage
Stage-3 depends on stages: Stage-4
Stage-0 depends on stages: Stage-3
STAGE PLANS:
...
Explain plan in Hive-2.3 with enabled CBO:
hive> SET hive.cbo.enable=true;
hive> EXPLAIN
> SELECT *
> FROM i1122 AS t1
> WHERE EXISTS (
> SELECT 1
> FROM i1122 AS t2
> WHERE t2.id = substr(t1.id,4)
> );
OK
STAGE DEPENDENCIES:
Stage-3 is a root stage
Stage-8 depends on stages: Stage-3
Stage-6 depends on stages: Stage-8
Stage-5 depends on stages: Stage-6 , consists of Stage-7, Stage-1
Stage-7 has a backup stage: Stage-1
Stage-4 depends on stages: Stage-7
Stage-1
Stage-0 depends on stages: Stage-4, Stage-1
STAGE PLANS:
...
Full explain plans you can find in attachment