Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
Description
The problem can be reproduced by using the following query which at the moment can be found in subquery_in.q file:
explain cbo select * from part where p_name IN (select p_name from part p where p.p_size = part.p_size AND part.p_size + 121150 = p.p_partkey );
The plans before and after HiveSubQueryRemoveRule are shown below:
2021-04-09T14:29:08,031 DEBUG [9f8b0342-5609-4917-95a9-e7abc884f619 main] parse.CalcitePlanner: Plan before removing subquery: HiveProject(p_partkey=[$0], p_name=[$1], p_mfgr=[$2], p_brand=[$3], p_type=[$4], p_size=[$5], p_container=[$6], p_retailprice=[$7], p_comment=[$8]) HiveFilter(condition=[IN($1, { HiveProject(p_name=[$1]) HiveFilter(condition=[AND(=($5, $cor0.p_size), =(+($cor0.p_size, 121150), $0))]) HiveTableScan(table=[[default, part]], table:alias=[p]) })]) HiveTableScan(table=[[default, part]], table:alias=[part]) 2021-04-09T14:29:08,056 DEBUG [9f8b0342-5609-4917-95a9-e7abc884f619 main] parse.CalcitePlanner: Plan just after removing subquery: HiveProject(p_partkey=[$0], p_name=[$1], p_mfgr=[$2], p_brand=[$3], p_type=[$4], p_size=[$5], p_container=[$6], p_retailprice=[$7], p_comment=[$8]) HiveFilter(condition=[=($1, $12)]) LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{5}]) HiveTableScan(table=[[default, part]], table:alias=[part]) HiveProject(p_name=[$1]) HiveFilter(condition=[AND(=($5, $cor0.p_size), =(+($cor0.p_size, 121150), $0))]) HiveTableScan(table=[[default, part]], table:alias=[p])
The plan after applying the rule is invalid. The HiveFilter(condition=[=($1, $12)]) above the correlate references columns ($12) from the right input which do not exist since the correlate is of type SEMI. Running the test with -Dcalcite.debug property enabled raises an AssertionError when building the HiveFilter.
The problem is hidden at the moment since there is a specific hack in HiveRelDecorrelator that turns this invalid plan into a valid one. This mechanism is very brittle and it can break easily as it happened while fixing HIVE-24957.
Attachments
Issue Links
- Is contained by
-
HIVE-26751 Bug Fixes and Improvements for 3.2.0 release
- Open
- is depended upon by
-
HIVE-24957 Wrong results when subquery has COALESCE in correlation predicate
- Closed
- relates to
-
CALCITE-4574 Wrong/Invalid plans when using RelBuilder#join with correlations
- Closed
- links to