Details
-
Sub-task
-
Status: Open
-
Major
-
Resolution: Unresolved
-
2.1.0
-
None
-
None
Description
Follow-up on HIVE-13068.
When we create a left semijoin, we could infer the constants from the SEL below when we create the GB to remove duplicates on the right hand side.
Ex. ql/src/test/results/clientpositive/constprog_semijoin.q.out
explain select table1.id, table1.val, table1.val1 from table1 left semi join table3 on table1.dimid = table3.id and table3.id = 100 where table1.dimid = 100;
Plan:
STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: table1 Statistics: Num rows: 10 Data size: 200 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (((dimid = 100) = true) and (dimid = 100)) (type: boolean) Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: id (type: int), val (type: string), val1 (type: string) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: 100 (type: int), true (type: boolean) sort order: ++ Map-reduce partition columns: 100 (type: int), true (type: boolean) Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string) TableScan alias: table3 Statistics: Num rows: 5 Data size: 15 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (((id = 100) = true) and (id = 100)) (type: boolean) Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: 100 (type: int), true (type: boolean) outputColumnNames: _col0, _col1 Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: _col0 (type: int), _col1 (type: boolean) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int), _col1 (type: boolean) sort order: ++ Map-reduce partition columns: _col0 (type: int), _col1 (type: boolean) Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE Reduce Operator Tree: Join Operator condition map: Left Semi Join 0 to 1 keys: 0 100 (type: int), true (type: boolean) 1 _col0 (type: int), _col1 (type: boolean) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink
Attachments
Issue Links
- relates to
-
HIVE-13068 Disable Hive ConstantPropagate optimizer when CBO has optimized the plan II
- Closed