Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
2.1.1
-
None
-
None
Description
Reproduce:
- Create three tables, mytest_t1, mytest_t2, mytest_t4
- hive -e "explain select coalesce(t1.wh_guid,t2.wh_guid) as wh_guid from dw_dev.mytest_t1 t1 full join dw_dev.mytest_t2 t2 on t1.material_code = t2.material_code;"
- hive -e "explain select coalesce(t1.wh_guid,t2.wh_guid) as wh_guid from dw_dev.mytest_t1 t1 full join dw_dev.mytest_t2 t2 on t1.material_code = t2.material_code join dw_dev.mytest_t5 t5 on t5.material_code = coalesce(t1.material_code,t2.material_code);"
- expect output row are over 6000, but actually get 685 rows
2 - explain
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: material_code (type: string), wh_guid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string)
TableScan
alias: t2
Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: material_code (type: string), wh_guid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string)
Reduce Operator Tree:
Join Operator
condition map:
Outer Join 0 to 1
keys:
0 _col0 (type: string)
1 _col0 (type: string)
outputColumnNames: _col1, _col3
Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: COALESCE(_col1,_col3) (type: string)
outputColumnNames: _col0
Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
3 - explain
STAGE PLANS:
Stage: Stage-7
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_1:t2
Fetch Operator
limit: -1
$hdt$_2:t5
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_1:t2
TableScan
alias: t2
Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE
Filter Operator
predicate: material_code is not null (type: boolean)
Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: material_code (type: string), wh_guid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 _col0 (type: string)
1 _col0 (type: string)
$hdt$_2:t5
TableScan
alias: t5
Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column stats: NONE
Filter Operator
predicate: material_code is not null (type: boolean)
Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: material_code (type: string)
outputColumnNames: _col0
Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 COALESCE(_col0,_col2) (type: string)
1 _col0 (type: string)
Stage: Stage-5
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE
Filter Operator
predicate: material_code is not null (type: boolean)
Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: material_code (type: string), wh_guid (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE
Map Join Operator
condition map:
Inner Join 0 to 1
keys:
0 _col0 (type: string)
1 _col0 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3
Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column stats: NONE
Map Join Operator
condition map:
Inner Join 0 to 1
keys:
0 COALESCE(_col0,_col2) (type: string)
1 _col0 (type: string)
outputColumnNames: _col1, _col3
Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: COALESCE(_col1,_col3) (type: string)
outputColumnNames: _col0
Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe