[HIVE-2732] Reduce Sink deduplication fails if the child reduce sink is followed by a join - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.10.0
Fix Version/s: 0.10.0
Component/s: Query Processor
Labels:
None

Hadoop Flags:

Reviewed

Description

set hive.optimize.reducededuplication=true;
set hive.auto.convert.join=true;
explain select * from (select * from src distribute by key sort by key) a join src b on a.key = b.key;

fails with the following exception

java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.SelectOperator cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator
at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:313)
at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.genMapJoinOpAndLocalWork(MapJoinProcessor.java:226)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.processCurrentTask(CommonJoinResolver.java:174)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.dispatch(CommonJoinResolver.java:287)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:68)
at org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:72)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:7019)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7312)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

If hive.auto.convert.join is set to false, it produces an incorrect plan where the two halves of the join are processed in two separate map reduce tasks, and the reducers of these two tasks both contain the join operator resulting in an exception.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

ASF.LICENSE.NOT.GRANTED--HIVE-2732.D1809.1.patch
21/Feb/12 01:49
5 kB
Phabricator
ASF.LICENSE.NOT.GRANTED--HIVE-2732.D1809.2.patch
07/May/12 01:42
10 kB
Phabricator

Issue Links

is related to

HIVE-2329 Not using map aggregation, fails to execute group-by after cluster-by with same key

Closed

Activity

People

Assignee:: Navis Ryu

Reporter:: Kevin Wilfong

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 21/Jan/12 00:44

Updated:: 10/Jan/13 19:53

Resolved:: 15/May/12 20:06