Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Not A Problem
Description
explain SELECT x.key, count(1) FROM src1 x JOIN src y ON (x.key = y.key) group by x.key;
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-2 depends on stages: Stage-1
Stage-0 is a root stage
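The above query issues 2 map-reduce jobs (Stage-1 and Stage-2).
The first MR job performs the join, and the second MR job performs the group by.
The join key and the group-by key are the same (x.key), so the rows reaching the join's reducer are already sorted on the group-by key. The group by can therefore be performed in the reducer of the join itself, which would remove the second job.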
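The idea can be illustrated with a minimal sketch outside Hive. The Python below is not Hive code; `join_and_count` and its inputs are hypothetical names used only to show how a single reducer pass over key-sorted rows could both join and count, assuming the join key and the group-by key are the same column.

```python
from itertools import groupby
from operator import itemgetter


def join_and_count(left_keys, right_keys):
    """Simulate one reducer that joins on key and counts per key in one pass.

    left_keys / right_keys stand in for the key column of src1 and src
    (hypothetical inputs; only the key matters for count(1)).
    """
    # Shuffle phase: tag each record with its source table (0 = left,
    # 1 = right) and sort by key, as a reduce-side join would receive them.
    tagged = [(k, 0) for k in left_keys] + [(k, 1) for k in right_keys]
    tagged.sort(key=itemgetter(0))

    result = []
    # Reduce phase: all rows for a key arrive together, so the per-key
    # aggregate can be emitted here without a second sort or job.
    for key, rows in groupby(tagged, key=itemgetter(0)):
        rows = list(rows)
        n_left = sum(1 for _, tag in rows if tag == 0)
        n_right = len(rows) - n_left
        joined = n_left * n_right  # rows the inner join produces for this key
        if joined:
            result.append((key, joined))
    return result


if __name__ == "__main__":
    print(join_and_count(["a", "a", "b", "c"], ["a", "b", "b", "d"]))
    # prints [('a', 2), ('b', 2)]
```

Keys that appear on only one side (`c` and `d` here) produce no joined rows and so no output group, matching inner-join semantics.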