[HIVE-1772] optimize join followed by a groupby - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: None
Fix Version/s: None
Component/s: Query Processor
Labels: None

Description

explain SELECT x.key, count(1) FROM src1 x JOIN src y ON (x.key = y.key) group by x.key;

STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-2 depends on stages: Stage-1
Stage-0 is a root stage

The query above runs as two MapReduce jobs.
The first job performs the join, and the second performs the group by.
Because the join output is already sorted on the join key, which is also the group-by key, the group by could be performed in the join's reducer, eliminating the second job.
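
To illustrate the idea, here is a minimal Python sketch of a single reduce phase that joins and counts in one pass. It is not Hive code and does not reflect the attached patch; the function names `shuffle` and `join_then_count` and the 0/1 table tags are hypothetical, chosen only for this example. It assumes an inner equi-join on the same key used for grouping, so the count for each key is the number of matching left rows times the number of matching right rows.

```python
from itertools import groupby
from operator import itemgetter


def shuffle(src1_keys, src_keys):
    """Simulate the shuffle: tag each row with its table and sort by key.

    Tag 0 marks a row from src1 (alias x), tag 1 a row from src (alias y).
    """
    tagged = [(k, 0) for k in src1_keys] + [(k, 1) for k in src_keys]
    return sorted(tagged)


def join_then_count(shuffled):
    """Join and group by in one pass over key-sorted input.

    Rows for a key arrive together, so the join's reducer can emit
    count(1) per key directly instead of writing joined rows for a
    second job to aggregate.
    """
    results = []
    for key, rows in groupby(shuffled, key=itemgetter(0)):
        left = right = 0
        for _, tag in rows:
            if tag == 0:
                left += 1
            else:
                right += 1
        matches = left * right  # joined rows an inner join would produce
        if matches:
            results.append((key, matches))
    return results


if __name__ == "__main__":
    x_keys = ["a", "b", "b", "c"]
    y_keys = ["b", "b", "b", "c", "d"]
    print(join_then_count(shuffle(x_keys, y_keys)))
```

Keys present in only one table produce no output, which matches inner-join semantics. The sketch works only because the join key and the group-by key coincide; with different keys a second shuffle would still be needed.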

Attachments

HIVE-1772.1.patch (14 kB, 04/Aug/11 06:15, Navis Ryu)

Issue Links

relates to

HIVE-3667 Umbrella jira for Correlation Optimizer (Open)
HIVE-2206 add a new optimizer for query correlation discovery and optimization (Closed)
HIVE-3430 group by followed by join with the same key should be optimized (Resolved)

Sub-Tasks

There are no Sub-Tasks for this issue.

People

Assignee: Unassigned
Reporter: Namit Jain
Votes: 0
Watchers: 5

Dates

Created: 05/Nov/10 23:29
Updated: 18/Jul/13 14:57
Resolved: 18/Jul/13 14:57