[HIVE-2284] bucketized map join should allow join key as a superset of bucketized columns - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.8.0
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

Currently, bucketized map join is allowed only when the join keys are exactly the same as the bucketized columns. This is too restrictive and misses some optimization opportunities.

If tables S and T are both bucketized on column A with the same number of buckets, and the query is something like:

<code>
SELECT /*+ MAPJOIN (S) */ ...
FROM S JOIN T
ON (S.A = T.A AND S.B = T.B)
</code>

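We should allow bucketized map join here. Because both tables are bucketized on A, any pair of rows satisfying S.A = T.A falls into buckets with the same number, so joining bucket 1 of S with bucket 2 of T under this join condition must produce an empty result.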
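
The argument above can be checked with a small simulation. This is only an illustrative sketch, not Hive code: the bucket count, the sample rows, and the helper names (bucket_of, bucketize, join) are all assumptions made for the example, and Python's hash() stands in for whatever bucketing function Hive applies to column A.

```python
from itertools import product

NUM_BUCKETS = 4  # assumed bucket count, identical for both tables


def bucket_of(a, num_buckets=NUM_BUCKETS):
    # Stand-in for the bucketing function; any deterministic
    # function of column A alone supports the same argument.
    return hash(a) % num_buckets


def bucketize(rows, num_buckets=NUM_BUCKETS):
    # Distribute (A, B) rows into buckets by column A only.
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_of(row[0], num_buckets)].append(row)
    return buckets


def join(left, right):
    # Equi-join on both A and B: a superset of the bucketized column.
    return [(l, r) for l, r in product(left, right)
            if l[0] == r[0] and l[1] == r[1]]


# Hypothetical sample data: rows are (A, B) tuples.
S = [(a, b) for a in range(8) for b in range(3)]
T = [(a, b) for a in range(0, 16, 2) for b in range(2)]

s_buckets = bucketize(S)
t_buckets = bucketize(T)

# Joining differently numbered buckets never produces a row.
cross = [pair
         for i, j in product(range(NUM_BUCKETS), repeat=2) if i != j
         for pair in join(s_buckets[i], t_buckets[j])]

# Joining only matching buckets reproduces the full join.
bucketwise = [pair
              for i in range(NUM_BUCKETS)
              for pair in join(s_buckets[i], t_buckets[i])]
full = join(S, T)
```

Here cross comes out empty and bucketwise contains the same pairs as full, which is why the extra predicate on B does not invalidate the bucket-by-bucket strategy: it can only filter rows out of each per-bucket join, never require rows from another bucket.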

Attachments

HIVE-2284.patch
15/Jul/11 17:06
11 kB
Ning Zhang

People

Assignee: Ning Zhang
Reporter: Ning Zhang
Votes: 0
Watchers: 0

Dates

Created: 15/Jul/11 08:06
Updated: 16/Dec/11 23:55
Resolved: 16/Jul/11 06:42