[PIG-1218] Use distributed cache to store samples - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.7.0
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

Currently, for skew join and order by, the sample is written to the DFS rather than placed in the distributed cache. As a result, it is opened and copied around more often than necessary. This hurts query performance and also places unnecessary load on the name node.
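
To illustrate the cost described above, here is a minimal sketch in plain Java. It is not Pig or Hadoop code and does not come from the attached patches. It assumes a simplified model in which every task that reads the sample directly from the DFS causes one open, while a per-node cache causes one DFS read per node, with later tasks on that node reusing the local copy. The class name `SampleOpenCount`, both method names, and the task-to-node layout are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

public class SampleOpenCount {

    /** Direct DFS access (assumed model): every task opens the sample file itself. */
    public static int directDfsOpens(int[] taskToNode) {
        return taskToNode.length;
    }

    /**
     * Per-node cache (assumed model): only the first task scheduled on a node
     * triggers a DFS read; later tasks on that node reuse the local copy.
     */
    public static int cachedDfsOpens(int[] taskToNode) {
        Set<Integer> localizedNodes = new HashSet<>();
        int opens = 0;
        for (int node : taskToNode) {
            // add() returns true only the first time this node is seen
            if (localizedNodes.add(node)) {
                opens++;
            }
        }
        return opens;
    }

    public static void main(String[] args) {
        // Hypothetical layout: 100 tasks spread round-robin over 10 nodes
        int tasks = 100;
        int nodes = 10;
        int[] taskToNode = new int[tasks];
        for (int t = 0; t < tasks; t++) {
            taskToNode[t] = t % nodes;
        }
        System.out.println("direct DFS opens: " + directDfsOpens(taskToNode));
        System.out.println("cached DFS opens: " + cachedDfsOpens(taskToNode));
    }
}
```

Under this model the number of DFS opens drops from the number of tasks to the number of nodes, which is the kind of reduction in name node load the issue is after.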

Attachments

PIG-1218_2.patch, 18/Feb/10 20:27, 28 kB, Richard Ding
PIG-1218_3.patch, 18/Feb/10 21:37, 31 kB, Richard Ding
PIG-1218.patch, 10/Feb/10 23:19, 34 kB, Richard Ding

People

Assignee: Richard Ding
Reporter: Olga Natkovich
Votes: 0
Watchers: 3

Dates

Created: 03/Feb/10 19:22
Updated: 14/May/10 06:46
Resolved: 19/Feb/10 23:49