[PIG-3878] Improve parallelism of union and join - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Invalid
Affects Version/s: None
Fix Version/s: 0.14.0
Component/s: tez
Labels: None

Description

Currently, if the user has not specified a PARALLEL clause, parallelism defaults to 1, which is bad for performance. MR does not have this issue: for each job, the number of mappers is determined by the input splits and the number of reducers by InputSizeReducerEstimator. Automatic reducer parallelism (ARP) for Tez in general will be handled in separate JIRAs. Until ARP is in place and better estimation is done, a quick workaround for joins and unions is to set the parallelism of the reduce task to the sum of the join tasks.
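
The workaround can be illustrated with a minimal sketch. It is not Pig or Tez code: the function name, its parameters, and the reading of "sum of the join tasks" as the sum of the task counts of the inputs feeding the join or union are all assumptions made for illustration.

```python
from typing import Optional, Sequence

# The default the description calls out as bad for performance.
DEFAULT_PARALLELISM = 1


def workaround_parallelism(requested: Optional[int],
                           input_task_counts: Sequence[int]) -> int:
    """Pick a reduce-side parallelism for a join or union (hypothetical helper).

    requested         -- value from an explicit PARALLEL clause, or None if absent
    input_task_counts -- assumed task counts of the inputs feeding the join/union
    """
    # An explicit PARALLEL clause always wins.
    if requested is not None and requested > 0:
        return requested
    # Otherwise use the sum of the input task counts instead of defaulting to 1.
    total = sum(input_task_counts)
    # Never go below the old default, e.g. when no input counts are known.
    return max(total, DEFAULT_PARALLELISM)


if __name__ == "__main__":
    # Two inputs with 4 and 6 tasks, no PARALLEL clause.
    print(workaround_parallelism(None, [4, 6]))
```

With no PARALLEL clause and inputs of 4 and 6 tasks, the sketch picks 10 rather than 1. An explicit clause is returned unchanged.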

People

Assignee: Rohini Palaniswamy

Reporter: Rohini Palaniswamy

Votes: 0

Watchers: 1

Dates

Created: 10/Apr/14 13:33

Updated: 24/Jul/14 18:43

Resolved: 24/Jul/14 18:43