[PIG-539] unable to control parallelism of Map tasks - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: impl
Labels: None
Environment: local execution + Hadoop execution

Description

I added "PARALLEL 1" after every statement in my Pig script, but the job still runs its map phase with more than one parallel task. This is a major problem because one of my operations requires a serialized (non-parallel) map.

The semantics of parallelism should probably be as follows:
1. Group Pig operators into map/reduce stages.
2. For each stage, take the minimum of the "PARALLEL" directives given by the user for the statements executed as part of that stage.

(We'll have to decide on a rule for statements that use the combiner, which execute partially on the map side and partially on the reduce side ...)
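
To make the proposed rule concrete, here is a minimal sketch in Python. It is not Pig code: the function name `stage_parallelism`, the stage labels, and the representation of a script as (stage, PARALLEL) pairs are all assumptions made for illustration. It also assumes operators have already been grouped into stages, and it leaves the combiner question open.

```python
def stage_parallelism(statements, default=None):
    """Apply the proposed rule: one parallelism value per stage.

    `statements` is an iterable of (stage, parallel) pairs, where `stage`
    is a hypothetical stage label and `parallel` is the value of the
    statement's PARALLEL directive, or None if the statement has none.
    A stage with no directives at all gets `default`.
    """
    requested = {}
    for stage, parallel in statements:
        values = requested.setdefault(stage, [])
        if parallel is not None:
            values.append(parallel)
    # Take the minimum of the user-supplied directives within each stage.
    return {
        stage: (min(values) if values else default)
        for stage, values in requested.items()
    }


# Hypothetical script: five statements already assigned to three stages.
plan = [
    ("map-1", 1),        # the statement that needs a serialized map
    ("map-1", 4),
    ("reduce-1", 8),
    ("reduce-1", None),  # no PARALLEL directive on this statement
    ("map-2", None),
]
result = stage_parallelism(plan)
```

Under this rule a single "PARALLEL 1" anywhere in a stage forces the whole stage to run as one task, which is the behavior requested above.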

People

Assignee: Unassigned

Reporter: Christopher Olston

Votes: 0

Watchers: 0

Dates

Created: 20/Nov/08 18:04

Updated: 20/Nov/08 19:51