  1. Pig
  2. PIG-2779

Refactoring the code for setting number of reducers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels:
      None

      Description

      As PIG-2652 observed, the code for setting the number of reducers is currently a little messy. MapReduceOper.requestedParallelism seems to be misused in some places, and now that we support runtime estimation of the number of reducers, the problem is further complicated.

      For example, if we specify parallel 1 for an order-by, the estimated number of reducers is used instead. If we specify parallel 2 while the estimate is 4, the order-by fails with "Illegal partition for Null". If we specify parallel 4 while the estimate is 2, some reducers have nothing to do.
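
      The mismatch described above can be reproduced with a toy model. The sketch below is not Pig code: the class, the method names, and the precedence rule in resolveParallelism (explicit request first, then a default, then the estimate) are all assumptions made for illustration, and the patches attached to this issue may resolve the count differently. It only shows why a partition plan built for one reducer count breaks, or wastes reducers, when the job runs with another.

```java
public class ReducerCountSketch {

    /**
     * Hypothetical policy: resolve the reducer count once and reuse it
     * everywhere. Values <= 0 mean "not set". The precedence order is an
     * assumption, not taken from the attached patches.
     */
    static int resolveParallelism(int requested, int defaultParallel, int estimated) {
        if (requested > 0) return requested;
        if (defaultParallel > 0) return defaultParallel;
        return Math.max(1, estimated);
    }

    /** Toy range partitioner: maps a key in [0, 100) to one of planPartitions buckets. */
    static int partition(int key, int planPartitions) {
        return key * planPartitions / 100;
    }

    /** A partition index is only usable if a reducer with that index exists. */
    static boolean isLegal(int partition, int numReducers) {
        return partition >= 0 && partition < numReducers;
    }

    /** Returns the first key in [0, 100) sent to a missing reducer, or -1 if none. */
    static int firstIllegalKey(int planPartitions, int numReducers) {
        for (int key = 0; key < 100; key++) {
            if (!isLegal(partition(key, planPartitions), numReducers)) {
                return key;
            }
        }
        return -1;
    }

    /** Counts reducers that never receive a key in [0, 100). */
    static int idleReducers(int planPartitions, int numReducers) {
        boolean[] used = new boolean[numReducers];
        for (int key = 0; key < 100; key++) {
            int p = partition(key, planPartitions);
            if (isLegal(p, numReducers)) {
                used[p] = true;
            }
        }
        int idle = 0;
        for (boolean u : used) {
            if (!u) idle++;
        }
        return idle;
    }

    public static void main(String[] args) {
        // Plan built for the estimate (4) while the job runs with parallel 2.
        System.out.println("first illegal key: " + firstIllegalKey(4, 2));
        // Plan built for the estimate (2) while the job runs with parallel 4.
        System.out.println("idle reducers: " + idleReducers(2, 4));
        // With one resolved count used for both, neither problem occurs.
        int n = resolveParallelism(2, -1, 4);
        System.out.println("resolved: " + n + ", first illegal key: " + firstIllegalKey(n, n));
    }
}
```

      In this model, using one resolved value for both the partition plan and the job's reducer count removes both failure modes, which is the kind of consistency the refactoring is after.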

        Attachments

        1. PIG-2779.0.patch
          9 kB
          Jie Li
        2. PIG-2779.1.patch
          18 kB
          Jie Li
        3. PIG-2779.2.patch
          18 kB
          Jie Li
        4. PIG-2779.3.patch
          20 kB
          Jie Li
        5. PIG-2779.4.patch
          20 kB
          Jie Li
        6. PIG-2779.5.patch
          31 kB
          Bill Graham
        7. PIG-2779.6.patch
          31 kB
          Bill Graham
        8. TestNumberOfReducers.java
          4 kB
          Jie Li
        9. TestNumberOfReducers.java
          4 kB
          Jie Li

              People

              • Assignee:
                jay23jack Jie Li
                Reporter:
                jay23jack Jie Li
