Pig / PIG-1648

Split combination may return too many block locations to map/reduce framework

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None

      Description

      For instance, suppose one small split has block locations h1, h2 and h3, and another small split has h1, h3 and h4. After combination, the composite split reports 4 block locations. If the number of component splits is large, the number of block locations can be large too. However, the block locations serve only as a hint to M/R about the best hosts on which to run this composite split, so the list should be short, say 5 entries, and should name the hosts that hold the most data in this composite split.
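The selection described above can be sketched as follows. This is a minimal illustration, not the code from the attached patch: the class `TopLocationsSketch`, the `ComponentSplit` type, the method `topLocations`, and the split sizes (100 and 300 bytes) are all assumptions made for the example. It sums the bytes each host holds across the component splits and keeps only the hosts with the most data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from Pig or the patch.
final class TopLocationsSketch {

    /** One component split: its size in bytes and the hosts that store it. */
    static final class ComponentSplit {
        final long length;
        final String[] hosts;

        ComponentSplit(long length, String... hosts) {
            this.length = length;
            this.hosts = hosts;
        }
    }

    /** Returns at most maxHosts host names, ordered by bytes held, largest first. */
    static List<String> topLocations(List<ComponentSplit> splits, int maxHosts) {
        // Credit every host of a component split with that split's full length.
        Map<String, Long> bytesPerHost = new HashMap<>();
        for (ComponentSplit split : splits) {
            for (String host : split.hosts) {
                bytesPerHost.merge(host, split.length, Long::sum);
            }
        }
        List<Map.Entry<String, Long>> entries = new ArrayList<>(bytesPerHost.entrySet());
        // Sort by bytes descending; break ties by host name for a deterministic result.
        entries.sort((a, b) -> {
            int byBytes = Long.compare(b.getValue(), a.getValue());
            return byBytes != 0 ? byBytes : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : entries) {
            if (result.size() == maxHosts) {
                break;
            }
            result.add(entry.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        // The two small splits from the description, with made-up sizes.
        List<ComponentSplit> splits = Arrays.asList(
                new ComponentSplit(100, "h1", "h2", "h3"),
                new ComponentSplit(300, "h1", "h3", "h4"));
        System.out.println(topLocations(splits, 2)); // prints [h1, h3]
    }
}
```

With the limit set to 5, all four hosts in this small example would still be returned; the cap only matters once a composite split spans more hosts than the limit.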

        Activity

        Yan Zhou created issue -
        Yan Zhou added a comment -

        The top 5 locations with the most data will be used. This has been agreed upon by the M/R dev.

        Yan Zhou made changes -
        Field Original Value New Value
        Attachment PIG-1648.patch [ 12455851 ]
        Yan Zhou added a comment -

        test-patch results:

        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 3 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

        test-core tests pass too.

        Richard Ding added a comment -

        +1

        Yan Zhou made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Yan Zhou added a comment -

        Patch committed to both trunk and the 0.8 branch.

        Yan Zhou made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Thejas M Nair added a comment -

        (Some clarification about what this patch does, after discussing with Yan.)
        This change does not alter the number of splits combined together; it only alters the number of block locations sent to MR. As the description says, the MR framework uses this information only for deciding where to schedule the map job.

        Olga Natkovich made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Yan Zhou
            Reporter:
            Yan Zhou
          • Votes:
            0
            Watchers:
            0
