Pig / PIG-954

Skewed join fails when pig.skewedjoin.reduce.memusage is not configured

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels:
      None

      Description

      A query that uses a skewed join fails if pig.skewedjoin.reduce.memusage is not configured.

      1. PIG-954.patch2 (5 kB, Ying He)
      2. PIG-954.patch (3 kB, Ying He)

          Activity

          Transition                   Time In Source Status  Execution Times  Last Executer   Last Execution Date
          Open -> Patch Available      21h 34m                1                Olga Natkovich  11/Sep/09 20:44
          Patch Available -> Resolved  2h 50m                 1                Olga Natkovich  11/Sep/09 23:34
          Resolved -> Closed           193d 22h 38m           1                Alan Gates      24/Mar/10 22:13
          Alan Gates made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Olga Natkovich made changes -
          Assignee Ying He [ yinghe ]
          Olga Natkovich made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Fix Version/s 0.4.0 [ 12314042 ]
          Resolution Fixed [ 1 ]
          Olga Natkovich added a comment -

          Patch committed. Thanks, Ying, for a quick fix!

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419336/PIG-954.patch2
          against trunk revision 814016.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/5/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/5/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/5/console

          This message is automatically generated.

          Ying He made changes -
          Description
            Original value:
              Fragmented replicated join has a few limitations:
               - One of the tables needs to be loaded into memory
               - Join is limited to two tables

              Skewed join partitions the table and joins the records in the reduce phase. It computes a histogram of the key space to account for skewing in the input records. Further, it adjusts the number of reducers depending on the key distribution.

              We need to implement the skewed join in pig.
            New value:
              query fails if pig.skewedjoin.reduce.memusage is not configured.
          Olga Natkovich added a comment -

          +1 on the code changes. We need to wait for the test results.

          Olga Natkovich made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Ying He made changes -
          Attachment PIG-954.patch2 [ 12419336 ]
          Ying He added a comment -

          Added a JUnit test.

          Ying He made changes -
          Attachment PIG-954.patch [ 12419236 ]
          Ying He made changes -
          Attachment PIG-954.patch [ 12419242 ]
          Ying He added a comment -

          Use a final variable to define the default value of pig.skewedjoin.reduce.memusage.
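
The approach described in these comments (fall back to a default, held in a final variable, when the property is missing) might look roughly like the sketch below. This is not the code from PIG-954.patch: the class name, the constant names, and the default value of 0.5 are all assumptions made for illustration. Only the property key comes from this issue.

```java
import java.util.Properties;

public class SkewedJoinMemUsage {
    // Property key named in this issue.
    static final String MEMUSAGE_KEY = "pig.skewedjoin.reduce.memusage";

    // Hypothetical constant; 0.5 is an assumed default, not taken from the patch.
    static final String DEFAULT_MEMUSAGE = "0.5";

    // Returns the configured fraction, or the default when the key is absent,
    // so a missing property no longer causes a failure at parse time.
    static float reduceMemUsage(Properties props) {
        return Float.parseFloat(props.getProperty(MEMUSAGE_KEY, DEFAULT_MEMUSAGE));
    }

    public static void main(String[] args) {
        // Property not configured: the default is used.
        Properties unset = new Properties();
        System.out.println(reduceMemUsage(unset));

        // Property configured: the configured value wins.
        Properties configured = new Properties();
        configured.setProperty(MEMUSAGE_KEY, "0.3");
        System.out.println(reduceMemUsage(configured));
    }
}
```

Keeping the default in a single final constant means the sampling job and any other reader of the property can share one value instead of repeating a literal.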

          Ying He made changes -
          Attachment PIG-954.patch [ 12419236 ]
          Ying He added a comment -

          Use the default value if pig.skewedjoin.reduce.memusage is not configured in the Pig properties file.

          Ying He added a comment -

          The sampling job fails when pig.skewedjoin.reduce.memusage is not configured in the Pig properties file.

          Ying He made changes -
          Field Original Value New Value
          Link This issue is a clone of PIG-792 [ PIG-792 ]
          Ying He created issue -

            People

            • Assignee: Ying He
            • Reporter: Ying He
            • Votes: 0
            • Watchers: 0
