Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None

      Description

      Currently, the values for the replicated side of the data are placed in a spillable bag (POFRJoin near line 275). This does not make sense, because the whole point of the optimization is that the data on one side fits into memory. We already have a non-spillable bag implementation (NonSpillableDataBag.java), and the FRJoin code needs to be changed to use it. We also need to do a lot of testing to make sure that we do not spill, but die instead, when we run out of memory.
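The following is a minimal sketch of the idea, not the actual POFRJoin code. It assumes the replicated side is loaded into a hash table keyed by the join key, with each key mapping to a bag that lives only on the heap. The names `ReplicatedSideSketch`, `InMemoryBag`, and `buildTable` are hypothetical stand-ins and are not Pig classes; tuples are modeled as plain `Object[]` arrays.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicatedSideSketch {

    /**
     * Hypothetical stand-in for a non-spillable bag: tuples are held only in
     * memory, so an oversized input ends in OutOfMemoryError instead of a
     * spill to disk.
     */
    static final class InMemoryBag {
        private final List<Object[]> tuples = new ArrayList<>();

        void add(Object[] tuple) {
            tuples.add(tuple);
        }

        int size() {
            return tuples.size();
        }

        List<Object[]> tuples() {
            return tuples;
        }
    }

    /** Groups the replicated input by the field at keyIndex. */
    static Map<Object, InMemoryBag> buildTable(Iterable<Object[]> replicated, int keyIndex) {
        Map<Object, InMemoryBag> table = new HashMap<>();
        for (Object[] tuple : replicated) {
            table.computeIfAbsent(tuple[keyIndex], k -> new InMemoryBag()).add(tuple);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Object[]> replicated = new ArrayList<>();
        replicated.add(new Object[] {"a", 1});
        replicated.add(new Object[] {"a", 2});
        replicated.add(new Object[] {"b", 3});

        Map<Object, InMemoryBag> table = buildTable(replicated, 0);
        System.out.println("tuples for key a: " + table.get("a").size());
        System.out.println("tuples for key b: " + table.get("b").size());
    }
}
```

The design point is that nothing in this structure registers with a spill mechanism, so memory pressure surfaces as a task failure rather than as silent disk I/O that defeats the purpose of the join.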

        Activity

        Ankit Modi added a comment -

        This patch does not have any tests. Writing one would mean creating a large file (about 250 MB) and running the join against it.

        I have run some tests in a similar fashion.

        Ankit Modi added a comment -

        The tests I ran used two files.

        File format:
        f1: random chararray(100)
        f2: random int

        The left-side file contained 100 tuples and the right-side file contained 3 million tuples.
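Input files of this shape could be produced with a small generator along the following lines. This is only a sketch under stated assumptions: the class name `FrJoinTestData`, the lowercase-letter alphabet for the key, and the tab delimiter (the default field separator of Pig's PigStorage loader) are choices made here, not details taken from the original test.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;

public class FrJoinTestData {

    /** Length of the random chararray field (f1). */
    static final int KEY_LENGTH = 100;

    /** Builds a random string of KEY_LENGTH lowercase letters (assumed alphabet). */
    static String randomKey(Random rng) {
        StringBuilder sb = new StringBuilder(KEY_LENGTH);
        for (int i = 0; i < KEY_LENGTH; i++) {
            sb.append((char) ('a' + rng.nextInt(26)));
        }
        return sb.toString();
    }

    /** Writes count lines of the form key<TAB>value, with value a random int (f2). */
    static void writeTuples(Path out, long count, Random rng) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (long i = 0; i < count; i++) {
                writer.write(randomKey(rng));
                writer.write('\t');
                writer.write(Integer.toString(rng.nextInt()));
                writer.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: FrJoinTestData <output-file> <tuple-count>");
            return;
        }
        writeTuples(Paths.get(args[0]), Long.parseLong(args[1]), new Random());
    }
}
```

For example, `java FrJoinTestData leftsidefrjoin.txt 100` and `java FrJoinTestData rightsidefrjoin.txt 3000000` would give files with the tuple counts described above. Independently generated random keys will almost never match across the two files, which does not matter for exercising memory use while the replicated side is loaded.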

        Code

        A = load 'leftsidefrjoin.txt' as ( key, value);
        B = load 'rightsidefrjoin.txt' as (key, value);
        C = join A by key left, B by key using "repl";
        --- Fragmented input and replicated input
        store C into 'output';
        

        This generated the following error:

        FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
        	at java.util.ArrayList.<init>(ArrayList.java:112)
        	at org.apache.pig.data.DefaultTuple.<init>(DefaultTuple.java:63)
        	at org.apache.pig.data.DefaultTupleFactory.newTuple(DefaultTupleFactory.java:35)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.constructLROutput(POLocalRearrange.java:369)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:288)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.setUpHashMap(POFRJoin.java:351)
        	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.getNext(POFRJoin.java:211)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:250)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:241)
        	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
        	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        	at org.apache.hadoop.mapred.Child.main(Child.java:170)
        

        I ran the same job with the same records on the left-hand side and 100K records on the right-hand side. The job completed successfully.

        Ankit Modi added a comment -

        This patch does not have any unit tests.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12427716/frjoin-nonspill.patch
        against trunk revision 889346.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/118/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/118/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/118/console

        This message is automatically generated.

        Olga Natkovich added a comment -

        The test failures are not due to this patch. Also, I don't believe this is easy to cover with an automated test, but I believe Ankit tested it manually.

        I will review the code and run test-commit + FRJoin tests before committing the patch.

        Olga Natkovich added a comment -

        Patch committed. Thanks, Ankit!


          People

          • Assignee:
            Ankit Modi
            Reporter:
            Olga Natkovich
          • Votes:
            0
            Watchers:
            2

            Dates

            • Created:
              Updated:
              Resolved:

              Development