Pig
  1. Pig
  2. PIG-1217

[piggybank] evaluation.util.Top is broken

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.3.0, 0.4.0, site, 0.5.0, 0.6.0, 0.7.0
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None

      Description

      The Top udf has been broken for a while, due to an incorrect implementation of getArgToFuncMapping.

      1. fix_top_udf.diff
        20 kB
        Dmitriy V. Ryaboy
      2. fix_top_udf.diff
        21 kB
        Dmitriy V. Ryaboy
      3. fix_top_udf.diff
        21 kB
        Dmitriy V. Ryaboy

        Activity

        Hide
        Alan Gates added a comment -

        Removed line 72 of the test per Dmitriy request. With that change, tests went fine. Patch checked in.

        Show
        Alan Gates added a comment - Removed line 72 of the test per Dmitriy request. With that change, tests went fine. Patch checked in.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Oops, right, that check is actually no longer valid based on what you said about how the data gets fed into Initial.
        Can you delete line 72 of TestTop and rerun?
        Or I can repost a patch.

        Show
        Dmitriy V. Ryaboy added a comment - Oops, right, that check is actually no longer valid based on what you said about how the data gets fed into Initial. Can you delete line 72 of TestTop and rerun? Or I can repost a patch.
        Hide
        Alan Gates added a comment -

        When I ran the piggybank tests with this patch I got:

        Testsuite: org.apache.pig.piggybank.test.evaluation.util.TestTop
        Tests run: 2, Failures: 1, Errors: 0, Time elapsed: 0.044 sec
        
        Testcase: testTopExec took 0.017 sec
        Testcase: testTopAlgebraic took 0.006 sec
            FAILED
        Value 0 exceeded the expected limit
        junit.framework.AssertionFailedError: Value 0 exceeded the expected limit
            at org.apache.pig.piggybank.test.evaluation.util.TestTop.checkItemsGT(Unknown Source)
            at org.apache.pig.piggybank.test.evaluation.util.TestTop.testTopAlgebraic(Unknown Source)
        
        Show
        Alan Gates added a comment - When I ran the piggybank tests with this patch I got: Testsuite: org.apache.pig.piggybank.test.evaluation.util.TestTop Tests run: 2, Failures: 1, Errors: 0, Time elapsed: 0.044 sec Testcase: testTopExec took 0.017 sec Testcase: testTopAlgebraic took 0.006 sec FAILED Value 0 exceeded the expected limit junit.framework.AssertionFailedError: Value 0 exceeded the expected limit at org.apache.pig.piggybank.test.evaluation.util.TestTop.checkItemsGT(Unknown Source) at org.apache.pig.piggybank.test.evaluation.util.TestTop.testTopAlgebraic(Unknown Source)
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12435416/fix_top_udf.diff
        against trunk revision 908324.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/207/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/207/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/207/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12435416/fix_top_udf.diff against trunk revision 908324. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/207/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/207/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/207/console This message is automatically generated.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Simplified Initial per Alan's comments (just returning the tuple doesn't work, btw).
        Also made it a bit safer around nulls.

        Show
        Dmitriy V. Ryaboy added a comment - Simplified Initial per Alan's comments (just returning the tuple doesn't work, btw). Also made it a bit safer around nulls.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12434839/fix_top_udf.diff
        against trunk revision 906326.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/200/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/200/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/200/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12434839/fix_top_udf.diff against trunk revision 906326. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/200/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/200/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/200/console This message is automatically generated.
        Hide
        Dmitriy V. Ryaboy added a comment -

        I see, thanks for the tip. How does this work with tuple reuse – can I just return the input tuple, or do I need to copy the contents to a new tuple in Top.Initial() ?

        No worries about 0.6, I'd rather it finally go out than try to get something like this in at the last moment.

        Show
        Dmitriy V. Ryaboy added a comment - I see, thanks for the tip. How does this work with tuple reuse – can I just return the input tuple, or do I need to copy the contents to a new tuple in Top.Initial() ? No worries about 0.6, I'd rather it finally go out than try to get something like this in at the last moment.
        Hide
        Alan Gates added a comment -

        In general, looks good. A comment on Top.Initial. If you do something like

        B = group A ...
        C = foreach B generate myudf(A);

        and myudf is algebraic, you are guaranteed to only get one record at a time in the Initial function because Pig doesn't do any collecting of the keys. That is, even if ten records in a row have the same key Pig won't detect that and collate them into the bag before calling Initial. We take advantage of that in a number of the built in functions (eg COUNT) to make the processing of Initial easier. You may want to do the same here.

        As far as getting it into 0.6 release, I think Olga was trying to roll the package today or tomorrow, so we may be out of time.

        Show
        Alan Gates added a comment - In general, looks good. A comment on Top.Initial. If you do something like B = group A ... C = foreach B generate myudf(A); and myudf is algebraic, you are guaranteed to only get one record at a time in the Initial function because Pig doesn't do any collecting of the keys. That is, even if ten records in a row have the same key Pig won't detect that and collate them into the bag before calling Initial. We take advantage of that in a number of the built in functions (eg COUNT) to make the processing of Initial easier. You may want to do the same here. As far as getting it into 0.6 release, I think Olga was trying to roll the package today or tomorrow, so we may be out of time.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Huh. I wonder what Hudson tested – I accidentally attached an old version of the unit test, which doesn't even compile with the new Top. But Hudson passed contrib tests, and managed to fail on core tests.

        Show
        Dmitriy V. Ryaboy added a comment - Huh. I wonder what Hudson tested – I accidentally attached an old version of the unit test, which doesn't even compile with the new Top. But Hudson passed contrib tests, and managed to fail on core tests.
        Hide
        Dmitriy V. Ryaboy added a comment -

        The test failures appear to be unrelated to this change. Please review.

        Show
        Dmitriy V. Ryaboy added a comment - The test failures appear to be unrelated to this change. Please review.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12434746/fix_top_udf.diff
        against trunk revision 906326.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/198/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/198/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/198/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12434746/fix_top_udf.diff against trunk revision 906326. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/198/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/198/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/198/console This message is automatically generated.
        Hide
        Dmitriy V. Ryaboy added a comment -

        a patch that actually works

        Show
        Dmitriy V. Ryaboy added a comment - a patch that actually works
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12434689/fix_top_udf.diff
        against trunk revision 905377.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/189/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/189/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/189/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12434689/fix_top_udf.diff against trunk revision 905377. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/189/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/189/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/189/console This message is automatically generated.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Attaching a version for pig 0.5

        Show
        Dmitriy V. Ryaboy added a comment - Attaching a version for pig 0.5
        Hide
        Dmitriy V. Ryaboy added a comment -

        Attached patch fixes the issue.

        While I was in there, I also made it algebraic, and reduced the amount of reflection this function required (it used to call instanceof twice for every comparison operation, several of which may be needed for every tuple).

        This patch should apply to 0.6 as well.

        Show
        Dmitriy V. Ryaboy added a comment - Attached patch fixes the issue. While I was in there, I also made it algebraic, and reduced the amount of reflection this function required (it used to call instanceof twice for every comparison operation, several of which may be needed for every tuple). This patch should apply to 0.6 as well.

          People

          • Assignee:
            Dmitriy V. Ryaboy
            Reporter:
            Dmitriy V. Ryaboy
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development