Pig
  1. Pig
  2. PIG-1428

Make a StatusReporter singleton available for incrementing counters

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available
    • Tags:
      pig-0.7.1

      Description

      Without this getter method, its not possible to get counters, report progress etc. from UDFs.

      1. npe.patch
        0.7 kB
        Daniel Dai
      2. PIG-1428.patch
        8 kB
        Dmitriy V. Ryaboy
      3. PIG-1428.patch
        3 kB
        Dmitriy V. Ryaboy
      4. PIG-1428.patch
        3 kB
        Dmitriy V. Ryaboy

        Activity

        Hide
        Dmitriy V. Ryaboy added a comment -

        No tests, as this is trivial.

        Show
        Dmitriy V. Ryaboy added a comment - No tests, as this is trivial.
        Hide
        Dmitriy V. Ryaboy added a comment -

        please review if this gets no -1s other than lack of tests.

        Show
        Dmitriy V. Ryaboy added a comment - please review if this gets no -1s other than lack of tests.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12445985/PIG-1428.patch
        against trunk revision 949057.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.

        -1 release audit. The applied patch generated 386 release audit warnings (more than the trunk's current 385 warnings).

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h1.grid.sp2.yahoo.net/17/testReport/
        Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h1.grid.sp2.yahoo.net/17/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h1.grid.sp2.yahoo.net/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h1.grid.sp2.yahoo.net/17/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12445985/PIG-1428.patch against trunk revision 949057. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs warnings. -1 release audit. The applied patch generated 386 release audit warnings (more than the trunk's current 385 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h1.grid.sp2.yahoo.net/17/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h1.grid.sp2.yahoo.net/17/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h1.grid.sp2.yahoo.net/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h1.grid.sp2.yahoo.net/17/console This message is automatically generated.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Findbugs is quite right to call me out on the synchronization thing. I am not sure why the setter needs to by synchronized; I am even less sure the getter should be. Seems like this would add one more lock every time we want to increment a counter or write a log line, which is unfortunate (I assume those objects handle their own concurrency issues). Can Richard or Pradeep comment on that?

        Show
        Dmitriy V. Ryaboy added a comment - Findbugs is quite right to call me out on the synchronization thing. I am not sure why the setter needs to by synchronized; I am even less sure the getter should be. Seems like this would add one more lock every time we want to increment a counter or write a log line, which is unfortunate (I assume those objects handle their own concurrency issues). Can Richard or Pradeep comment on that?
        Hide
        Richard Ding added a comment -

        Remove the keyword synchronized from the getter will fix this findbugs warning. Setters are called only once per map/reduce task (during the setup).

        Show
        Richard Ding added a comment - Remove the keyword synchronized from the getter will fix this findbugs warning. Setters are called only once per map/reduce task (during the setup).
        Hide
        Dmitriy V. Ryaboy added a comment -

        removed the synchronized keyword

        Show
        Dmitriy V. Ryaboy added a comment - removed the synchronized keyword
        Hide
        Dmitriy V. Ryaboy added a comment -

        I notice that the issue has been discussed before in PIG-889, and Santosh argued (convincingly) that adding this method to PigLogger might not make sense. Santosh, would you like to suggest a different place to put this functionality? I am not married to using this method, it's just the path of least resistance.

        Show
        Dmitriy V. Ryaboy added a comment - I notice that the issue has been discussed before in PIG-889 , and Santosh argued (convincingly) that adding this method to PigLogger might not make sense. Santosh, would you like to suggest a different place to put this functionality? I am not married to using this method, it's just the path of least resistance.
        Hide
        Dmitriy V. Ryaboy added a comment -

        trying to tickle hudson

        Show
        Dmitriy V. Ryaboy added a comment - trying to tickle hudson
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12446095/PIG-1428.patch
        against trunk revision 952098.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        -1 release audit. The applied patch generated 383 release audit warnings (more than the trunk's current 382 warnings).

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/testReport/
        Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12446095/PIG-1428.patch against trunk revision 952098. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 383 release audit warnings (more than the trunk's current 382 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/332/console This message is automatically generated.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12446095/PIG-1428.patch
        against trunk revision 949057.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        -1 release audit. The applied patch generated 379 release audit warnings (more than the trunk's current 378 warnings).

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/318/testReport/
        Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/318/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/318/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/318/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12446095/PIG-1428.patch against trunk revision 949057. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 379 release audit warnings (more than the trunk's current 378 warnings). -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/318/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/318/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/318/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/318/console This message is automatically generated.
        Hide
        Ashutosh Chauhan added a comment -

        So, I read through PIG-889. It seems that there never was a documented way to use counters, reporters etc from UDFs, Load/Store Funcs. Actually, there is a hacky way to do it, which exists in DefaultAbstractBag.java

            protected void incSpillCount(Enum counter) {
                // Increment the spill count
                // warn is a misnomer. The function updates the counter. If the update
                // fails, it dumps a warning
                PigHadoopLogger.getInstance().warn(this, "Spill counter incremented", counter);
            }
        

        But in PIG-889 Santhosh has argued against for this (mis)use of PigLogger. I think we need to provide a formal way to Pig users to access counters, reporters from our interfaces (UDFs, L/S) as PigHadoopLogger is designed for error-handling (warning aggregation in particular) and not for this purpose. And we shall mark this class as Internal only, before some one starts using it. With the same argument, above method where Pig is internally making use of its own Counters is flawed and needs to be corrected.

        Show
        Ashutosh Chauhan added a comment - So, I read through PIG-889 . It seems that there never was a documented way to use counters, reporters etc from UDFs, Load/Store Funcs. Actually, there is a hacky way to do it, which exists in DefaultAbstractBag.java protected void incSpillCount(Enum counter) { // Increment the spill count // warn is a misnomer. The function updates the counter. If the update // fails, it dumps a warning PigHadoopLogger.getInstance().warn( this , "Spill counter incremented" , counter); } But in PIG-889 Santhosh has argued against for this (mis)use of PigLogger. I think we need to provide a formal way to Pig users to access counters, reporters from our interfaces (UDFs, L/S) as PigHadoopLogger is designed for error-handling (warning aggregation in particular) and not for this purpose. And we shall mark this class as Internal only, before some one starts using it. With the same argument, above method where Pig is internally making use of its own Counters is flawed and needs to be corrected.
        Hide
        Ashutosh Chauhan added a comment -

        I propose a slightly different approach here. Instead of adding getPigStatusReporter() to PigLogger() interface and the corresponding implementation in PigHadoopLogger, we can add a static singleton method in PigStatusReporter and also add a setContext( TaskInputOutputContext context) We can then set the context in map() and reduce() functions and users will have full access of the reporter object through the static method. This will allow us to keep error logging different then status reporting.

        Show
        Ashutosh Chauhan added a comment - I propose a slightly different approach here. Instead of adding getPigStatusReporter() to PigLogger() interface and the corresponding implementation in PigHadoopLogger, we can add a static singleton method in PigStatusReporter and also add a setContext( TaskInputOutputContext context) We can then set the context in map() and reduce() functions and users will have full access of the reporter object through the static method. This will allow us to keep error logging different then status reporting.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Once more, with feeling.
        This implements Ashutosh's suggestion of making PigStatusReporter maintain a singleton and expose a public getInstance() method.

        Show
        Dmitriy V. Ryaboy added a comment - Once more, with feeling. This implements Ashutosh's suggestion of making PigStatusReporter maintain a singleton and expose a public getInstance() method.
        Hide
        Richard Ding added a comment -

        +1

        Show
        Richard Ding added a comment - +1
        Hide
        Dmitriy V. Ryaboy added a comment -

        Committed to trunk.
        We may want to consider this for a 0.7.1, if such a thing comes about, as in a sense it's addressing a regression.

        I tagged this issue with "pig-0.7.1" so we can find it later if we decide a dot-release is warranted.

        Show
        Dmitriy V. Ryaboy added a comment - Committed to trunk. We may want to consider this for a 0.7.1, if such a thing comes about, as in a sense it's addressing a regression. I tagged this issue with "pig-0.7.1" so we can find it later if we decide a dot-release is warranted.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Committed to 0.7.

        Show
        Dmitriy V. Ryaboy added a comment - Committed to 0.7.
        Hide
        Olga Natkovich added a comment -

        Looks like the change broke unit tests. Hudson only runs tests against the trunk so we need to run the manually against other branches

        Show
        Olga Natkovich added a comment - Looks like the change broke unit tests. Hudson only runs tests against the trunk so we need to run the manually against other branches
        Hide
        Dmitriy V. Ryaboy added a comment -

        I ran the new and changed tests manually before committing, but not the whole set (didn't have 12 hours to spare). Which tests are failing for you?

        Show
        Dmitriy V. Ryaboy added a comment - I ran the new and changed tests manually before committing, but not the whole set (didn't have 12 hours to spare). Which tests are failing for you?
        Hide
        Olga Natkovich added a comment -

        Daniel will publish tests that failed. We do need to make sure that we run all the tests especially on branches before committing because branches need to stay stable.

        Show
        Olga Natkovich added a comment - Daniel will publish tests that failed. We do need to make sure that we run all the tests especially on branches before committing because branches need to stay stable.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Found the culprit, will commit fix within ~ 20 mins assuming tests pass.

        Show
        Dmitriy V. Ryaboy added a comment - Found the culprit, will commit fix within ~ 20 mins assuming tests pass.
        Hide
        Daniel Dai added a comment -

        TestDataBag for 0.7 branch fail after check in. Attach the patch. This fix is already checked in.

        Show
        Daniel Dai added a comment - TestDataBag for 0.7 branch fail after check in. Attach the patch. This fix is already checked in.
        Hide
        Dmitriy V. Ryaboy added a comment -

        yeah that's the patch I have, verbatim. Sorry about breaking the build again.

        Show
        Dmitriy V. Ryaboy added a comment - yeah that's the patch I have, verbatim. Sorry about breaking the build again.

          People

          • Assignee:
            Dmitriy V. Ryaboy
            Reporter:
            Ashutosh Chauhan
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development