Pig
  1. Pig
  2. PIG-1955

PhysicalOperator has a member variable (non-static) Log object that is non-transient, this causes serialization errors

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0, 0.9.0
    • Fix Version/s: 0.9.0
    • Component/s: impl
    • Labels:
      None
    • Patch Info:
      Patch Available
    • Hadoop Flags:
      Reviewed
    • Release Note:
      fix Serialzeable bug in PhysicalOperator, mark log as transient

      Description

      I found this while trying to write unit tests. Creating a local PigServer to test my LoadFunc caused a serialization of the PhysicalOperator class, which failed due to:
      ..
      Caused by: java.io.NotSerializableException: org.apache.commons.logging.impl.Log4JCategoryLog
      ..

      this is easily fixed by adding the transient keyword to the definition of log.

      e.g.

      on trunk:
      private final transient Log log = LogFactory.getLog(getClass());
      on the 0.8 tag:
      private transient Log log = LogFactory.getLog(getClass());

      1. 1955-static-0_8_1.patch
        20 kB
        Jeremy Hanna
      2. 1955-static-0_8_1.patch
        20 kB
        Jeremy Hanna
      3. 1955-static.patch
        20 kB
        Woody Anderson
      4. 1955-po.patch
        21 kB
        Woody Anderson
      5. 1955.patch
        0.7 kB
        Woody Anderson

        Activity

        Hide
        Woody Anderson added a comment -

        this doesn't have all of the unit test trimmings, but is that all really needed to mark a logger as transient?

        Show
        Woody Anderson added a comment - this doesn't have all of the unit test trimmings, but is that all really needed to mark a logger as transient?
        Hide
        Woody Anderson added a comment -

        Ok. Unfortunately, this issue is more pervasive than i originally thought.

        The 'simple' fix that is attached makes the PO logger transient protected and removes the loggers from all subclasses which are defined often incorrectly (non-transient members) and inconsistently.

        Personally, when i define loggers i always make them private and STATIC so that there is no getClass() call. this makes finding the class where the log line resides in source code much simpler to find. I dislike loggers that define themselves with getClass() b/c logging code in A.java will report as class B in output if class B extends class A.

        I did not change this behavior b/c perhaps someone has their reasons for doing what they did. I did however remove some of the static loggers simply to ensure consistency (the majority were done with member variables).

        The change to static private is also not such a big deal if anyone agrees we should consistently go that way instead.

        Show
        Woody Anderson added a comment - Ok. Unfortunately, this issue is more pervasive than i originally thought. The 'simple' fix that is attached makes the PO logger transient protected and removes the loggers from all subclasses which are defined often incorrectly (non-transient members) and inconsistently. Personally, when i define loggers i always make them private and STATIC so that there is no getClass() call. this makes finding the class where the log line resides in source code much simpler to find. I dislike loggers that define themselves with getClass() b/c logging code in A.java will report as class B in output if class B extends class A. I did not change this behavior b/c perhaps someone has their reasons for doing what they did. I did however remove some of the static loggers simply to ensure consistency (the majority were done with member variables). The change to static private is also not such a big deal if anyone agrees we should consistently go that way instead.
        Hide
        Richard Ding added a comment -

        I think we should make loggers static (and final). The problem with transient is that it doesn't work the way you expect. As it is transient, the logger would not be serialized, and when the object is deserialized, the logger is initialized to null (default value).

        Show
        Richard Ding added a comment - I think we should make loggers static (and final). The problem with transient is that it doesn't work the way you expect. As it is transient, the logger would not be serialized, and when the object is deserialized, the logger is initialized to null (default value).
        Hide
        Woody Anderson added a comment -

        Agreed, i prefer the static approach.

        Show
        Woody Anderson added a comment - Agreed, i prefer the static approach.
        Hide
        Dmitriy V. Ryaboy added a comment -

        Guys, does this look like a manifestation of the same bug? Email on the pig user list: http://pig.markmail.org/thread/4s76qv3eo6ivszee

        Show
        Dmitriy V. Ryaboy added a comment - Guys, does this look like a manifestation of the same bug? Email on the pig user list: http://pig.markmail.org/thread/4s76qv3eo6ivszee
        Hide
        Richard Ding added a comment -

        It's different. Log4JLogger implements Serializable interface. The error in the email happened during the deserialization. It looks like a class mismatch between client and server.

        Show
        Richard Ding added a comment - It's different. Log4JLogger implements Serializable interface. The error in the email happened during the deserialization. It looks like a class mismatch between client and server.
        Hide
        Richard Ding added a comment -

        Patch looks good. I'll commit the patch after running test-patch and unit tests.

        Show
        Richard Ding added a comment - Patch looks good. I'll commit the patch after running test-patch and unit tests.
        Hide
        Richard Ding added a comment -

        Test-patch result:

             [exec] -1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
             [exec]                         Please justify why no tests are needed for this patch.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        

        Unit tests pass (with update of one golden file).

        Show
        Richard Ding added a comment - Test-patch result: [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no tests are needed for this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Unit tests pass (with update of one golden file).
        Hide
        Richard Ding added a comment -

        Patch committed to trunk. Thanks Woody!

        Show
        Richard Ding added a comment - Patch committed to trunk. Thanks Woody!
        Hide
        Jeremy Hanna added a comment -

        Not sure if there are any updates to 0.8.x planned, but if so, would be nice to backport since it's a pretty clean backport. I can attach an updated patch if so. (There were just a few places where the hunk fails locally, but applying the changes manually worked)

        Show
        Jeremy Hanna added a comment - Not sure if there are any updates to 0.8.x planned, but if so, would be nice to backport since it's a pretty clean backport. I can attach an updated patch if so. (There were just a few places where the hunk fails locally, but applying the changes manually worked)
        Hide
        Alan Gates added a comment -

        I'm not aware of any plans to make a 0.8.2 release, but you can upload a patch here in case others want the same fix.

        Show
        Alan Gates added a comment - I'm not aware of any plans to make a 0.8.2 release, but you can upload a patch here in case others want the same fix.
        Hide
        Jeremy Hanna added a comment -

        Attaching a patch that cleanly applies against 0.8.1. I tested it with what was locally failing and it fixed the problem.

        Show
        Jeremy Hanna added a comment - Attaching a patch that cleanly applies against 0.8.1. I tested it with what was locally failing and it fixed the problem.
        Hide
        Jeremy Hanna added a comment -

        Granting license.

        Show
        Jeremy Hanna added a comment - Granting license.

          People

          • Assignee:
            Woody Anderson
            Reporter:
            Woody Anderson
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development