Hadoop Common / HADOOP-8619

WritableComparator must implement no-arg constructor

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 0.23.0
    • Component/s: io
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      For the reasons listed here: http://findbugs.sourceforge.net/bugDescriptions.html#SE_COMPARATOR_SHOULD_BE_SERIALIZABLE

      comparators should be serializable. For deserialization to work, all superclasses must have a no-arg constructor: http://findbugs.sourceforge.net/bugDescriptions.html#SE_NO_SUITABLE_CONSTRUCTOR

      Simply add a no-arg constructor to WritableComparator.
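The failure mode the description points at can be sketched in plain Java (this is illustrative code, not the Hadoop patch; `Base` stands in for WritableComparator). During deserialization, Java instantiates the object by calling the no-arg constructor of the first non-serializable superclass; if `Base()` below did not exist, `readObject` would throw `java.io.InvalidClassException: no valid constructor`.

```java
import java.io.*;
import java.util.Comparator;

class Base {                                // stands in for WritableComparator
    protected String name;
    Base() { this.name = "default"; }       // the no-arg ctor this issue adds
    Base(String name) { this.name = name; }
}

class KeyComparator extends Base implements Comparator<String>, Serializable {
    private static final long serialVersionUID = 1L;
    KeyComparator() { super("keys"); }
    @Override public int compare(String a, String b) { return a.compareTo(b); }
}

public class NoArgCtorDemo {
    // Serialize, then deserialize, a comparator through an in-memory buffer.
    static KeyComparator roundTrip(KeyComparator c) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(c);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            return (KeyComparator) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        KeyComparator copy = roundTrip(new KeyComparator());
        System.out.println(copy.compare("a", "b") < 0); // true: still orders
        System.out.println(copy.name); // "default": Base() ran; ctor args are not replayed
    }
}
```

Note that the deserialized copy's `name` field comes from `Base()`, not from `KeyComparator()`; superclass constructor arguments are not captured in the stream, which is the point Radim makes later in the thread.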

      1. 8619-0.patch
        4 kB
        Chris Douglas
      2. writable-comparator.txt
        0.9 kB
        Radim Kolar

        Activity

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1180 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1180/)
        HADOOP-8619. WritableComparator must implement no-arg constructor. Contributed by Chris Douglas. (Revision 1378120)

        Result = SUCCESS
        suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378120
        Files :

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/WritableComparator.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/serializer/TestWritableSerialization.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1149 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1149/)
        HADOOP-8619. WritableComparator must implement no-arg constructor. Contributed by Chris Douglas. (Revision 1378120)

        Result = FAILURE
        suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378120
        Files :

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/WritableComparator.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/serializer/TestWritableSerialization.java
        Radim Kolar added a comment -

        I need this on the 2.1.0-alpha branch as well. I plan to test it soon, after I fix a compile error.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #2678 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2678/)
        HADOOP-8619. WritableComparator must implement no-arg constructor. Contributed by Chris Douglas. (Revision 1378120)

        Result = FAILURE
        suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378120
        Files :

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/WritableComparator.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/serializer/TestWritableSerialization.java
        Suresh Srinivas added a comment -

        Committed the patch to 0.23 and branch-2 as well.

        Radim Kolar added a comment -

        I need fix in 0.23 and 2.1.0-alpha

        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #2649 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2649/)
        HADOOP-8619. WritableComparator must implement no-arg constructor. Contributed by Chris Douglas. (Revision 1378120)

        Result = SUCCESS
        suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378120
        Files :

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/WritableComparator.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/serializer/TestWritableSerialization.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #2713 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2713/)
        HADOOP-8619. WritableComparator must implement no-arg constructor. Contributed by Chris Douglas. (Revision 1378120)

        Result = SUCCESS
        suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378120
        Files :

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/WritableComparator.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/serializer/TestWritableSerialization.java
        Suresh Srinivas added a comment -

        +1 for the patch.

        I committed the patch to trunk. Thank you Chris.

        @Radim, Fix Version/s was set to 3.0.0, 2.0.0-alpha and 0.23. I am not sure whether you intended that to indicate that you need this change on those branches. If you did, let me know.

        Also, in the future, use the Target Version/s field to indicate other branches that need a change.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12540087/8619-0.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1275//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1275//console

        This message is automatically generated.

        Steve Loughran added a comment -

        Before committing this, I'm going to add one more bit of homework:

        add a test that serializes then deserializes one of the std. writables.

        Why? Your goal is to ser/deser things; the ctor is just a means to that end. Add a test of the desired behaviour and you can be confident that it doesn't break in the future.
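The behavioural test Steve asks for can be sketched as follows. This is an illustrative stand-alone sketch, not the test that was actually committed (that landed in TestWritableSerialization); it uses a JDK serializable comparator, `String.CASE_INSENSITIVE_ORDER`, as a stand-in for a comparator built on WritableComparator, and asserts that the ordering behaviour survives a serialize/deserialize round trip.

```java
import java.io.*;
import java.util.*;

public class ComparatorRoundTripTest {
    // Push a comparator through Java serialization and back.
    @SuppressWarnings("unchecked")
    static Comparator<String> roundTrip(Comparator<String> cmp) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(cmp);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            return (Comparator<String>) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Comparator<String> copy = roundTrip(String.CASE_INSENSITIVE_ORDER);
        List<String> data = new ArrayList<>(Arrays.asList("b", "A", "c"));
        data.sort(copy);            // the deserialized copy still orders correctly
        System.out.println(data);   // [A, b, c]
    }
}
```

The point of testing the round trip rather than the constructor is that the test keeps passing (or failing) on the property users actually care about, even if the implementation detail changes.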

        Steve Loughran added a comment -

        I don't see this breaking anything, and it lets people choose how they serialise stuff; that's their choice.

        +1

        Radim Kolar added a comment -

        I know how to use the Writable interface. I have written about 110k lines of Hadoop code in the last 8 months. It is unusable beyond easy stuff. If Writable is that good, why are you not using it for serializing communication in YARN? You use Avro and protobuf. Isn't one serialization approach sufficient? It would be, but it is not effective; that's why you are not using Writable everywhere.

        But my point is not to discuss whether Writable is bad. The point is why you are deliberately putting obstacles in the way of using other serialization frameworks. You call it "protection", but it is the kind of protection you get from a communist government. History proved that communist protection failed; it created a sick society where everybody was poor.

        Do not try to lock developers into the one right thing. Give them freedom.

        I do not have infinite time on my hands. I cannot wait years until you guys decide whether to commit these 3 lines of code. I need to act now; my project needs it.

        Brandon Li added a comment -

        Hi Radim,

        Semantically, the patch is a good patch. It's not necessary to apply it, though, for the following reason:

        ...all referenced objects must be deserializable, otherwise there is no way to reconstruct object tree back...

        We are aware of that. One of the reasons the "Writable" interface has been used since the early days is that it is simple and lightweight. If you are interested in how it is used, you can take a look at some examples, such as org.apache.hadoop.io.WritableComparable and the classes implementing it.

        It is usually not necessary to make an interface/class support several serialization approaches when one of them is sufficient for the usage.

        Additionally, leaving it unpatched can prevent developers from using Java Serialization in Hadoop by mistake. That said, we can always come back and apply this patch if Java Serialization becomes a must in the future. Agree?

        Radim Kolar added a comment -

        Java serialization does not capture constructor arguments in the output stream. More detail about the protocol is here: http://docs.oracle.com/javase/6/docs/platform/serialization/spec/protocol.html

        If you define your own constructor, Java will not create the default zero-arg constructor: http://xahlee.info/java-a-day/constructer.html (section "Java Technicality: Default Constructors")
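The language technicality Radim cites can be demonstrated with a few lines of illustrative Java (class names here are made up): the compiler generates the implicit no-arg constructor only for a class that declares no constructor at all, and reflection confirms it.

```java
class NoCtor { }                    // declares nothing: compiler supplies NoCtor()

class OnlyArgCtor {
    OnlyArgCtor(int x) { }          // any explicit ctor suppresses the default one
}

public class DefaultCtorDemo {
    // True iff the class has a declared (or compiler-generated) zero-arg constructor.
    static boolean hasNoArgCtor(Class<?> cls) {
        try {
            cls.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgCtor(NoCtor.class));       // true
        System.out.println(hasNoArgCtor(OnlyArgCtor.class));  // false
    }
}
```

This is why adding `WritableComparator(Class)`-style constructors without also declaring an explicit no-arg one leaves subclasses undeserializable.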

        Radim Kolar added a comment -

        There is a difference between thinking and knowing. To make object serialization work, all referenced objects must be deserializable; otherwise there is no way to reconstruct the object tree. You cannot deserialize an object if you cannot create it. Plain and simple.

        Let's take another approach: Hadoop is using FindBugs, isn't it? That means you have some trust and confidence in the FindBugs authors. Then trust them when they tell you that deserialization does not work if you can't call the constructor.

        Yanbo Liang added a comment -

        @Radim Kolar
        There are different ways to serialize data, including Writable, Thrift, and Protocol Buffers. But I don't think it is necessary to serialize and transport a comparator over the wire: both sides of the communication should have the same code, which includes the comparator code. I think the comparator is part of the code rather than the data.

        Radim Kolar added a comment -

        The Hadoop serialization framework, a.k.a. Writable, is not the only possible way to serialize objects. We use Thrift on top of Hadoop Writables.

        Without a serializable comparator, it is not possible to have serializable collections with ordering.
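Radim's point about ordered collections can be shown concretely with the JDK (illustrative sketch, not Hadoop code): a `TreeMap` writes its comparator during serialization, so a non-serializable comparator makes the whole map throw `NotSerializableException`.

```java
import java.io.*;
import java.util.*;

public class TreeMapDemo {
    // True iff the object survives a pass through ObjectOutputStream.
    static boolean serializes(Object o) {
        try (ObjectOutputStream oos =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {      // NotSerializableException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        Comparator<String> plain = new Comparator<String>() {   // NOT Serializable
            public int compare(String a, String b) { return a.compareTo(b); }
        };
        TreeMap<String, Integer> bad = new TreeMap<>(plain);
        bad.put("k", 1);

        // String.CASE_INSENSITIVE_ORDER implements Serializable
        TreeMap<String, Integer> good = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        good.put("k", 1);

        System.out.println(serializes(bad));   // false: comparator drags the map down
        System.out.println(serializes(good));  // true
    }
}
```

The same logic applies to any collection or object graph that holds a WritableComparator-derived comparator: the comparator must be deserializable for the container to be.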

        Brandon Li added a comment -

        ..., I think this issue is not necessary.

        Agree to resolve it as "won't fix".

        Yanbo Liang added a comment -

        In Hadoop, serialization/deserialization is implemented through the Writable interface. I wonder whether there is any requirement to use the Java serialization system. If all serialization is accomplished via the Writable interface, I think this issue is not necessary.

        Brandon Li added a comment -

        Thanks for finding the issue. The patch should fix the problem.

        Actually, in the Hadoop core components, instead of Java Serialization, the "Writable" interface is widely used for in-memory (or on-disk persistence) serialization, and Protobuf is used for remote communication.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12537747/writable-comparator.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1215//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1215//console

        This message is automatically generated.


          People

          • Assignee: Chris Douglas
          • Reporter: Radim Kolar
          • Votes: 0
          • Watchers: 12
