Hadoop Common
  1. Hadoop Common
  2. HADOOP-6224

Add a method to WritableUtils performing a bounded read of a String

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: io
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      MAPREDUCE-318 needs to sanity check vint-length-encoded Strings read from a byte stream. It would be appropriate to add this as a general utility method.

      1. hadoop-6224.patch
        4 kB
        Jothi Padmanabhan
      2. hadoop-6224.patch
        1 kB
        Jothi Padmanabhan
      3. common.patch
        3 kB
        Jothi Padmanabhan

        Activity

        Hide
        Jothi Padmanabhan added a comment -

        Straight Forward Patch

        Show
        Jothi Padmanabhan added a comment - Straight Forward Patch
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12418118/common.patch
        against trunk revision 808415.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/632/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/632/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/632/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/632/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12418118/common.patch against trunk revision 808415. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/632/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/632/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/632/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/632/console This message is automatically generated.
        Hide
        Jothi Padmanabhan added a comment -

        bq, Please justify why no new tests are needed for this patch.

        This functionality will be comprehensively tested by MAPREDUCE-318

        Show
        Jothi Padmanabhan added a comment - bq, Please justify why no new tests are needed for this patch. This functionality will be comprehensively tested by MAPREDUCE-318
        Hide
        Devaraj Das added a comment -

        I just committed this. Thanks, Jothi and Arun!

        Show
        Devaraj Das added a comment - I just committed this. Thanks, Jothi and Arun!
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #3 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/3/)
        . Adds methods to read strings safely, makes the Buffer class in DataOutputBuffer public, and introduces public constructors there. These changes are required for MAPREDUCE-318. Contributed by Jothi Padmanabhan and Arun Murthy.

        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #3 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/3/ ) . Adds methods to read strings safely, makes the Buffer class in DataOutputBuffer public, and introduces public constructors there. These changes are required for MAPREDUCE-318 . Contributed by Jothi Padmanabhan and Arun Murthy.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk #80 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/80/)
        . Adds methods to read strings safely, makes the Buffer class in DataOutputBuffer public, and introduces public constructors there. These changes are required for MAPREDUCE-318. Contributed by Jothi Padmanabhan and Arun Murthy.

        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk #80 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/80/ ) . Adds methods to read strings safely, makes the Buffer class in DataOutputBuffer public, and introduces public constructors there. These changes are required for MAPREDUCE-318 . Contributed by Jothi Padmanabhan and Arun Murthy.
        Hide
        Jothi Padmanabhan added a comment -

        I ran HDFS and MapRed tests using this jar, they all ran fine except for the known test failures, TestRecoveryManager and TestKillSubProcessesWithLinuxTaskController

        Show
        Jothi Padmanabhan added a comment - I ran HDFS and MapRed tests using this jar, they all ran fine except for the known test failures, TestRecoveryManager and TestKillSubProcessesWithLinuxTaskController
        Hide
        Hong Tang added a comment -

        I am actually a bit surprised by the patch.

        • One minor point: Would it be better readStringSafely move to WritableUtils?
        • One serious question: I am not sure why we need to open up the internal Buffer class of DataOutputBuffer, or need to support the construction based on an external byte[]. For the least, the ctor should clearly identify the ownership of the byte[]. But based on the implementation of DataOutputBuffer, it seems that the ownership has to be transferred to DataOutputBuffer and the caller of this ctor should not retain the reference to byte[].
        Show
        Hong Tang added a comment - I am actually a bit surprised by the patch. One minor point: Would it be better readStringSafely move to WritableUtils? One serious question: I am not sure why we need to open up the internal Buffer class of DataOutputBuffer, or need to support the construction based on an external byte[]. For the least, the ctor should clearly identify the ownership of the byte[]. But based on the implementation of DataOutputBuffer, it seems that the ownership has to be transferred to DataOutputBuffer and the caller of this ctor should not retain the reference to byte[].
        Hide
        Jothi Padmanabhan added a comment -

        Actually, you are right – the ownership of the internal buffer gets muddled with this patch.
        What we actually need in MAPREDUCE-318 was a way to create and handle streams based on memory buffers. I created HADOOP-6226 to handle this and will revert the changes to DataOutputBuffer in that patch.

        Show
        Jothi Padmanabhan added a comment - Actually, you are right – the ownership of the internal buffer gets muddled with this patch. What we actually need in MAPREDUCE-318 was a way to create and handle streams based on memory buffers. I created HADOOP-6226 to handle this and will revert the changes to DataOutputBuffer in that patch.
        Hide
        Devaraj Das added a comment -

        One minor point: Would it be better readStringSafely move to WritableUtils?

        Re-looking at the patch, i think it makes sense. +1

        One serious question: I am not sure why we need to open up the internal Buffer class of DataOutputBuffer, or need to support the construction based on an external byte[]. For the least, the ctor should clearly identify the ownership of the byte[]. But based on the implementation of DataOutputBuffer, it seems that the ownership has to be transferred to DataOutputBuffer and the caller of this ctor should not retain the reference to byte[].

        HADOOP-6226 was opened to address this. I propose that we actually close that jira, and have the implementation in the mapreduce package as part of MAPREDUCE-318

        Show
        Devaraj Das added a comment - One minor point: Would it be better readStringSafely move to WritableUtils? Re-looking at the patch, i think it makes sense. +1 One serious question: I am not sure why we need to open up the internal Buffer class of DataOutputBuffer, or need to support the construction based on an external byte[]. For the least, the ctor should clearly identify the ownership of the byte[]. But based on the implementation of DataOutputBuffer, it seems that the ownership has to be transferred to DataOutputBuffer and the caller of this ctor should not retain the reference to byte[]. HADOOP-6226 was opened to address this. I propose that we actually close that jira, and have the implementation in the mapreduce package as part of MAPREDUCE-318
        Hide
        Jothi Padmanabhan added a comment -

        Patch that moves readStringSafely to WritableUitls and reverts changes to DataOutputBuffer

        Show
        Jothi Padmanabhan added a comment - Patch that moves readStringSafely to WritableUitls and reverts changes to DataOutputBuffer
        Hide
        Konstantin Boudnik added a comment -
        • Using class name qualifier
          + int length = WritableUtils.readVInt(in);
          seems to be unnecessary since the method belongs to WritableUtils now
        • I'd suggest to change the exception argument's message from
          +      throw new IllegalArgumentException("String size was " + length + 
          +                                         ", which is outside of 0.." +
          +                                         maxLength);
          

          to

          +      throw new IllegalArgumentException("String size was " + length + 
          +                                         ", which is out of 0.." +
          +                                         maxLength + " range.");
          
        Show
        Konstantin Boudnik added a comment - Using class name qualifier + int length = WritableUtils.readVInt(in); seems to be unnecessary since the method belongs to WritableUtils now I'd suggest to change the exception argument's message from + throw new IllegalArgumentException("String size was " + length + + ", which is outside of 0.." + + maxLength); to + throw new IllegalArgumentException("String size was " + length + + ", which is out of 0.." + + maxLength + " range.");
        Hide
        Chris Douglas added a comment -

        Patch that moves readStringSafely to WritableUitls and reverts changes to DataOutputBuffer

        We shouldn't add layers of patches for discrete issues. I reverted the original patch. A patch adding the new method to WritableUtils makes sense.

        Show
        Chris Douglas added a comment - Patch that moves readStringSafely to WritableUitls and reverts changes to DataOutputBuffer We shouldn't add layers of patches for discrete issues. I reverted the original patch. A patch adding the new method to WritableUtils makes sense.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #5 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/5/)
        Revert

        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #5 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/5/ ) Revert
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > I'd suggest to change the exception argument's message from ...
        The variable length indeed is the number of bytes, which could be different from the string size, depending on the encoding. We should make it clear in the exception message.

        +   * @param maxLength the largest acceptable length of string
        

        Similarly, maxLength is indeed the largest acceptable length of a encoded string.

        Show
        Tsz Wo Nicholas Sze added a comment - > I'd suggest to change the exception argument's message from ... The variable length indeed is the number of bytes, which could be different from the string size, depending on the encoding. We should make it clear in the exception message. + * @param maxLength the largest acceptable length of string Similarly, maxLength is indeed the largest acceptable length of a encoded string.
        Hide
        Jothi Padmanabhan added a comment -

        Patch incorporating the review comments

        Show
        Jothi Padmanabhan added a comment - Patch incorporating the review comments
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12418332/hadoop-6224.patch
        against trunk revision 810329.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/5/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/5/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/5/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/5/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12418332/hadoop-6224.patch against trunk revision 810329. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/5/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/5/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/5/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/5/console This message is automatically generated.
        Hide
        Chris Douglas added a comment -

        +1

        I committed this. Thanks, Jothi!

        Show
        Chris Douglas added a comment - +1 I committed this. Thanks, Jothi!
        Hide
        Chris Douglas added a comment -

        Updated summary/description

        (committed with the understanding that MAPREDUCE-830 will test this, including negative cases)

        Show
        Chris Douglas added a comment - Updated summary/description (committed with the understanding that MAPREDUCE-830 will test this, including negative cases)
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #7 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/7/)
        . Add a method to WritableUtils supported a bounded read of an
        encoded String. Contributed by Jothi Padmanabhan

        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #7 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/7/ ) . Add a method to WritableUtils supported a bounded read of an encoded String. Contributed by Jothi Padmanabhan
        Hide
        Chris Douglas added a comment -

        Whoops; I meant MAPREDUCE-318 will test this

        Show
        Chris Douglas added a comment - Whoops; I meant MAPREDUCE-318 will test this
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk #82 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/82/)
        . Add a method to WritableUtils supported a bounded read of an
        encoded String. Contributed by Jothi Padmanabhan
        Revert

        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk #82 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/82/ ) . Add a method to WritableUtils supported a bounded read of an encoded String. Contributed by Jothi Padmanabhan Revert
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #11 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/11/)
        Checking in the updated common jar that has the updates for HADOOP-6226 and .

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #11 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/11/ ) Checking in the updated common jar that has the updates for HADOOP-6226 and .
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #70 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/70/)
        Checking in the updated common jar that has the updates for HADOOP-6226 and .

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #70 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/70/ ) Checking in the updated common jar that has the updates for HADOOP-6226 and .
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #12 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/12/)
        Checking in the updated common jar that has the updates for HADOOP-6226 and .

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #12 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/12/ ) Checking in the updated common jar that has the updates for HADOOP-6226 and .
        Hide
        Hudson added a comment -

        Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #8 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/8/)

        Show
        Hudson added a comment - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #8 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/8/ )
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #71 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/71/)
        Checking in the updated common jar that has the updates for HADOOP-6226 and .

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #71 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/71/ ) Checking in the updated common jar that has the updates for HADOOP-6226 and .

          People

          • Assignee:
            Jothi Padmanabhan
            Reporter:
            Jothi Padmanabhan
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development