Hadoop Common
  1. Hadoop Common
  2. HADOOP-6004

BlockLocation deserialization is incorrect

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The readFields method of BlockLocation does not correctly allocate new arrays for topologyPaths and hosts. Patch shortly.

      1. HADOOP-6004.patch
        4 kB
        Jakob Homan
      2. HADOOP-6004.patch
        4 kB
        Jakob Homan

        Activity

        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-trunk #872 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/872/ )
        Hide
        Tsz Wo Nicholas Sze added a comment -

        I have committed this. Thanks, Jakob!

        Show
        Tsz Wo Nicholas Sze added a comment - I have committed this. Thanks, Jakob!
        Hide
        Jakob Homan added a comment -

        Hudson is too slow. Ran unit tests locally. All pass except known-bad.

             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 2 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        Show
        Jakob Homan added a comment - Hudson is too slow. Ran unit tests locally. All pass except known-bad. [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 2 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
        Hide
        Jakob Homan added a comment -

        re-submitting to hudson.

        Show
        Jakob Homan added a comment - re-submitting to hudson.
        Hide
        Jakob Homan added a comment -

        canceling patch to get submit for hudson again.

        Show
        Jakob Homan added a comment - canceling patch to get submit for hudson again.
        Hide
        Jakob Homan added a comment -

        I know. I moved the text section so it would be consistent with the other two sections. My comment was towards a more general fix of the code.

        Show
        Jakob Homan added a comment - I know. I moved the text section so it would be consistent with the other two sections. My comment was towards a more general fix of the code.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > ... A better refactoring would be ...

        My suggestion is not a code refactoring: it reduces the number of object creations since "new Text()" is not needed.

        Show
        Tsz Wo Nicholas Sze added a comment - > ... A better refactoring would be ... My suggestion is not a code refactoring: it reduces the number of object creations since "new Text()" is not needed.
        Hide
        Jakob Homan added a comment -

        Quite right. A better refactoring would be to pull out the whole:

            int numHosts = in.readInt();
            this.hosts = new String[numHosts];
            for (int i = 0; i < numHosts; i++) {
              Text host = new Text();
              host.readFields(in);
              hosts[i] = host.toString();
            }
        

        block into a separate method, since it's repeated three times for each of the sections...

        Show
        Jakob Homan added a comment - Quite right. A better refactoring would be to pull out the whole: int numHosts = in.readInt(); this.hosts = new String[numHosts]; for (int i = 0; i < numHosts; i++) { Text host = new Text(); host.readFields(in); hosts[i] = host.toString(); } block into a separate method, since it's repeated three times for each of the sections...
        Hide
        Tsz Wo Nicholas Sze added a comment -

        +1 patch looks good.

        A nit: It is better to replace the existing codes

        Text t = new Text();
        t.readFields(in);
        variable = t.toString();
        

        by

        variable = Text.readString(in);
        

        for preventing object creation.

        Show
        Tsz Wo Nicholas Sze added a comment - +1 patch looks good. A nit: It is better to replace the existing codes Text t = new Text(); t.readFields(in); variable = t.toString(); by variable = Text.readString(in); for preventing object creation.
        Hide
        Jakob Homan added a comment -

        D'oh. Yes, it should be. Good catch. Uploaded new version.

        The BlockLocation class could be improved as there is an implicit requirement that the length of the hosts, names and topologies all should be the same length, which currently is not verified or enforced. A better implementation would be a list of tuples of

        {host, name, topology}

        with accompanying offset and length.

        Show
        Jakob Homan added a comment - D'oh. Yes, it should be. Good catch. Uploaded new version. The BlockLocation class could be improved as there is an implicit requirement that the length of the hosts, names and topologies all should be the same length, which currently is not verified or enforced. A better implementation would be a list of tuples of {host, name, topology} with accompanying offset and length.
        Hide
        Tsz Wo Nicholas Sze added a comment -
        +    topologyPaths = new String[numHosts];
        

        Should it use numTops?

        Show
        Tsz Wo Nicholas Sze added a comment - + topologyPaths = new String [numHosts]; Should it use numTops?
        Hide
        Jakob Homan added a comment -

        What kind of issues would this problem cause?

        Since the arrays were not resized before trying to assign new values, if you were trying to deserialize a BL with multiple hosts or topologyPaths, you would get an array index out of bounds exception, which is a runtime exception and wouldn't have been caught. Since this happened, there must be no code that's trying to do. The deserialization is still broken and should be fixed though.

        Is only trunk affected?

        I checked back to the 18.3 code and this problem is there. It would probably be a good idea to do a patch for that version as well.

        Show
        Jakob Homan added a comment - What kind of issues would this problem cause? Since the arrays were not resized before trying to assign new values, if you were trying to deserialize a BL with multiple hosts or topologyPaths, you would get an array index out of bounds exception, which is a runtime exception and wouldn't have been caught. Since this happened, there must be no code that's trying to do. The deserialization is still broken and should be fixed though. Is only trunk affected? I checked back to the 18.3 code and this problem is there. It would probably be a good idea to do a patch for that version as well.
        Hide
        Jakob Homan added a comment -

        Failing contrib test is known-bad testhdfsproxy. Patch is good to go.

        Show
        Jakob Homan added a comment - Failing contrib test is known-bad testhdfsproxy. Patch is good to go.
        Hide
        Bryan Duxbury added a comment -

        What kind of issues would this problem cause? Is only trunk affected?

        Show
        Bryan Duxbury added a comment - What kind of issues would this problem cause? Is only trunk affected?
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12410245/HADOOP-6004.patch
        against trunk revision 783672.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 2 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/490/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/490/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/490/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/490/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12410245/HADOOP-6004.patch against trunk revision 783672. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/490/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/490/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/490/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/490/console This message is automatically generated.
        Hide
        Jakob Homan added a comment -

        submitting patch.

        Show
        Jakob Homan added a comment - submitting patch.
        Hide
        Jakob Homan added a comment -
             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 2 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

        Patch fixes bug where new array is not allocated during deserialization and provides new unit test to verify.

        Is there any reason BlockLocation is in fs rather than hdfs package?

        Show
        Jakob Homan added a comment - [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 2 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Patch fixes bug where new array is not allocated during deserialization and provides new unit test to verify. Is there any reason BlockLocation is in fs rather than hdfs package?

          People

          • Assignee:
            Jakob Homan
            Reporter:
            Jakob Homan
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development