Hadoop Common
  1. Hadoop Common
  2. HADOOP-3472

MapFile.Reader getClosest() function returns incorrect results when before is true

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0, 0.16.1, 0.16.2, 0.16.3, 0.16.4, 0.17.0
    • Fix Version/s: 0.17.1
    • Component/s: io
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The MapFile.Reader getClosest() method returns incorrect results due to an error in seekInternal(). The test case in trunk sets the index interval of the test MapFile to 1 which obscures the issue. There are several other errors in the test case as well that assert incorrect behavior for getClosest().

      I've got this fixed and tested, and will attach a patch.

      1. hadoop_mapfile.2.patch
        5 kB
        Todd Lipcon
      2. hadoop_mapfile.patch
        5 kB
        Todd Lipcon

        Activity

        Hide
        Todd Lipcon added a comment -

        Patch for unit test and MapFile.java

        Show
        Todd Lipcon added a comment - Patch for unit test and MapFile.java
        Hide
        Todd Lipcon added a comment -

        Oops - I had tab characters in the previous patch. Here's the same code but with no tabs.

        Show
        Todd Lipcon added a comment - Oops - I had tab characters in the previous patch. Here's the same code but with no tabs.
        Hide
        stack added a comment -

        +1 on patch. I'll commit this in a day or two unless objections.

        First I asserted there was indeed a problem with getClosest looking for the key that comes BEFORE the passed key when the index interval is not 1. In TestMapFile I did the following:

        Index: src/test/org/apache/hadoop/io/TestMapFile.java
        ===================================================================
        --- src/test/org/apache/hadoop/io/TestMapFile.java      (revision 662508)
        +++ src/test/org/apache/hadoop/io/TestMapFile.java      (working copy)
        @@ -39,11 +39,11 @@
             FileSystem fs = FileSystem.getLocal(conf);
             Path qualifiedDirName = fs.makeQualified(dirName);
             // Make an index entry for each insertion.
        -    MapFile.Writer.setIndexInterval(conf, 1);
        +    MapFile.Writer.setIndexInterval(conf, 3);
             MapFile.Writer writer = new MapFile.Writer(conf, fs,
               qualifiedDirName.toString(), Text.class, Text.class);
             // Assert that the index interval is 1
        -    assertEquals(1, writer.getIndexInterval());
        +    assertEquals(3, writer.getIndexInterval());
             // Add entries up to 100 in intervals of ten.
             final int FIRST_KEY = 10;
             for (int i = FIRST_KEY; i < 100; i += 10) {
        @@ -62,7 +62,8 @@
             assertTrue(closest.equals(new Text("60")));
             // Get closest that falls before the passed key: 50
             closest = (Text)reader.getClosest(key, value, true);
        -    assertTrue(closest.equals(new Text("50")));
        +    assertTrue("Closest should be 50 but is " + closest,
        +      closest.equals(new Text("50")));
             // Test get closest when we pass explicit key
             final Text TWENTY = new Text("20");
             closest = (Text)reader.getClosest(TWENTY, value);
        

        Test fails with:

        Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 0.297 sec
        ------------- Standard Output ---------------
        2008-06-02 11:35:30,889 WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(51)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
        ------------- ---------------- ---------------
        
        Testcase: testGetClosest took 0.295 sec
                FAILED
        Closest should be 50 but is 40
        junit.framework.AssertionFailedError: Closest should be 50 but is 40
                at org.apache.hadoop.io.TestMapFile.testGetClosest(TestMapFile.java:65)
        

        I then applied the patch. It makes more extensive changes to the test than the patch pasted above cleaning up TestTrue assertions replacing them with clearer TestEquals and adding assertions that focus on the bug the patch fixes. The test passes.

        Show
        stack added a comment - +1 on patch. I'll commit this in a day or two unless objections. First I asserted there was indeed a problem with getClosest looking for the key that comes BEFORE the passed key when the index interval is not 1. In TestMapFile I did the following: Index: src/test/org/apache/hadoop/io/TestMapFile.java =================================================================== --- src/test/org/apache/hadoop/io/TestMapFile.java (revision 662508) +++ src/test/org/apache/hadoop/io/TestMapFile.java (working copy) @@ -39,11 +39,11 @@ FileSystem fs = FileSystem.getLocal(conf); Path qualifiedDirName = fs.makeQualified(dirName); // Make an index entry for each insertion. - MapFile.Writer.setIndexInterval(conf, 1); + MapFile.Writer.setIndexInterval(conf, 3); MapFile.Writer writer = new MapFile.Writer(conf, fs, qualifiedDirName.toString(), Text.class, Text.class); // Assert that the index interval is 1 - assertEquals(1, writer.getIndexInterval()); + assertEquals(3, writer.getIndexInterval()); // Add entries up to 100 in intervals of ten. final int FIRST_KEY = 10; for ( int i = FIRST_KEY; i < 100; i += 10) { @@ -62,7 +62,8 @@ assertTrue(closest.equals( new Text( "60" ))); // Get closest that falls before the passed key: 50 closest = (Text)reader.getClosest(key, value, true ); - assertTrue(closest.equals( new Text( "50" ))); + assertTrue( "Closest should be 50 but is " + closest, + closest.equals( new Text( "50" ))); // Test get closest when we pass explicit key final Text TWENTY = new Text( "20" ); closest = (Text)reader.getClosest(TWENTY, value); Test fails with: Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 0.297 sec ------------- Standard Output --------------- 2008-06-02 11:35:30,889 WARN util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(51)) - Unable to load native -hadoop library for your platform... using builtin-java classes where applicable ------------- ---------------- --------------- Testcase: testGetClosest took 0.295 sec FAILED Closest should be 50 but is 40 junit.framework.AssertionFailedError: Closest should be 50 but is 40 at org.apache.hadoop.io.TestMapFile.testGetClosest(TestMapFile.java:65) I then applied the patch. It makes more extensive changes to the test than the patch pasted above cleaning up TestTrue assertions replacing them with clearer TestEquals and adding assertions that focus on the bug the patch fixes. The test passes.
        Hide
        stack added a comment -

        I don't see this in the hudson queue.... retrying 'patch available'

        Show
        stack added a comment - I don't see this in the hudson queue.... retrying 'patch available'
        Hide
        stack added a comment -

        Resubmit.

        Also intend backporting this patch to 0.17 branch.

        Show
        stack added a comment - Resubmit. Also intend backporting this patch to 0.17 branch.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12383150/hadoop_mapfile.2.patch
        against trunk revision 662913.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2554/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12383150/hadoop_mapfile.2.patch against trunk revision 662913. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2554/console This message is automatically generated.
        Hide
        stack added a comment -

        Retry. See if patch builds are working again.

        Show
        stack added a comment - Retry. See if patch builds are working again.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12383150/hadoop_mapfile.2.patch
        against trunk revision 662913.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2562/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2562/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2562/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2562/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12383150/hadoop_mapfile.2.patch against trunk revision 662913. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2562/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2562/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2562/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2562/console This message is automatically generated.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12383150/hadoop_mapfile.2.patch
        against trunk revision 662913.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2563/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2563/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2563/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2563/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12383150/hadoop_mapfile.2.patch against trunk revision 662913. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2563/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2563/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2563/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2563/console This message is automatically generated.
        Hide
        stack added a comment -

        Committed to TRUNK and 0.17 branch. Thanks for the patch Todd.

        Show
        stack added a comment - Committed to TRUNK and 0.17 branch. Thanks for the patch Todd.
        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-trunk #523 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/523/ )

          People

          • Assignee:
            stack
            Reporter:
            Todd Lipcon
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development