Hadoop Common / HADOOP-4697

KFS::getBlockLocations() fails with files having multiple blocks

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.18.0
    • Fix Version/s: 0.19.1
    • Component/s: fs
    • Labels:
      None
    • Environment:

      Hadoop on KFS

    • Hadoop Flags:
      Reviewed

      Description

      getBlockLocations() on KFS fails with the following stack trace for large files (those with multiple blocks).

       java.lang.IllegalArgumentException: Offset 67108864 is outside of file (0..67108863)
              at org.apache.hadoop.mapred.FileInputFormat.getBlockIndex(FileInputFormat.java:336)
              at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:248)
              at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
              at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026)
              at org.apache.hadoop.examples.WordCount.run(WordCount.java:149)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
              at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
              at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:141)
              at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:54)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      

      blkStart was not updated properly.
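
      The following is a minimal, self-contained sketch of how a block start that is never advanced could produce the exception above. It is not the attached patch and does not use the Hadoop or KFS APIs: the Block class, describeBlocks(), blockIndex(), and the "advance" flag are hypothetical names introduced only for illustration, and the lookup is a simplified stand-in for the check that raises the error in the trace.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockOffsetSketch {
    /** Hypothetical stand-in for a block's offset and length within a file. */
    static final class Block {
        final long offset;
        final long length;

        Block(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Describes the blocks covering [start, start + len).
     * When advance is false, blkStart is never moved forward, so every
     * block reports the same offset (the assumed faulty behavior).
     */
    static List<Block> describeBlocks(long start, long len, long blockSize, boolean advance) {
        List<Block> blocks = new ArrayList<>();
        long blkStart = start;
        long remaining = len;
        while (remaining > 0) {
            blocks.add(new Block(blkStart, Math.min(remaining, blockSize)));
            remaining -= blockSize;
            if (advance) {
                blkStart += blockSize; // move on to the next block's start
            }
        }
        return blocks;
    }

    /** Simplified lookup: index of the block containing the offset, else an exception. */
    static int blockIndex(List<Block> blocks, long offset) {
        for (int i = 0; i < blocks.size(); i++) {
            Block b = blocks.get(i);
            if (b.offset <= offset && offset < b.offset + b.length) {
                return i;
            }
        }
        Block last = blocks.get(blocks.size() - 1);
        long fileEnd = last.offset + last.length - 1;
        throw new IllegalArgumentException(
            "Offset " + offset + " is outside of file (0.." + fileEnd + ")");
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // 67108864 bytes, the offset seen in the trace
        long fileLen = 2 * blockSize;       // assumed two-block file

        // With blkStart advanced, the second block is found at index 1.
        System.out.println(blockIndex(describeBlocks(0, fileLen, blockSize, true), blockSize));

        // Without advancing, no block covers the offset and the lookup throws.
        try {
            blockIndex(describeBlocks(0, fileLen, blockSize, false), blockSize);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      In this sketch, leaving blkStart fixed makes every block claim to start at the same offset, so a lookup for the start of the second block finds no match, which is one plausible reading of "blkStart was not updated properly".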

      1. patch.4697
        0.8 kB
        Sriram Rao

        Activity

        Sriram Rao added a comment -

        A patch that fixes the issue is attached.

        No new test is included: this bug was found with a Hadoop+KFS deployment; it was tested in that deployment and verified. Verifying this issue requires a Hadoop+KFS deployment and this is done elsewhere.

        Lohit Vijayarenu added a comment -

        +1. Looks good.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12394365/patch.4697
        against trunk revision 719787.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3630/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3630/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3630/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3630/console

        This message is automatically generated.

        Chris Douglas added a comment -

        I just committed this. Thanks, Sriram

        Hudson added a comment -

        Integrated in Hadoop-trunk #670 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/670/).
        Fix getBlockLocations in KosmosFileSystem to handle multiple
        blocks correctly. Contributed by Sriram Rao.


          People

          • Assignee:
            Sriram Rao
            Reporter:
            Lohit Vijayarenu
          • Votes:
            0
            Watchers:
            0
