Hadoop Map/Reduce
  MAPREDUCE-1908

DistributedRaidFileSystem does not handle ChecksumException correctly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: contrib/raid
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      ChecksumException reports the offset of corruption within a block,
      whereas DistributedRaidFileSystem.setAlternateLocations was expecting it
      to report the offset of corruption within the file.

      The best way of dealing with a missing block/corrupt block is to just
      use the current seek offset in the file as the position of corruption.
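
The mismatch described above can be illustrated with a small sketch. This is not the code from the attached patch: the class and method names are hypothetical, and the 64 MB block size and fixed-size blocks are assumptions chosen only for the example. It shows that treating a block-relative offset as a file offset selects the wrong block, while the reader's current seek offset selects the right one.

```java
// Hypothetical illustration only; not taken from DistributedRaidFileSystem.
final class CorruptionOffsetSketch {
    private CorruptionOffsetSketch() {}

    /** Index of the block that contains the given file-relative offset. */
    static long blockIndex(long fileOffset, long blockSize) {
        return fileOffset / blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;       // assumed block size
        long seekOffset = 3 * blockSize + 4096;   // reader position in the file
        long offsetInBlock = 4096;                // block-relative offset of the corruption

        // Misreading the block-relative offset as a file offset points at block 0.
        System.out.println(blockIndex(offsetInBlock, blockSize)); // prints 0
        // The current seek offset points at the block that was actually being read.
        System.out.println(blockIndex(seekOffset, blockSize));    // prints 3
    }
}
```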

      1. MAPREDUCE-1908.2.patch
        6 kB
        Ramkumar Vadali
      2. MAPREDUCE-1908.patch
        5 kB
        Ramkumar Vadali

        Activity

        Ramkumar Vadali added a comment -

        This patch makes DistributedRaidFileSystem find the location of missing/corrupt data correctly.

        Ramkumar Vadali added a comment -

        ant test result:

        All but org.apache.hadoop.streaming.TestUlimit passed. TestUlimit failure is unrelated to this.

        ant test-patch result:

        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 2 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
        [exec]
        [exec] +1 system tests framework. The patch passed system tests framework compile.
        [exec]
        [exec]
        [exec]
        [exec]
        [exec] ======================================================================
        [exec] ======================================================================
        [exec] Finished build.
        [exec] ======================================================================
        [exec] ======================================================================

        Scott Chen added a comment -

        Ram: Can you add one more unit test that corrupts more than one block in the stripe?

        Ramkumar Vadali added a comment -

        Modified the test to corrupt two blocks in the same stripe and ensure that the read fails.
        The test uncovered an additional issue: caching needs to be disabled to force the use of DFS.
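
Why two corrupt blocks in one stripe are expected to fail can be sketched as follows. This assumes a single XOR parity block per stripe, which is an assumption for illustration and is not stated in this issue; the class and method names are hypothetical and the code is not taken from the patch or its tests.

```java
import java.util.Arrays;

// Hypothetical illustration only; assumes one XOR parity block per stripe.
final class XorStripeSketch {
    private XorStripeSketch() {}

    /** Byte-wise XOR of equally sized blocks. */
    static byte[] xor(byte[]... blocks) {
        byte[] out = new byte[blocks[0].length];
        for (byte[] block : blocks) {
            for (int i = 0; i < out.length; i++) {
                out[i] ^= block[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] b0 = {1, 2, 3};
        byte[] b1 = {4, 5, 6};
        byte[] b2 = {7, 8, 9};
        byte[] parity = xor(b0, b1, b2);

        // One block lost (b1): XOR of the survivors and the parity rebuilds it.
        byte[] rebuilt = xor(b0, b2, parity);
        System.out.println(Arrays.equals(rebuilt, b1)); // prints true

        // Two blocks lost (b1 and b2): the survivors only yield b1 XOR b2,
        // which is not enough to recover either block on its own.
        byte[] mixed = xor(b0, parity);
        System.out.println(Arrays.equals(mixed, xor(b1, b2))); // prints true
    }
}
```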

        Scott Chen added a comment -

        +1 Looks good to me. Let's wait for your test results.

        Ramkumar Vadali added a comment -

        Test results under src/contrib/raid:

        test-junit:
        [junit] WARNING: multiple versions of ant detected in path for junit
        [junit] jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
        [junit] and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
        [junit] Running org.apache.hadoop.hdfs.TestRaidDfs
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 14.223 sec
        [junit] Running org.apache.hadoop.raid.TestRaidHar
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 91.264 sec
        [junit] Running org.apache.hadoop.raid.TestRaidNode
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 62.755 sec
        [junit] Running org.apache.hadoop.raid.TestRaidPurge
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 23.959 sec

        test:

        BUILD SUCCESSFUL
        Total time: 3 minutes 25 seconds

        result of ant test-patch:

        [exec]
        [exec]
        [exec]
        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 2 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
        [exec]
        [exec] +1 system tests framework. The patch passed system tests framework compile.
        [exec]
        [exec]
        [exec]
        [exec]
        [exec] ======================================================================
        [exec] ======================================================================
        [exec] Finished build.
        [exec] ======================================================================
        [exec] ======================================================================
        [exec]
        [exec]

        BUILD SUCCESSFUL
        Total time: 18 minutes 27 seconds

        Scott Chen added a comment -

        Looks good.
        Let's wait for a few days to see if we get further comments. Then I will commit it.

        Scott Chen added a comment -

        I just committed this. Thanks, Ram.


          People

          • Assignee:
            Ramkumar Vadali
            Reporter:
            Ramkumar Vadali
          • Votes:
            0
            Watchers:
            4
