Hadoop Map/Reduce: MAPREDUCE-2150

RaidNode should periodically fix corrupt blocks

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: contrib/raid
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      (Recreating HDFS-1171 since RAID is in mapreduce)

      RaidNode currently does not fix missing blocks; they have to be fixed manually using RaidShell.
      This task proposes that recovery be more automated:
      1. RaidNode periodically fetches a list of corrupt files from the NameNode.
      2. If a corrupt file can be fixed using RAID, RaidNode regenerates the missing block.
      3. RaidNode chooses a datanode and sends the block contents, along with the checksum, to that datanode.
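
      The three steps above can be sketched as a single pass of a fixer loop. This is only an illustrative outline, not the code in the attached patch: the class BlockFixerSketch, its three interfaces, and the method fixOnce are all hypothetical names, and CRC32 merely stands in for whatever checksum the datanode actually expects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.zip.CRC32;

/** Hypothetical outline of one recovery pass; none of these names come from the patch. */
final class BlockFixerSketch {

    /** Step 1 stand-in: reports which files are currently corrupt. */
    interface CorruptFileSource {
        List<String> listCorruptFiles();
    }

    /** Step 2 stand-in: rebuilds a block via RAID, or returns empty if the file cannot be fixed. */
    interface BlockReconstructor {
        Optional<byte[]> reconstruct(String path);
    }

    /** Step 3 stand-in: picks a datanode and ships the block contents plus checksum to it. */
    interface BlockSender {
        void send(String path, byte[] block, long checksum);
    }

    /** Runs one pass over the corrupt files and returns the paths that were fixed. */
    static List<String> fixOnce(CorruptFileSource source,
                                BlockReconstructor raid,
                                BlockSender sender) {
        List<String> fixed = new ArrayList<>();
        for (String path : source.listCorruptFiles()) {
            Optional<byte[]> block = raid.reconstruct(path);
            if (!block.isPresent()) {
                continue; // not recoverable through RAID; leave it for manual handling
            }
            // CRC32 is an assumption here; the real checksum type is not stated in the issue.
            CRC32 crc = new CRC32();
            crc.update(block.get());
            sender.send(path, block.get(), crc.getValue());
            fixed.add(path);
        }
        return fixed;
    }
}
```

      A periodic caller (for example a background thread that sleeps between passes) would invoke fixOnce repeatedly; files that cannot be reconstructed are skipped rather than failing the whole pass.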

      1. MAPREDUCE-2150.patch
        31 kB
        Ramkumar Vadali
      2. MAPREDUCE-2150.2.patch
        31 kB
        Ramkumar Vadali

          Activity

          Scott Chen added a comment -

          I just committed this. Thanks Ram!

          Scott Chen added a comment -

          +1 The code looks good to me.

          Ramkumar Vadali added a comment -

          Scott, this patch passes tests. Could you please review it?

          Ramkumar Vadali added a comment -

          Test results:

          ant test-patch:

               [exec] 
               [exec] BUILD SUCCESSFUL
               [exec] Total time: 1 minute 48 seconds
               [exec] 
               [exec] 
               [exec] 
               [exec] 
               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 2 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
               [exec] 
               [exec]     +1 system tests framework.  The patch passed system tests framework compile.
               [exec] 
               [exec] 
               [exec] 
               [exec] 
               [exec] ======================================================================
               [exec] ======================================================================
               [exec]     Finished build.
               [exec] ======================================================================
               [exec] ======================================================================
               [exec] 
               [exec] 
          
          

          ant test under src/contrib/raid:

          
          test-junit:
              [junit] WARNING: multiple versions of ant detected in path for junit 
              [junit]          jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
              [junit]      and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
              [junit] Running org.apache.hadoop.hdfs.TestRaidDfs
              [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 26.641 sec
              [junit] Running org.apache.hadoop.raid.TestBlockFixer
              [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 160.923 sec
              [junit] Running org.apache.hadoop.raid.TestDirectoryTraversal
              [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 9.201 sec
              [junit] Running org.apache.hadoop.raid.TestHarIndexParser
              [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.052 sec
              [junit] Running org.apache.hadoop.raid.TestRaidHar
              [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 70.449 sec
              [junit] Running org.apache.hadoop.raid.TestRaidNode
              [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 410.562 sec
              [junit] Running org.apache.hadoop.raid.TestRaidPurge
              [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 44.504 sec
              [junit] Running org.apache.hadoop.raid.TestRaidShell
              [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 23.106 sec
          
          test:
          
          BUILD SUCCESSFUL
          
          
          Ramkumar Vadali added a comment -

          Fixing a test failure

          Ramkumar Vadali added a comment -

          This patch uses DFSck to get a list of corrupt files periodically and fixes source/parity/parity-har blocks automatically.


            People

            • Assignee:
              Ramkumar Vadali
              Reporter:
              Ramkumar Vadali
            • Votes:
              0
              Watchers:
              4
