Hadoop Map/Reduce
MAPREDUCE-2132

Need a command line option in RaidShell to fix blocks using raid

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: contrib/raid
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      RaidShell currently has an option to recover a file and return the path to the recovered file. The administrator can then rename the recovered file to the damaged file.

      The problem with this is that the file metadata is altered, specifically the modification time. Instead we need a way to just repair the damaged blocks and send the fixed blocks to a data node.

      Once this is done, we can put automation around it.
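
      To illustrate what repairing a single damaged block (rather than regenerating the whole file) can look like, here is a minimal sketch. It assumes one XOR parity block per stripe, which is only one possible RAID encoding; the class and method names are hypothetical and this is not the code from the attached patch.

```java
import java.util.Arrays;

/** Illustrative only: rebuilds one missing block in a stripe protected by XOR parity. */
public class XorBlockRepairSketch {

    /** Computes the parity block as the byte-wise XOR of all source blocks in the stripe. */
    public static byte[] computeParity(byte[][] stripe) {
        byte[] parity = new byte[stripe[0].length];
        for (byte[] block : stripe) {
            for (int j = 0; j < parity.length; j++) {
                parity[j] ^= block[j];
            }
        }
        return parity;
    }

    /**
     * Rebuilds the block at missingIndex by XOR-ing the parity block with every
     * surviving source block. The entry at missingIndex is never read.
     */
    public static byte[] rebuildBlock(byte[][] stripe, byte[] parity, int missingIndex) {
        byte[] rebuilt = Arrays.copyOf(parity, parity.length);
        for (int i = 0; i < stripe.length; i++) {
            if (i == missingIndex) {
                continue; // skip the damaged block
            }
            for (int j = 0; j < rebuilt.length; j++) {
                rebuilt[j] ^= stripe[i][j];
            }
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        byte[][] stripe = { {1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12} };
        byte[] parity = computeParity(stripe);

        byte[] lost = stripe[1];
        stripe[1] = null; // simulate a damaged block

        byte[] rebuilt = rebuildBlock(stripe, parity, 1);
        System.out.println(Arrays.equals(lost, rebuilt)); // prints "true"
    }
}
```

      Because only the block contents are regenerated, the file itself is never recreated or renamed, so file-level metadata such as the modification time does not need to change.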

      1. MAPREDUCE-2132.patch
        72 kB
        Ramkumar Vadali
      2. MAPREDUCE-2132.2.patch
        73 kB
        Ramkumar Vadali
      3. MAPREDUCE-2132.3.patch
        73 kB
        Ramkumar Vadali

        Activity

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #523 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/523/)

        Scott Chen added a comment -

        Note that we may need to refactor this patch once HDFS-1461 is resolved.

        Scott Chen added a comment -

        I just committed the patch. Thanks, Ram.

        Scott Chen added a comment -

        Thanks for the change. I will commit this later today.

        Ramkumar Vadali added a comment -

        Reverted the unintentional change to CHANGES.txt.

        Scott Chen added a comment -

        Hey Ram, can you remove the following change in CHANGES.txt?

        -    MAPREDUCE-2140. Regenerate fair scheduler design doc PDF. (matei)
        -
         
        Scott Chen added a comment -

        +1 The patch looks good.

        Ramkumar Vadali added a comment -

        The latest patch passes testing:

        [exec]
        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 7 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
        [exec]
        [exec] +1 system tests framework. The patch passed system tests framework compile.
        [exec]
        [exec]
        [exec]
        [exec]
        [exec] ======================================================================
        [exec] ======================================================================
        [exec] Finished build.
        [exec] ======================================================================
        [exec] ======================================================================
        [exec]

        test-junit:
        [junit] WARNING: multiple versions of ant detected in path for junit
        [junit] jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
        [junit] and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
        [junit] Running org.apache.hadoop.hdfs.TestRaidDfs
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 31.133 sec
        [junit] Running org.apache.hadoop.raid.TestDirectoryTraversal
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 8.116 sec
        [junit] Running org.apache.hadoop.raid.TestHarIndexParser
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.052 sec
        [junit] Running org.apache.hadoop.raid.TestRaidHar
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 68.637 sec
        [junit] Running org.apache.hadoop.raid.TestRaidNode
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 523.123 sec
        [junit] Running org.apache.hadoop.raid.TestRaidPurge
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 43.853 sec
        [junit] Running org.apache.hadoop.raid.TestRaidShell
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 24.41 sec

        test:

        BUILD SUCCESSFUL
        Total time: 11 minutes 55 seconds

        Ramkumar Vadali added a comment -

        Incorporated Scott's comments.
        Scott, using endsWith() does not work because the corrupt file is under the RAID HAR directory, so the path will end with part-xxxx.
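
        The difference between the two checks can be sketched as follows. This is a standalone illustration, not the patch code: the suffix value "_raid.har" is an assumed stand-in for RaidNode.HAR_SUFFIX, and the example path and class name are hypothetical.

```java
/** Illustrative only: why a suffix test misses files stored inside a RAID HAR directory. */
public class HarPathCheckSketch {

    // Assumed value, standing in for RaidNode.HAR_SUFFIX.
    static final String HAR_SUFFIX = "_raid.har";

    /** True when the path refers to the HAR directory or to anything beneath it. */
    public static boolean isInsideHar(String pathStr) {
        return pathStr.contains(HAR_SUFFIX);
    }

    /** True only when the path is the HAR directory itself. */
    public static boolean endsWithHarSuffix(String pathStr) {
        return pathStr.endsWith(HAR_SUFFIX);
    }

    public static void main(String[] args) {
        // Hypothetical path of a corrupt part file under a RAID HAR directory.
        String partFile = "/raid/user/data/archive_raid.har/part-0";

        System.out.println(isInsideHar(partFile));       // prints "true"
        System.out.println(endsWithHarSuffix(partFile)); // prints "false"
    }
}
```

        A contains() check is broader than endsWith(), so it could also match an unrelated path that merely has the suffix somewhere in the middle; that trade-off is what the review question was probing.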

        Scott Chen added a comment -

        Thanks for the good work, Ram. This is really nice.
        I only have some minor comments.

        +    if (pathStr.contains(RaidNode.HAR_SUFFIX)) {
        

        Can we use endsWith() to make it more specific?

        +    Path indexFile = new Path(harDirectory + "/_index");
        

        Can we use some constant like HAR_INDEX_FILENAME here?

        It seems the current test fixes only the source block.
        Could you add a test case that fixes the parity file?

        Ramkumar Vadali added a comment -

        Test results:

        ant test:

        
        test-junit:
            [junit] WARNING: multiple versions of ant detected in path for junit
            [junit]          jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
            [junit]      and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
            [junit] Running org.apache.hadoop.hdfs.TestRaidDfs
            [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 32.409 sec
            [junit] Running org.apache.hadoop.raid.TestDirectoryTraversal
            [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 8.053 sec
            [junit] Running org.apache.hadoop.raid.TestHarIndexParser
            [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.051 sec
            [junit] Running org.apache.hadoop.raid.TestRaidHar
            [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 68.709 sec
            [junit] Running org.apache.hadoop.raid.TestRaidNode
            [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 463.784 sec
            [junit] Running org.apache.hadoop.raid.TestRaidPurge
            [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 44.057 sec
            [junit] Running org.apache.hadoop.raid.TestRaidShell
            [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 22.813 sec
        
        test:
        
        BUILD SUCCESSFUL
        Total time: 10 minutes 58 seconds
        

        ant test-patch

             [exec] 
             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 7 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
             [exec] 
             [exec]     +1 system tests framework.  The patch passed system tests framework compile.
             [exec] 
             [exec] 
             [exec] 
             [exec] 
             [exec] ======================================================================
             [exec] ======================================================================
             [exec]     Finished build.
             [exec] ======================================================================
             [exec] ======================================================================
             [exec] 
             [exec] 
        
        BUILD SUCCESSFUL
        Total time: 20 minutes 3 seconds
        

          People

          • Assignee: Ramkumar Vadali
          • Reporter: Ramkumar Vadali
          • Votes: 0
          • Watchers: 3
