Hadoop Map/Reduce / MAPREDUCE-1892

RaidNode can allow layered policies more efficiently

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: contrib/raid
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The RaidNode policy file can have layered policies that can cover a file more than once. To avoid processing a file multiple times (for RAIDing), RaidNode maintains a list of processed files that is used to avoid duplicate processing attempts.

      This is problematic because a large number of processed files could cause the RaidNode to run out of memory.

      This task proposes a better method of detecting processed files. The method is based on the observation that a more selective policy will have a better match with a file name than a less selective one. Specifically, the more selective policy will have a longer common prefix with the file name.

      So to detect if a file has already been processed, the RaidNode only needs to maintain a list of processed policies and compare the lengths of the common prefixes. If the file has a longer common prefix with one of the processed policies than with the current policy, it can be assumed to be processed already.
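
      The check described above can be sketched as follows. This is only an illustration of the idea, not the code in the attached patch: the class and method names are hypothetical, policies are represented by plain path strings, and the common prefix is measured in whole path components (an assumption; the description only says "common prefix"). It also assumes that more selective policies are processed before less selective ones.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; these names do not come from the actual patch. */
class LayeredPolicyCheck {

  /** Number of leading path components shared by two '/'-separated paths. */
  static int commonPrefixLength(String a, String b) {
    String[] x = a.split("/");
    String[] y = b.split("/");
    int n = Math.min(x.length, y.length);
    int i = 0;
    while (i < n && x[i].equals(y[i])) {
      i++;
    }
    return i;
  }

  /**
   * Returns true if some already-processed policy shares a strictly longer
   * prefix with the file than the current policy does, in which case the
   * file is assumed to have been handled by that policy.
   */
  static boolean coveredByProcessedPolicy(
      String file, String currentPolicyPath, List<String> processedPolicyPaths) {
    int current = commonPrefixLength(file, currentPolicyPath);
    for (String processed : processedPolicyPaths) {
      if (commonPrefixLength(file, processed) > current) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Assumed setup: the more selective policy was processed first.
    List<String> processed = new ArrayList<>();
    processed.add("/user/data/logs");

    // The current policy is the less selective "/user/data".
    System.out.println(coveredByProcessedPolicy(
        "/user/data/logs/part-0", "/user/data", processed));
    System.out.println(coveredByProcessedPolicy(
        "/user/data/tables/part-0", "/user/data", processed));
  }
}
```

      Only the policy paths are kept in memory, so the state grows with the number of policies rather than the number of files. Comparing whole path components instead of raw characters is a design choice of this sketch: a character-wise comparison would count the shared "/user/data/" against a sibling directory such as /user/data/tables and wrongly treat its files as covered by the /user/data/logs policy.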

      1. MAPREDUCE-1892.patch
        32 kB
        Ramkumar Vadali
      2. MAPREDUCE-1892.patch
        32 kB
        Ramkumar Vadali

        Activity

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #527 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/527/)
        MAPREDUCE-1892. RaidNode can allow layered policies more efficiently.
        (Ramkumar Vadali via schen)

        Scott Chen added a comment -

        I just committed this. Thanks Ram.

        Ramkumar Vadali added a comment -

        Removed unused field.

        This patch passes ant test and ant test-patch.

        Scott Chen added a comment -
        +    List<PolicyInfo> allPolicies = null;
        

        We can remove this field because it is not used.

        +1 Looks good to me.

        Ramkumar Vadali added a comment -

        This patch implements the proposal described.

        TEST RESULTS:

        ant test-patch:

             [exec] +1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec]
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
             [exec]
             [exec]     +1 system tests framework.  The patch passed system tests framework compile.
             [exec]
             [exec]
             [exec]
             [exec]
             [exec] ======================================================================
             [exec] ======================================================================
             [exec]     Finished build.
             [exec] ======================================================================
             [exec] ======================================================================
             [exec]
             [exec]
        
        BUILD SUCCESSFUL
        Total time: 16 minutes 14 seconds
        

        ant test under src/contrib/raid:

        test-junit:
            [junit] WARNING: multiple versions of ant detected in path for junit
            [junit]          jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
            [junit]      and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
            [junit] Running org.apache.hadoop.hdfs.TestRaidDfs
            [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 42.868 sec
            [junit] Running org.apache.hadoop.raid.TestBlockFixer
            [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 135.269 sec
            [junit] Running org.apache.hadoop.raid.TestDirectoryTraversal
            [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 8.923 sec
            [junit] Running org.apache.hadoop.raid.TestErasureCodes
            [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 24.949 sec
            [junit] Running org.apache.hadoop.raid.TestGaloisField
            [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.397 sec
            [junit] Running org.apache.hadoop.raid.TestHarIndexParser
            [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.052 sec
            [junit] Running org.apache.hadoop.raid.TestRaidFilter
            [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.476 sec
            [junit] Running org.apache.hadoop.raid.TestRaidHar
            [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 69.123 sec
            [junit] Running org.apache.hadoop.raid.TestRaidNode
            [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 466.8 sec
            [junit] Running org.apache.hadoop.raid.TestRaidPurge
            [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 108.928 sec
            [junit] Running org.apache.hadoop.raid.TestRaidShell
            [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 25.628 sec
        
        test:
        
        BUILD SUCCESSFUL
        Total time: 15 minutes 6 seconds
        
        

          People

          • Assignee:
            Ramkumar Vadali
            Reporter:
            Ramkumar Vadali
          • Votes:
            0
            Watchers:
            4
