  1. Hadoop YARN
  2. YARN-4095

Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: nodemanager
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently, ShuffleHandler and LocalDirsHandlerService share the same AllocatorPerContext object in LocalDirAllocator for the configuration NM_LOCAL_DIRS, because AllocatorPerContext objects are stored in a static TreeMap keyed by configuration name:

        private static Map <String, AllocatorPerContext> contexts = 
                       new TreeMap<String, AllocatorPerContext>();
      

      LocalDirsHandlerService and ShuffleHandler both create a LocalDirAllocator using NM_LOCAL_DIRS. Even though they don't use the same Configuration object, they use the same AllocatorPerContext object. LocalDirsHandlerService may also change the NM_LOCAL_DIRS value in its Configuration object to exclude full and bad local dirs, whereas ShuffleHandler always uses the original NM_LOCAL_DIRS value in its Configuration object. So whenever the two values differ, each call to AllocatorPerContext#confChanged by ShuffleHandler after a call by LocalDirsHandlerService forces AllocatorPerContext to be reinitialized, because the NM_LOCAL_DIRS value it sees has changed. This causes unnecessary overhead. The check in confChanged is:

            String newLocalDirs = conf.get(contextCfgItemName);
            if (!newLocalDirs.equals(savedLocalDirs)) {
      

      So it would be a good improvement not to share the same AllocatorPerContext instance between ShuffleHandler and LocalDirsHandlerService.
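The sharing described above can be reproduced with a small stand-alone sketch. It is not Hadoop code: `PerContext`, `obtainContext`, and `simulate` are hypothetical names, plain `Map<String, String>` objects stand in for `Configuration`, and the directory values are made up. It only mirrors the two ingredients from the description, a static map keyed by configuration name and the `equals` check on the saved value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SharedContextSketch {
  /** Simplified, hypothetical stand-in for AllocatorPerContext. */
  static class PerContext {
    private final String cfgItemName;
    private String savedLocalDirs = "";
    int reinitCount = 0;

    PerContext(String cfgItemName) {
      this.cfgItemName = cfgItemName;
    }

    /** Mirrors the quoted check: re-initialize when the value differs from the saved one. */
    synchronized void confChanged(Map<String, String> conf) {
      String newLocalDirs = conf.get(cfgItemName);
      if (!newLocalDirs.equals(savedLocalDirs)) {
        reinitCount++; // the real class would rebuild its directory state here
        savedLocalDirs = newLocalDirs;
      }
    }
  }

  // One context per configuration name, shared by every caller in the process.
  private static final Map<String, PerContext> contexts = new TreeMap<>();

  static synchronized PerContext obtainContext(String cfgItemName) {
    return contexts.computeIfAbsent(cfgItemName, PerContext::new);
  }

  /** Alternates calls from two callers and returns how often the shared context was rebuilt. */
  static int simulate(String key, int rounds) {
    Map<String, String> shuffleConf = new HashMap<>();
    shuffleConf.put(key, "/d1,/d2,/d3"); // original value, never changed

    Map<String, String> dirsHandlerConf = new HashMap<>();
    dirsHandlerConf.put(key, "/d1,/d2"); // /d3 excluded as full or bad

    PerContext forShuffle = obtainContext(key);
    PerContext forDirsHandler = obtainContext(key); // same object as forShuffle
    for (int i = 0; i < rounds; i++) {
      forDirsHandler.confChanged(dirsHandlerConf);
      forShuffle.confChanged(shuffleConf);
    }
    return forShuffle.reinitCount;
  }

  public static void main(String[] args) {
    System.out.println(simulate("yarn.nodemanager.local-dirs", 5));
  }
}
```

In this sketch every alternating call flips the saved value, so five rounds produce ten re-initializations. That is the overhead the issue describes, under the assumption that the two callers interleave.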

      1. YARN-4095.000.patch
        7 kB
        zhihai xu
      2. YARN-4095.001.patch
        5 kB
        zhihai xu

          Activity

          zxu zhihai xu added a comment -

          I attached a patch YARN-4095.000.patch, which used a new configuration NM_GOOD_LOCAL_DIRS to create LocalDirAllocator in LocalDirsHandlerService to store the good local dirs. So we can avoid using the same configuration name to create LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. I also created a new configuration NM_GOOD_LOG_DIRS to match NM_GOOD_LOCAL_DIRS.
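The effect of using a separate configuration name can be sketched as follows. This is an illustration under assumptions, not the patch itself: `PerContext`, `obtainContext`, and `simulate` are hypothetical, plain maps stand in for `Configuration`, and the key string given for NM_GOOD_LOCAL_DIRS is assumed, since the comment only names the constant.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SeparateKeySketch {
  static final String NM_LOCAL_DIRS = "yarn.nodemanager.local-dirs";
  // Assumed key string for illustration; the thread only names the constant.
  static final String NM_GOOD_LOCAL_DIRS = "yarn.nodemanager.good-local-dirs";

  /** Simplified, hypothetical stand-in for AllocatorPerContext. */
  static class PerContext {
    private final String cfgItemName;
    private String savedLocalDirs = "";
    int reinitCount = 0;

    PerContext(String cfgItemName) {
      this.cfgItemName = cfgItemName;
    }

    synchronized void confChanged(Map<String, String> conf) {
      String newLocalDirs = conf.get(cfgItemName);
      if (!newLocalDirs.equals(savedLocalDirs)) {
        reinitCount++;
        savedLocalDirs = newLocalDirs;
      }
    }
  }

  // Still one context per configuration name, but the two callers now use different names.
  private static final Map<String, PerContext> contexts = new TreeMap<>();

  static synchronized PerContext obtainContext(String cfgItemName) {
    return contexts.computeIfAbsent(cfgItemName, PerContext::new);
  }

  /** Returns {re-initializations seen by the shuffle side, by the dirs-handler side}. */
  static int[] simulate(int rounds) {
    Map<String, String> shuffleConf = new HashMap<>();
    shuffleConf.put(NM_LOCAL_DIRS, "/d1,/d2,/d3");

    Map<String, String> dirsHandlerConf = new HashMap<>();
    dirsHandlerConf.put(NM_LOCAL_DIRS, "/d1,/d2,/d3");
    dirsHandlerConf.put(NM_GOOD_LOCAL_DIRS, "/d1,/d2"); // good dirs kept under their own name

    PerContext shuffleCtx = obtainContext(NM_LOCAL_DIRS);
    PerContext dirsHandlerCtx = obtainContext(NM_GOOD_LOCAL_DIRS);
    for (int i = 0; i < rounds; i++) {
      dirsHandlerCtx.confChanged(dirsHandlerConf);
      shuffleCtx.confChanged(shuffleConf);
    }
    return new int[] {shuffleCtx.reinitCount, dirsHandlerCtx.reinitCount};
  }

  public static void main(String[] args) {
    int[] counts = simulate(5);
    System.out.println(counts[0] + " " + counts[1]);
  }
}
```

Because each caller now looks up its own context, each context is initialized once and then stays stable until its own value really changes.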

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 33s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 51s There were no new javac warning messages.
          +1 javadoc 10m 17s There were no new javadoc warning messages.
          +1 release audit 0m 21s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 24s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 31s mvn install still works.
          +1 eclipse:eclipse 0m 35s The patch built with eclipse:eclipse.
          +1 findbugs 2m 53s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1 yarn tests 0m 22s Tests failed in hadoop-yarn-api.
          -1 yarn tests 7m 34s Tests failed in hadoop-yarn-server-nodemanager.
              50m 35s  



          Reason Tests
          Failed unit tests hadoop.yarn.conf.TestYarnConfigurationFields
            hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12753220/YARN-4095.000.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / cf83156
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8948/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8948/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8948/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8948/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8948/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 19m 20s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 46s There were no new javac warning messages.
          +1 javadoc 9m 50s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 46s The applied patch generated 1 new checkstyle issues (total was 211, now 211).
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 29s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 4m 21s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 0m 23s Tests passed in hadoop-yarn-api.
          +1 yarn tests 1m 58s Tests passed in hadoop-yarn-common.
          -1 yarn tests 7m 29s Tests failed in hadoop-yarn-server-nodemanager.
              56m 7s  



          Reason Tests
          Failed unit tests hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12753223/YARN-4095.000.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / cf83156
          checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/8949/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-YARN-Build/8949/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          hadoop-yarn-common test log https://builds.apache.org/job/PreCommit-YARN-Build/8949/artifact/patchprocess/testrun_hadoop-yarn-common.txt
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/8949/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/8949/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/8949/console

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 23s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 51s There were no new javac warning messages.
          +1 javadoc 10m 2s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 36s There were no new checkstyle issues.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 28s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 12s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 yarn tests 7m 49s Tests passed in hadoop-yarn-server-nodemanager.
              46m 21s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12761363/YARN-4095.001.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 3a9c707
          hadoop-yarn-server-nodemanager test log https://builds.apache.org/job/PreCommit-YARN-Build/9226/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
          Test Results https://builds.apache.org/job/PreCommit-YARN-Build/9226/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-YARN-Build/9226/console

          This message was automatically generated.

          zxu zhihai xu added a comment -

          Hi [~Jason Lowe], Could you help review the patch? thanks

          zxu zhihai xu added a comment -

          Hi Jason Lowe, Could you help review the patch? thanks

          zxu zhihai xu added a comment -

          The first patch put NM_GOOD_LOCAL_DIRS and NM_GOOD_LOG_DIRS in YarnConfiguration.java; the second patch moved them to LocalDirsHandlerService.java, since they are only used inside LocalDirsHandlerService.

          jlowe Jason Lowe added a comment -

          +1 lgtm. Committing this.

          jlowe Jason Lowe added a comment -

          Thanks zhihai xu! I committed this to trunk and branch-2.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8503 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8503/)
          YARN-4095. Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. Contributed by Zhihai Xu (jlowe: rev c890c51a916894a985439497b8a44e8eee82d762)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLocalDirsHandlerService.java
          • hadoop-yarn-project/CHANGES.txt
          zxu zhihai xu added a comment -

          Thanks Jason Lowe for reviewing and committing the patch!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #427 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/427/)
          YARN-4095. Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. Contributed by Zhihai Xu (jlowe: rev c890c51a916894a985439497b8a44e8eee82d762)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLocalDirsHandlerService.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk #1167 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1167/)
          YARN-4095. Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. Contributed by Zhihai Xu (jlowe: rev c890c51a916894a985439497b8a44e8eee82d762)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLocalDirsHandlerService.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #435 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/435/)
          YARN-4095. Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. Contributed by Zhihai Xu (jlowe: rev c890c51a916894a985439497b8a44e8eee82d762)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLocalDirsHandlerService.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2346 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2346/)
          YARN-4095. Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. Contributed by Zhihai Xu (jlowe: rev c890c51a916894a985439497b8a44e8eee82d762)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2373 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2373/)
          YARN-4095. Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. Contributed by Zhihai Xu (jlowe: rev c890c51a916894a985439497b8a44e8eee82d762)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLocalDirsHandlerService.java
          • hadoop-yarn-project/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #408 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/408/)
          YARN-4095. Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService. Contributed by Zhihai Xu (jlowe: rev c890c51a916894a985439497b8a44e8eee82d762)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLocalDirsHandlerService.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
          Feng Yuan Feng Yuan added a comment - - edited

          zhihai xu, thanks for your patch for this issue.
          Excuse me, I am not quite clear on the goal this patch achieves. Is it meant to avoid a heap memory leak like the one in YARN-6277?
          I ask because in:

                String newLocalDirs = conf.get(contextCfgItemName);
                if (!newLocalDirs.equals(savedLocalDirs)) {
          

          it creates a large number of LocalFileSystem objects and caches them.
          If your purpose is to fix this heap memory leak, I think I will understand this issue completely.
          I also have an idea, given that the issue is caused by the configuration being different in two places.
          I notice that ShuffleHandler uses another conf object created by the clone(conf) method. How about letting "SH" use the same conf?
          This would have several benefits:
          1. The ShuffleHandler service would learn promptly which disks are over-used (>95%) and would not write data to them, which avoids map output work landing on an overloaded disk and failing with the error "no space left...".
          2. We could rethink the implementation model in your patch; IMHO it is not very graceful to just add a new name for local-dir.
          Thx.

          zxu zhihai xu added a comment -

          Feng Yuan, I think ShuffleHandler should always be able to access all local directories, including full ones, since output data from mappers may be on the full local directories. Otherwise shuffle may fail because the data or index file can't be found in the good local directories.

          // ShuffleHandler: locate the index and data files of a map attempt
          Path indexFileName = lDirAlloc.getLocalPathToRead(
              attemptBase + "/" + INDEX_FILE_NAME, conf);
          Path mapOutputFileName = lDirAlloc.getLocalPathToRead(
              attemptBase + "/" + DATA_FILE_NAME, conf);

          // LocalDirAllocator: search every configured local dir for the path
          public Path getLocalPathToRead(String pathStr,
              Configuration conf) throws IOException {
            Context ctx = confChanged(conf);
            int numDirs = ctx.localDirs.length;
            int numDirsSearched = 0;
            //remove the leading slash from the path (to make sure that the uri
            //resolution results in a valid path on the dir being checked)
            if (pathStr.startsWith("/")) {
              pathStr = pathStr.substring(1);
            }
            while (numDirsSearched < numDirs) {
              Path file = new Path(ctx.localDirs[numDirsSearched], pathStr);
              if (ctx.localFS.exists(file)) {
                return file;
              }
              numDirsSearched++;
            }

            //no path found
            throw new DiskErrorException ("Could not find " + pathStr +" in any of" +
            " the configured local directories");
          }
          

          I think this may also be the reason why we didn't want to use the same configuration for ShuffleHandler and LocalDirsHandlerService.
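The read-path argument above can be checked with a small stand-alone sketch. It uses `java.nio.file` instead of Hadoop's `Path` and `LocalFileSystem`, and `findToRead`, `demo`, and the directory and file names are all hypothetical. It only imitates the search loop: try each configured directory in order and return the first one that contains the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class ReadPathSketch {
  /** Returns the first existing dir/relPath, imitating the search loop quoted above. */
  static Optional<Path> findToRead(List<Path> localDirs, String relPath) {
    // Strip the leading slash so the path resolves under each directory.
    if (relPath.startsWith("/")) {
      relPath = relPath.substring(1);
    }
    for (Path dir : localDirs) {
      Path candidate = dir.resolve(relPath);
      if (Files.exists(candidate)) {
        return Optional.of(candidate);
      }
    }
    return Optional.empty();
  }

  /** Returns {found when searching all dirs, found when searching good dirs only}. */
  static boolean[] demo() {
    try {
      Path goodDir = Files.createTempDirectory("good-dir");
      Path fullDir = Files.createTempDirectory("full-dir");

      // A mapper wrote its output before the directory was marked full.
      Path mapOutput = fullDir.resolve("attempt_0/file.out");
      Files.createDirectories(mapOutput.getParent());
      Files.createFile(mapOutput);

      boolean withAllDirs =
          findToRead(Arrays.asList(goodDir, fullDir), "/attempt_0/file.out").isPresent();
      boolean withGoodDirsOnly =
          findToRead(Collections.singletonList(goodDir), "/attempt_0/file.out").isPresent();
      return new boolean[] {withAllDirs, withGoodDirsOnly};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] found = demo();
    System.out.println("all dirs: " + found[0] + ", good dirs only: " + found[1]);
  }
}
```

In this sketch the file is found only when the full directory is part of the search list, which matches the point that a reader restricted to the good directories could miss existing map output.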


            People

            • Assignee:
              zxu zhihai xu
              Reporter:
              zxu zhihai xu
            • Votes:
              0
              Watchers:
              9
