
DefaultContainerExecutor runs only one localizer at a time

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We are seeing that when one of the localizerRunner threads gets stuck, the rest of the localizerRunners are blocked. We should remove the synchronized modifier.
      The synchronized modifier appears to have been added by https://issues.apache.org/jira/browse/MAPREDUCE-3537
      It could be removed if the Localizer did not depend on the current directory.
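      The blocking described above can be reproduced with a minimal, self-contained Java sketch. This is not the NodeManager code: `SerializedExecutor`, `ConcurrentExecutor`, and `secondLocalizerCompletes` are hypothetical names made up for illustration, and the "stuck" localizer is simulated with a latch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class SynchronizedLocalizerDemo {
    /** Hypothetical executor whose startLocalizer holds the object's monitor for the whole call. */
    static class SerializedExecutor {
        synchronized void startLocalizer(Runnable work) { work.run(); }
    }

    /** Hypothetical executor without the synchronized modifier. */
    static class ConcurrentExecutor {
        void startLocalizer(Runnable work) { work.run(); }
    }

    /** Returns true if a second localizer can finish while the first one is stuck. */
    static boolean secondLocalizerCompletes(Consumer<Runnable> startLocalizer) {
        CountDownLatch firstStarted = new CountDownLatch(1);
        CountDownLatch releaseFirst = new CountDownLatch(1);
        CountDownLatch secondDone = new CountDownLatch(1);

        // First localizer: enters startLocalizer and then hangs until released.
        Thread stuck = new Thread(() -> startLocalizer.accept(() -> {
            firstStarted.countDown();
            try {
                releaseFirst.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        // Second localizer: trivial work that only signals completion.
        Thread second = new Thread(() -> startLocalizer.accept(secondDone::countDown));

        try {
            stuck.start();
            firstStarted.await();
            second.start();
            boolean completed = secondDone.await(500, TimeUnit.MILLISECONDS);
            releaseFirst.countDown(); // un-stick the first localizer so both threads can exit
            stuck.join();
            second.join();
            return completed;
        } catch (InterruptedException e) {
            releaseFirst.countDown();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("synchronized:   second localizer finished = "
            + secondLocalizerCompletes(new SerializedExecutor()::startLocalizer));
        System.out.println("unsynchronized: second localizer finished = "
            + secondLocalizerCompletes(new ConcurrentExecutor()::startLocalizer));
    }
}
```

      With the synchronized variant the second call cannot even enter the method until the stuck one returns, which is the behavior this issue reports. Dropping the modifier is only safe if the method body does not rely on shared mutable state, such as a shared current directory.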

      1. YARN-2730.v1.patch
        10 kB
        Siqi Li
      2. YARN-2730.v2.patch
        2 kB
        Siqi Li
      3. YARN-2730.v3.patch
        3 kB
        Siqi Li

          Activity

          l201514 Siqi Li added a comment -

          In the case of YARN-2714, all localizerRunner threads get stuck, and every task that lands on this node gets stuck as well. Removing the synchronized modifier will resolve this problem.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12676480/YARN-2730.v1.patch
          against trunk revision 3b12fd6.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5507//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/5507//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5507//console

          This message is automatically generated.

          l201514 Siqi Li added a comment -

          Jason Lowe Hi Jason, do you have some time to take a look at this patch?

          jlowe Jason Lowe added a comment -

          Thanks for the patch, Siqi.

          We could go two ways with this. We should be able to solve it without modifying FileContext at all by having startLocalizer create a clone like this:

            FileContext localizerFc = FileContext.getFileContext(lfs.getDefaultFileSystem(), getConf());
            localizerFc.setUMask(lfs.getUMask());
          

          Or we could add a cloning ability to FileContext directly, as this patch does. If we continue down that route, we need to clone everything in case the clone method is used elsewhere. The workingDir is not copied, so it isn't a true clone. One might want to clone a FileContext just to modify the umask, for example, and callers would therefore expect the working directory to be preserved by the clone.
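          As a rough illustration of both points (a per-call context makes the synchronized modifier unnecessary, and a true clone must carry the working directory along), here is a minimal sketch that uses only the JDK. `MiniContext` is a hypothetical, heavily simplified stand-in for a file context and is not the Hadoop FileContext API; the paths and the `tokens` file name are made up as well.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PerCallContextDemo {
    /** Hypothetical stand-in for a file context: just a umask and a working directory. */
    static final class MiniContext {
        private final int umask;
        private Path workingDir;

        MiniContext(int umask, Path workingDir) {
            this.umask = umask;
            this.workingDir = workingDir;
        }

        /** A "true clone": copies all state, including the working directory. */
        MiniContext copy() {
            return new MiniContext(umask, workingDir);
        }

        int getUMask() { return umask; }
        Path getWorkingDirectory() { return workingDir; }
        void setWorkingDirectory(Path dir) { this.workingDir = dir; }

        /** Resolves a relative name against this context's working directory. */
        Path resolve(String relative) { return workingDir.resolve(relative); }
    }

    // Shared by all callers; 0022 is an octal literal, as in a typical umask.
    private final MiniContext shared = new MiniContext(0022, Paths.get("/nm"));

    /**
     * Not synchronized: each call changes the working directory of its own copy,
     * so concurrent calls never see each other's directory.
     */
    Path startLocalizer(String appStorageDir) {
        MiniContext local = shared.copy();
        local.setWorkingDirectory(Paths.get(appStorageDir));
        return local.resolve("tokens");
    }

    MiniContext sharedContext() { return shared; }

    public static void main(String[] args) {
        PerCallContextDemo exec = new PerCallContextDemo();
        System.out.println(exec.startLocalizer("/data/app_1"));
        System.out.println(exec.startLocalizer("/data/app_2"));
        System.out.println(exec.sharedContext().getWorkingDirectory());
    }
}
```

          Because the shared context is only read, never mutated, there is nothing left for the lock to protect in this sketch. Whether the same holds for the real startLocalizer depends on what else that method touches.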

          TestFileContext: Configuration and StringUtils are unused imports.

          Nit: some original lines that were formatted for 80 columns no longer are after the changes.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12678526/YARN-2730.v2.patch
          against trunk revision f1a149e.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5663//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5663//console

          This message is automatically generated.

          l201514 Siqi Li added a comment -

          Thanks, Jason Lowe, for your feedback. I have updated the patch using the first approach you mentioned above.

          jlowe Jason Lowe added a comment -

          Thanks for updating the patch, Siqi. In the latest patch the startLocalizer method is still synchronized, so the original problem remains.

          l201514 Siqi Li added a comment -

          Sorry about that. I have updated the patch with the synchronized keyword removed.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12678985/YARN-2730.v3.patch
          against trunk revision 67f13b5.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5699//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5699//console

          This message is automatically generated.

          jlowe Jason Lowe added a comment -

          +1 for the latest patch, committing this.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #6425 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6425/)
          YARN-2730. DefaultContainerExecutor runs only one localizer at a time. Contributed by Siqi Li (jlowe: rev 6157ace5475fff8d2513fd3cd99134b532b0b406)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
          • hadoop-yarn-project/CHANGES.txt
          jlowe Jason Lowe added a comment -

          Thanks, Siqi! I committed this to trunk, branch-2, and branch-2.6.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk #733 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/733/)
          YARN-2730. DefaultContainerExecutor runs only one localizer at a time. Contributed by Siqi Li (jlowe: rev 6157ace5475fff8d2513fd3cd99134b532b0b406)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk #1922 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1922/)
          YARN-2730. DefaultContainerExecutor runs only one localizer at a time. Contributed by Siqi Li (jlowe: rev 6157ace5475fff8d2513fd3cd99134b532b0b406)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1947 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1947/)
          YARN-2730. DefaultContainerExecutor runs only one localizer at a time. Contributed by Siqi Li (jlowe: rev 6157ace5475fff8d2513fd3cd99134b532b0b406)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
          • hadoop-yarn-project/CHANGES.txt

            People

            • Assignee:
              l201514 Siqi Li
            • Reporter:
              l201514 Siqi Li
            • Votes:
              0
            • Watchers:
              15
