Hadoop YARN
YARN-742

Log aggregation causes a lot of redundant setPermission calls

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.7, 2.0.4-alpha
    • Fix Version/s: 2.1.0-beta, 0.23.9
    • Component/s: nodemanager
    • Labels:
      None

      Description

      In one of our clusters, the namenode RPC server is spending 45% of its time serving setPermission calls. Further investigation revealed that most of these calls are made redundantly on /mapred/logs/<user>/logs, and that mkdirs calls are made before them.

      1. YARN-742.patch (12 kB, Jason Lowe)
      2. YARN-742-1.branch-0.23.patch (15 kB, Jason Lowe)
      3. YARN-742-1.patch (12 kB, Jason Lowe)

        Activity

        Kihwal Lee added a comment -

        I've noticed a new kind of HDFS integration failure recently. This time, it is caused by TestNNThroughputBenchmark exiting prematurely. I will file a JIRA if it has not been reported already.

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #630 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/630/)
        YARN-742. Log aggregation causes a lot of redundant setPermission calls. Contributed by Jason Lowe. (Revision 1489981)

        Result = SUCCESS
        kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489981
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Kihwal Lee added a comment -

        Thanks for the patch, Jason. I've committed this to trunk, branch-2, branch-2.1.0-beta and branch-0.23.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1447 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1447/)
        YARN-742. Log aggregation causes a lot of redundant setPermission calls. Contributed by Jason Lowe. (Revision 1489596)

        Result = SUCCESS
        kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489596
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1421 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1421/)
        YARN-742. Log aggregation causes a lot of redundant setPermission calls. Contributed by Jason Lowe. (Revision 1489596)

        Result = FAILURE
        kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489596
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hudson added a comment -

        Integrated in Hadoop-Yarn-trunk #231 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/231/)
        YARN-742. Log aggregation causes a lot of redundant setPermission calls. Contributed by Jason Lowe. (Revision 1489596)

        Result = SUCCESS
        kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489596
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12586204/YARN-742-1.branch-0.23.patch
        against trunk revision .

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1117//console

        This message is automatically generated.

        Jason Lowe added a comment -

        Thanks for the review, Kihwal, and apologies for the patch issues. Here's a patch for branch-0.23.

        Kihwal Lee added a comment -

        Jason, the test failed to merge in branch-0.23.

        Hudson added a comment -

        Integrated in Hadoop-trunk-Commit #3858 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3858/)
        YARN-742. Log aggregation causes a lot of redundant setPermission calls. Contributed by Jason Lowe. (Revision 1489596)

        Result = SUCCESS
        kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489596
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12586157/YARN-742-1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1112//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1112//console

        This message is automatically generated.

        Jason Lowe added a comment -

        Updated the patch.

        Kihwal Lee added a comment -

        There were other changes made in trunk so the patch doesn't apply any more. Please update the patch.

        Kihwal Lee added a comment -

        +1, the patch looks good.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12585968/YARN-742.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1086//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1086//console

        This message is automatically generated.

        Jason Lowe added a comment -

        Patch to walk back up the app log dir path to check for the existence of a directory before blindly proceeding to create it. It gets out early if it finds a path that exists.

        In addition to the unit test, I manually tested this on a single-node cluster verifying the directories are created iff they are missing and permissions are set iff necessary due to a too-restrictive umask.
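
The approach described in this comment can be sketched roughly as follows. This is not the committed patch and does not use the Hadoop API: `FakeRemoteFs`, `WalkUpCreateSketch`, the `0770` permission value, and the user and application names are all hypothetical stand-ins chosen for illustration. The sketch walks from the application directory back up towards the user directory, stops at the first directory that already exists, creates only the missing levels, and calls setPermission only when the umask stripped bits from the requested permission.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory file system; it records permissions and counts calls. */
class FakeRemoteFs {
    final Map<String, Integer> perms = new HashMap<>(); // path -> permission bits
    final int umask;
    int mkdirsCalls = 0;
    int setPermissionCalls = 0;

    FakeRemoteFs(int umask) { this.umask = umask; }

    boolean exists(String path) { return perms.containsKey(path); }

    void mkdirs(String path, int perm) {
        mkdirsCalls++;
        perms.put(path, perm & ~umask); // the umask may strip requested bits
    }

    void setPermission(String path, int perm) {
        setPermissionCalls++;
        perms.put(path, perm);
    }
}

class WalkUpCreateSketch {
    static final int DIR_PERM = 0770; // assumed permission, for illustration only

    static void createAppDir(FakeRemoteFs fs, String root, String user, String appId) {
        String userDir = root + "/" + user;
        String suffixDir = userDir + "/logs";
        String appDir = suffixDir + "/" + appId;

        // Walk from the deepest directory upwards; stop at the first one that exists.
        Deque<String> missing = new ArrayDeque<>();
        for (String dir : new String[] {appDir, suffixDir, userDir}) {
            if (fs.exists(dir)) {
                break;
            }
            missing.push(dir); // push puts shallower directories at the head
        }

        // Create only the missing levels, shallowest first.
        for (String dir : missing) {
            fs.mkdirs(dir, DIR_PERM);
            // Fix permissions only if the umask made them too restrictive.
            if (fs.perms.get(dir) != DIR_PERM) {
                fs.setPermission(dir, DIR_PERM);
            }
        }
    }

    public static void main(String[] args) {
        FakeRemoteFs fs = new FakeRemoteFs(027);
        createAppDir(fs, "/mapred/logs", "alice", "app_1");
        createAppDir(fs, "/mapred/logs", "alice", "app_1"); // nothing left to do
        System.out.println("mkdirs=" + fs.mkdirsCalls
            + " setPermission=" + fs.setPermissionCalls);
    }
}
```

In this model, a repeated call for an application whose directory already exists costs a single existence check and no write operations.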

        Jason Lowe added a comment -

        No, this is a 0.23 cluster, and YARN-24 did not go into branch-0.23.

        The problem is not verifyAndCreateRemoteLogDir, rather it's createAppDir. That unconditionally tries to mkdir and setPermission each of the three log levels (user, user/logs, and user/logs/appID). The mkdir isn't so bad since it already exists, but the setPermission always occurs and that causes a write operation on the namenode. That's three write operations per application, per node. In this cluster's case, that's a lot of operations due to the average number of nodes used by the applications and number of applications per day.
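
To illustrate the cost described in this comment, here is a minimal sketch of the unconditional pattern. It is not the actual LogAggregationService code: `CountingNamespace`, `UnconditionalCreateSketch`, and the node and application counts are hypothetical, and the counter merely stands in for namenode calls.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for the namenode; it only counts calls. */
class CountingNamespace {
    final Set<String> dirs = new HashSet<>();
    int mkdirsCalls = 0;
    int setPermissionCalls = 0;

    boolean exists(String path) { return dirs.contains(path); }

    void mkdirs(String path) {
        mkdirsCalls++;
        dirs.add(path); // adding an existing path changes nothing
    }

    void setPermission(String path) {
        setPermissionCalls++; // modeled as a write on every call
    }
}

class UnconditionalCreateSketch {
    /** Touches all three levels on every call, whether or not they exist. */
    static void createAppDir(CountingNamespace ns, String root, String user, String appId) {
        String userDir = root + "/" + user;
        String suffixDir = userDir + "/logs";
        String appDir = suffixDir + "/" + appId;
        for (String dir : new String[] {userDir, suffixDir, appDir}) {
            ns.mkdirs(dir);
            ns.setPermission(dir);
        }
    }

    public static void main(String[] args) {
        CountingNamespace ns = new CountingNamespace();
        int nodes = 4; // illustrative numbers, not taken from the cluster
        int apps = 5;
        for (int a = 0; a < apps; a++) {
            for (int n = 0; n < nodes; n++) {
                createAppDir(ns, "/mapred/logs", "alice", "app_" + a);
            }
        }
        // 3 levels x 4 nodes x 5 applications
        System.out.println(ns.setPermissionCalls); // prints 60
    }
}
```

Even though only seven distinct directories exist at the end of this run, the sketch issues 60 setPermission calls, which is the redundancy the comment points at.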

        Sandy Ryza added a comment -

        Does the cluster have YARN-24?


          People

          • Assignee: Jason Lowe
          • Reporter: Kihwal Lee
          • Votes: 1
          • Watchers: 10
