Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.2-alpha, 0.23.4
    • Fix Version/s: 3.0.0, 2.0.3-alpha, 0.23.5
    • Component/s: None
    • Labels:
      None

      Description

      When log aggregation is on, each write to an aggregated container log causes hflush() to be called. On large clusters, this can create a lot of fsync() calls for the namenode.

      We have seen a 6-7x increase in the average number of fsync operations compared to 1.0.x on a large, busy cluster. Over 99% of the fsync ops were for log aggregation writing to tmp files.
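
      As a rough illustration of why a flush on every append scales badly, here is a minimal, self-contained sketch. It does not use the Hadoop API: CountingSink and FlushStormSketch are hypothetical names, and flush() only stands in for hflush(). The sketch assumes one flush per appended record, as the description suggests.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for an output stream; it only counts calls and bytes.
class CountingSink {
    long flushes = 0;
    long bytes = 0;

    void write(byte[] data) { bytes += data.length; }

    // Stands in for hflush(); in this sketch each call models one expensive sync.
    void flush() { flushes++; }
}

class FlushStormSketch {
    // Appends `records` small records and returns how many flushes were issued.
    static long countFlushes(int records, boolean flushPerAppend) {
        CountingSink out = new CountingSink();
        byte[] record = "container log line\n".getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < records; i++) {
            out.write(record);
            if (flushPerAppend) {
                out.flush(); // the call the patch removes
            }
        }
        return out.flushes;
    }

    public static void main(String[] args) {
        System.out.println("flush per append: " + countFlushes(1000, true));
        System.out.println("no flush:         " + countFlushes(1000, false));
    }
}
```

      In this model the flush count grows linearly with the number of appended records, and it drops to zero once the per-append flush is left out.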

      1. yarn-202.patch
        0.7 kB
        Kihwal Lee

        Activity

        Kihwal Lee added a comment -

        This problem will probably go away if we can leave out hflush() from LogWriter#append().

        Kihwal Lee added a comment -

        The patch takes out hflush(). I think this is okay, but I would appreciate other people's thoughts on this.

        Robert Joseph Evans added a comment -

        I think removing the flush is fine. The file will get closed when the application finishes, so the only issue is that if the NM crashes badly, more logs may be lost than before. I am +1. It is a small change that reduces the load on the NN. I'll check it in.
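
        The trade-off described here can be sketched with plain java.io streams. This is an analogy under stated assumptions, not the HDFS write path: CloseOnlySketch is a hypothetical name, a BufferedOutputStream stands in for the log writer, and a ByteArrayOutputStream stands in for the destination file. Without an explicit flush, nothing reaches the destination until close(), so a crash before close would lose the buffered data.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CloseOnlySketch {
    // Returns {bytes visible in the sink before close, bytes visible after close}.
    static int[] visibleBytes(int records) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // 64 KiB buffer, chosen so the small test payload never fills it.
        BufferedOutputStream out = new BufferedOutputStream(sink, 64 * 1024);
        byte[] record = "log line\n".getBytes(StandardCharsets.US_ASCII); // 9 bytes
        try {
            for (int i = 0; i < records; i++) {
                out.write(record); // buffered only; no flush per append
            }
            int beforeClose = sink.size(); // what would survive a crash at this point
            out.close();                   // close() flushes the buffer to the sink
            return new int[] { beforeClose, sink.size() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = visibleBytes(100);
        System.out.println("before close: " + r[0] + " bytes, after close: " + r[1] + " bytes");
    }
}
```

        The sketch only shows the direction of the trade-off: fewer flushes mean less load, at the cost of a larger window of unflushed data if the writer dies before the file is closed.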

        Robert Joseph Evans added a comment -

        I should have said +1 pending Jenkins. Kicking Jenkins now.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12552193/yarn-202.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/134//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/134//console

        This message is automatically generated.

        Robert Joseph Evans added a comment -

        Jenkins looks fine, so I am checking it in now.

        Robert Joseph Evans added a comment -

        Thanks Kihwal,

        I put this into trunk, branch-2, and branch-0.23.

        Hudson added a comment -

        Integrated in Hadoop-trunk-Commit #2962 (See https://builds.apache.org/job/Hadoop-trunk-Commit/2962/)
        YARN-202. Log Aggregation generates a storm of fsync() for namenode (Kihwal Lee via bobby) (Revision 1406269)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1406269
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java
        Hudson added a comment -

        Integrated in Hadoop-Yarn-trunk #29 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/29/)
        YARN-202. Log Aggregation generates a storm of fsync() for namenode (Kihwal Lee via bobby) (Revision 1406269)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1406269
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #428 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/428/)
        svn merge -c 1406269 FIXES: YARN-202. Log Aggregation generates a storm of fsync() for namenode (Kihwal Lee via bobby) (Revision 1406271)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1406271
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1219 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1219/)
        YARN-202. Log Aggregation generates a storm of fsync() for namenode (Kihwal Lee via bobby) (Revision 1406269)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1406269
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1249 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1249/)
        YARN-202. Log Aggregation generates a storm of fsync() for namenode (Kihwal Lee via bobby) (Revision 1406269)

        Result = FAILURE
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1406269
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java

          People

          • Assignee:
            Kihwal Lee
            Reporter:
            Kihwal Lee
          • Votes:
            0
            Watchers:
            9
