Hadoop YARN / YARN-2251

Avoid negative elapsed time in JHS/MRAM web UI and services

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Recently we observed a rare bug in which the elapsed time of a reducer is shown as negative on the JHS web UI and via the REST APIs. While the root cause seems to be unsynchronized clocks on different hosts, the web frontend should have masked the negative values. However, in the current code, the classes in org.apache.hadoop.mapreduce.v2.app.webapp.dao.* only check whether the elapsed time is -1 or not.

      1. MAPREDUCE-5940.1.patch
        5 kB
        Zhijie Shen
      2. MAPREDUCE-5940.2.patch
        3 kB
        Zhijie Shen

          Activity

          Zhijie Shen added a comment -

          A straightforward change to ensure that all XXXInfo objects check that the elapsed time is non-negative, rather than merely not -1. Test cases are not necessary.
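The difference between the two checks can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the patch; replacing a masked value with 0 is an assumption based on the later discussion in this thread.

```java
// Hypothetical illustration only; not the code from MAPREDUCE-5940.1.patch.
final class ElapsedMaskSketch {

    // Old-style check: only the sentinel value -1 is masked.
    static long maskSentinelOnly(long elapsed) {
        return elapsed == -1 ? 0 : elapsed;
    }

    // New-style check: every negative value is masked.
    static long maskNegative(long elapsed) {
        return elapsed < 0 ? 0 : elapsed;
    }

    public static void main(String[] args) {
        long skewed = -4200L; // e.g. finish time recorded on a host whose clock lags
        System.out.println(maskSentinelOnly(skewed)); // prints -4200
        System.out.println(maskNegative(skewed));     // prints 0
        System.out.println(maskSentinelOnly(-1L));    // prints 0
    }
}
```

With the old check, a skewed value such as -4200 passes through to the UI unchanged, which matches the symptom in the description.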

          Zhijie Shen added a comment -

          I'll file a separate Jira about the global clock synchronization issue and describe how it can result in a negative elapsed time.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12652340/MAPREDUCE-5940.1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4686//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4686//console

          This message is automatically generated.

          Varun Vasudev added a comment -

          +1, patch looks good.

          Junping Du added a comment -

          Thanks to Zhijie Shen for working on this and to Varun Vasudev for reviewing! This seems to be an interesting issue. I think a better, cleaner way to fix it is to update the elapsed() method: If System.currentTimeMillis() < started, then we can return -1 or 0 instead (and log a warning that the clocks are not synchronized). Thoughts?

          Junping Du added a comment -

          Also adding a test in TestTimes.java could be a good idea.

          Devaraj K added a comment -

          Silently setting the elapsed time to 0 when it is negative may hide bugs related to elapsed time. Adding a warning/info message before setting it to 0 would help diagnose any such issues.

          Junping Du added a comment -

          "Silently setting the elapsed time to 0 when it is negative may hide bugs related to elapsed time. Adding a warning/info message before setting it to 0 would help diagnose any such issues."

          Agreed. I made a similar comment above.

          Zhijie Shen added a comment -

          Thanks for the review, Junping and Devaraj.

          "If System.currentTimeMillis() < started, then we can return -1 or 0 instead"

          IMHO, Times#elapsed is meant to compute the delta between two timestamps: started and finished. Given System.currentTimeMillis() < started <= finished, it should still be a valid case. To make sure the elapsed time is always non-negative, we need to check started <= finished, and return -1 if not.

          "(and log a warning that the clocks are not synchronized)"

          "Adding a warning/info message before setting it to 0 would help diagnose any such issues."

          "Also adding a test in TestTimes.java could be a good idea."

          Sounds like a good idea. Will address it in the new patch.

          In addition, I'll add a code comment to explicitly document the behavior of Times#elapsed.
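The behavior proposed in this comment could look roughly like the following sketch. It is not the committed Times.java: the class name, the two-argument signature, the use of java.util.logging, and treating non-positive timestamps as "unset" are all assumptions made for illustration.

```java
import java.util.logging.Logger;

// Hypothetical sketch of the proposed check; not the code committed for YARN-2251.
final class ElapsedSketch {
    private static final Logger LOG = Logger.getLogger(ElapsedSketch.class.getName());

    /**
     * Returns finished - started in milliseconds, or -1 when the delta
     * cannot be computed or would be negative (started > finished).
     */
    static long elapsed(long started, long finished) {
        // Assumption: a non-positive timestamp means "not set yet".
        if (started <= 0 || finished <= 0) {
            return -1;
        }
        long delta = finished - started;
        if (delta < 0) {
            // Leave a trace instead of masking the problem silently.
            LOG.warning("Finished time " + finished
                + " is earlier than started time " + started);
            return -1;
        }
        return delta;
    }

    public static void main(String[] args) {
        System.out.println(elapsed(1000L, 1500L)); // prints 500
        System.out.println(elapsed(1500L, 1000L)); // prints -1 and logs a warning
    }
}
```

Note that the check compares only the two timestamps passed in, so a case where the local clock is behind both of them is still treated as valid, as argued above.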

          Junping Du added a comment -

          Kicking off the Jenkins test.
          Patch looks good to me. Devaraj K, are you OK with the new patch? If so, I will commit it once Jenkins gives a +1.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12653860/MAPREDUCE-5940.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4712//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4712//console

          This message is automatically generated.

          Devaraj K added a comment -

          +1, Latest patch looks good to me.

          Junping Du added a comment -

          Moved to the YARN project, as the whole fix is on the YARN side.

          Junping Du added a comment -

          Kicking off the Jenkins test again.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12653860/MAPREDUCE-5940.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4201//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4201//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12653860/MAPREDUCE-5940.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4202//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4202//console

          This message is automatically generated.

          Junping Du added a comment -

          I have committed it to trunk and branch-2. Thanks, Zhijie Shen, for the patch!

          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #5826 (See https://builds.apache.org/job/Hadoop-trunk-Commit/5826/)
          YARN-2251. Avoid negative elapsed time in JHS/MRAM web UI and services (Contributed by Zhijie Shen) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607833)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Times.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestTimes.java

          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1794 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1794/)
          YARN-2251. Avoid negative elapsed time in JHS/MRAM web UI and services (Contributed by Zhijie Shen) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607833)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Times.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestTimes.java

          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1821 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1821/)
          YARN-2251. Avoid negative elapsed time in JHS/MRAM web UI and services (Contributed by Zhijie Shen) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607833)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Times.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestTimes.java

          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #604 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/604/)
          YARN-2251. Avoid negative elapsed time in JHS/MRAM web UI and services (Contributed by Zhijie Shen) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607833)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Times.java
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestTimes.java

            People

            • Assignee:
              Zhijie Shen
              Reporter:
              Zhijie Shen
            • Votes:
              0
              Watchers:
              6
