Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.204.0, 0.20.205.0, 1.0.0, 0.23.0
    • Fix Version/s: 1.0.2, 0.23.2
    • Component/s: metrics
    • Labels:
      None

      Description

      The metrics serving thread and the periodic snapshot thread can deadlock.
      It has happened a few times on one of our namenodes. When it happens, RPC still works, but the web UI and hftp stop working. I haven't looked at trunk too closely, but it might happen there too.

      1. hadoop-8050-trunk.patch.txt
        3 kB
        Kihwal Lee
      2. hadoop-8050-branch-1.patch.txt
        3 kB
        Kihwal Lee
      3. hadoop-8050-trunk.patch.txt
        3 kB
        Kihwal Lee
      4. hadoop-8050-branch-1.patch.txt
        3 kB
        Kihwal Lee
      5. hadoop-8050-trunk.patch.txt
        0.8 kB
        Kihwal Lee
      6. hadoop-8050-branch-1.patch.txt
        0.7 kB
        Kihwal Lee
      7. hadoop-8050-branch-1.patch.txt
        0.6 kB
        Kihwal Lee
      8. hadoop-8050.patch.txt
        2 kB
        Kihwal Lee

        Issue Links

          Activity

          Chris Nauroth made changes -
          Link This issue is related to HADOOP-10090 [ HADOOP-10090 ]
          Matt Foley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Matt Foley made changes -
          Fix Version/s 1.0.2 [ 12320152 ]
          Fix Version/s 1.0.1 [ 12319501 ]
          Matt Foley added a comment -

          Was committed to 1.0.2, not 1.0.1.

          Kihwal Lee made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Kihwal Lee added a comment -

          So this time, the MR trunk build went through okay, but the 0.23 build did not. I just filed MAPREDUCE-3894 to investigate the build issue. Since that issue is independent of this JIRA, I think we can close it.

          Robert Joseph Evans added a comment -

          I have seen a number of other builds fail with this, or end up aborted, lately. The failures look like they started around Feb 10th, but I don't know for sure. The first instance I found when looking through JIRA is MAPREDUCE-3852. But JQL cannot search for "RESULT = ABORTED" in a comment; it just pulls out everything that has "result", "=", or "ABORTED" in it.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-0.23-Build #202 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/202/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291081)

          Result = FAILURE
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291081
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #996 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/996/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291084)

          Result = SUCCESS
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291084
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #174 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/174/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291081)

          Result = SUCCESS
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291081
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #961 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/961/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291084)

          Result = SUCCESS
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291084
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Matt Foley made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Matt Foley added a comment -

          The Commit integration to both trunk and v0.23 succeeded with common and hdfs, but aborted in mapreduce. The log records:

          ivy-resolve-mapred:
          Build timed out (after 45 minutes). Marking the build as aborted.
          Build was aborted
          Recording test results
          None of the test reports contained any result
          Updating HADOOP-8050
          No emails were triggered.
          Finished: ABORTED
          

          This looks more like a connectivity issue with ivy than a problem with the patch, but re-opening pending investigation.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #1768 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1768/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291084)

          Result = ABORTED
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291084
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-0.23-Commit #580 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/580/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291081)

          Result = ABORTED
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291081
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #1757 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1757/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291084)

          Result = SUCCESS
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291084
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #1831 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1831/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291084)

          Result = SUCCESS
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291084
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Matt Foley made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Fix Version/s 1.1.0 [ 12316501 ]
          Fix Version/s 0.24.0 [ 12317652 ]
          Resolution Fixed [ 1 ]
          Matt Foley added a comment -

          Committed to branch-1.0, branch-1, branch-0.23, and trunk.
          Thanks, Kihwal and Luke!

          Hudson added a comment -

          Integrated in Hadoop-Common-0.23-Commit #578 (See https://builds.apache.org/job/Hadoop-Common-0.23-Commit/578/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291081)

          Result = SUCCESS
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291081
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Commit #565 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/565/)
          HADOOP-8050. Deadlock in metrics. Contributed by Kihwal Lee. (Revision 1291081)

          Result = SUCCESS
          mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1291081
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java
          Kihwal Lee added a comment -

          Filed HADOOP-8073 per Luke's comment. Thanks for the review.

          Luke Lu added a comment -

          The latest patch lgtm, +1. Thanks Kihwal. It'd be great if you could add a test case for the JMX metrics serving (the purpose is not to reproduce the deadlock, but so that people can run something like jcarder with the unit tests and detect potential deadlocks). I'm not holding my +1 for it though.
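
          As a rough illustration of a check such a test could end with, here is a sketch that uses the JDK's ThreadMXBean rather than jcarder. Note the difference: jcarder reports potential lock-order cycles, while this only reports threads that are actually deadlocked at the moment of the call. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Hadoop: asks the JVM whether any threads
// are currently deadlocked on monitors or ownable synchronizers.
class DeadlockCheckSketch {
    /** Returns the number of threads the JVM currently reports as deadlocked. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

          A test that drives the JMX serving path and the snapshot thread concurrently could assert that this count is zero before it finishes.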

          Kihwal Lee made changes -
          Fix Version/s 0.24.0 [ 12317652 ]
          Fix Version/s 0.23.2 [ 12319855 ]
          Luke Lu added a comment -

          Static analysis is hopeless for this case, as the compiler needs to know all the possible dynamic bindings of metrics sources beforehand. I recall that Todd ran jcarder (a dynamic deadlock finder) on metrics2 in trunk and didn't find the issue, probably due to a lack of test coverage in the JMX metrics serving (there are some tests, but we need to make sure the snapshot thread runs in the tests as well), or the fact that metrics sources can be created via annotations.

          Kihwal Lee added a comment -

          No test was added since there is no functional change.
          I wish static analysis tools could catch this kind of bug.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514353/hadoop-8050-trunk.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/590//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/590//console

          This message is automatically generated.

          Kihwal Lee made changes -
          Attachment hadoop-8050-branch-1.patch.txt [ 12514352 ]
          Attachment hadoop-8050-trunk.patch.txt [ 12514353 ]
          Kihwal Lee added a comment -

          Thanks for the review, Luke. I removed the unused variable in the new patches. Attaching patches for branch-1 and trunk.

          Luke Lu added a comment -

          Ah, those pesky tests. Kihwal's latest patch looks reasonable to me, apart from the redundant "needUpdate" variable.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514317/hadoop-8050-trunk.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/588//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/588//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          The correct fix (sans moving JMX to a sink) is not to remove the lock on the metrics system in the snapshot thread but to fix the lock order in MetricsSourceAdapter (so that source.getMetrics is called without holding the adapter lock).

          I tried to do this in the new patch. Since updateJmxCache() doesn't block while calling getMetrics(), some callers may not get the latest metric data if updateJmxCache() is already being executed by another thread.
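
          The staleness trade-off mentioned above can be sketched as follows. This is only one way to get that behaviour and is not taken from the patch: the class, the AtomicBoolean guard, and the LongSupplier standing in for source.getMetrics() are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongSupplier;

// Hypothetical cache, not the real MetricsSourceAdapter: at most one caller
// refreshes at a time, and nobody holds a lock while the source is queried.
class NonBlockingCacheSketch {
    private final LongSupplier source;                 // stands in for source.getMetrics()
    private final AtomicBoolean refreshing = new AtomicBoolean(false);
    private volatile long cached;                      // last published value

    NonBlockingCacheSketch(LongSupplier source, long initial) {
        this.source = source;
        this.cached = initial;
    }

    long get() {
        // The caller that wins the flag refreshes; any caller arriving while a
        // refresh is in progress skips it and returns the older cached value.
        if (refreshing.compareAndSet(false, true)) {
            try {
                cached = source.getAsLong();
            } finally {
                refreshing.set(false);
            }
        }
        return cached;
    }

    /** Reads the cache from inside a refresh to show the stale read deterministically. */
    static long[] demo() {
        long[] seenDuringRefresh = new long[1];
        NonBlockingCacheSketch[] cache = new NonBlockingCacheSketch[1];
        cache[0] = new NonBlockingCacheSketch(() -> {
            seenDuringRefresh[0] = cache[0].get();     // refresh in progress: old value
            return 42L;
        }, 7L);
        long afterRefresh = cache[0].get();
        return new long[] { seenDuringRefresh[0], afterRefresh };
    }
}
```

          In demo(), the read made while the refresh is running sees the initial value 7, and the call that performed the refresh returns 42.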

          The patch passes all metrics related tests.

          Kihwal Lee made changes -
          Attachment hadoop-8050-branch-1.patch.txt [ 12514316 ]
          Attachment hadoop-8050-trunk.patch.txt [ 12514317 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514294/hadoop-8050-trunk.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/587//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/587//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          The new patch sets publishSelfMetrics to false, which causes the snapshot to skip the system source.

          Kihwal Lee made changes -
          Attachment hadoop-8050-trunk.patch.txt [ 12514294 ]
          Kihwal Lee made changes -
          Attachment hadoop-8050-branch-1.patch.txt [ 12514293 ]
          Kihwal Lee made changes -
          Attachment hadoop-8050-branch-1.patch.txt [ 12514292 ]
          Kihwal Lee added a comment -

          What if we set publishSelfMetrics to false?

          Luke Lu added a comment -

          @Matt, I'd have already attached a patch if not for my employer's patch review/approval process. For a quick fix for 1.0.1, the least risky approach would be commenting out the registerSystemSource line in MetricsSystemImpl#configureSources. The metrics system's own metrics were mostly for the original dev testing/debugging and are not required for production. I'll review the patch.

          Matt Foley added a comment -

          Hi Luke, would you be able to submit an alternate patch per your proposal (quick fix for lock order)? I'm trying to get a 1.0.1 build done, and it would be great to get this in. Thanks.

          Luke Lu made changes -
          Affects Version/s 0.23.0 [ 12315569 ]
          Luke Lu added a comment -

          The cause of the deadlock is that the JMX serving thread uses a different lock order (source adapter, then source, which can be the metrics system) than the snapshot thread (metrics system, then source adapter). The correct fix (sans moving JMX to a sink) is not to remove the lock on the metrics system in the snapshot thread but to fix the lock order in MetricsSourceAdapter (so that source.getMetrics is called without holding the adapter lock).
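
          A minimal sketch of the two lock orders and the proposed ordering fix. The nested MetricsSystem and Adapter classes are simplified, hypothetical stand-ins and not the real Hadoop classes; the only point is that the JMX path queries the source before taking the adapter lock.

```java
// Hypothetical stand-ins for the metrics system and MetricsSourceAdapter.
class LockOrderSketch {
    /** Plays both roles from the comment: the metrics system and a metrics source. */
    static final class MetricsSystem {
        private long reads;

        // Source role: reading metrics takes the system lock.
        synchronized long getMetrics() { return ++reads; }

        // Snapshot thread: system lock first, then the adapter lock.
        synchronized long snapshot(Adapter adapter) { return adapter.getMetrics(); }
    }

    static final class Adapter {
        private final MetricsSystem source;
        private long cache;

        Adapter(MetricsSystem source) { this.source = source; }

        // Snapshot path: adapter lock, then the source (system lock is already held).
        synchronized long getMetrics() { return source.getMetrics(); }

        // JMX path with the corrected order: the source is queried BEFORE the
        // adapter lock is taken. Marking this whole method synchronized would
        // restore the adapter -> system order that conflicts with snapshot().
        long getAttribute() {
            long fresh = source.getMetrics();
            synchronized (this) {
                cache = fresh;
                return cache;
            }
        }
    }

    /** Runs both paths concurrently; returns true if both threads finish in time. */
    static boolean runBothPaths(int iterations) {
        MetricsSystem system = new MetricsSystem();
        Adapter adapter = new Adapter(system);
        Thread jmx = new Thread(() -> {
            for (int i = 0; i < iterations; i++) adapter.getAttribute();
        });
        Thread snapshot = new Thread(() -> {
            for (int i = 0; i < iterations; i++) system.snapshot(adapter);
        });
        jmx.setDaemon(true);
        snapshot.setDaemon(true);
        jmx.start();
        snapshot.start();
        try {
            jmx.join(10_000);
            snapshot.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !jmx.isAlive() && !snapshot.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both paths finished: " + runBothPaths(100_000));
    }
}
```

          With this ordering, every thread that needs both locks takes the system lock before the adapter lock, so the two paths cannot wait on each other in a cycle.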

          Luke Lu added a comment -

          I always wanted to move the JMX stuff to a sink to untangle the mess and never got around to doing it (maybe I'll do it with HADOOP-8061). The main reason for locking the metrics system during snapshots is that people can trigger a metrics system restart (stop/start) via JMX in another thread.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514152/hadoop-8050.patch.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/586//console

          This message is automatically generated.

          Kihwal Lee made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Kihwal Lee made changes -
          Attachment hadoop-8050.patch.txt [ 12514152 ]
          Kihwal Lee added a comment -

          If a lot of methods are synchronized and two classes containing them have interdependency, deadlock is likely.

          The current way of locking in metrics is a little excessive. I do not believe strict global consistency is required in processing metrics. For one, sources are not coordinating with each other (they are mostly independent), so locking the whole subsystem while taking a snapshot does not add much value to the quality of the data.

          This patch removes some locks around accessing the source adapter map within MetricsSystemImpl. This makes the metric snapshot only lock on each individual source adapter, one at a time, instead of the entire metrics impl. This is safe because:

          • Once sources are registered, they are not removed until shutdown(). Even shutdown() or stop() is called rarely.
          • During snapshot, the source adapter hashmap is the only data structure that needs protection.
          • snapshot() is only called from the timer event handler. startTimer() makes sure that there is only one timer.

          I wrapped the LinkedHashMap used for the source adapter map with Collections.synchronizedMap. This makes accessing the data structure safe without holding a big coarse lock. No further synchronization between sources seems to be needed.
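
          The approach in this comment (a synchronized map of source adapters, with each adapter locked one at a time during a snapshot) might look roughly like the Java sketch below. This is an illustration, not the attached patch: DemoSourceAdapter, DemoMetricsSystem and their methods are hypothetical names. One detail worth noting is that the JDK documentation for Collections.synchronizedMap requires callers to synchronize on the map manually while traversing its collection views, so the sketch copies the values under the map lock and only then visits each adapter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-source adapter guarded by its own monitor. */
class DemoSourceAdapter {
    private final String name;
    private int value;

    DemoSourceAdapter(String name) {
        this.name = name;
    }

    synchronized void increment() {
        value++;
    }

    synchronized String snapshot() {
        return name + "=" + value;
    }
}

/** Hypothetical metrics system with no coarse lock around snapshots. */
class DemoMetricsSystem {
    // Insertion-ordered map wrapped so single get/put calls are thread-safe.
    private final Map<String, DemoSourceAdapter> sources =
        Collections.synchronizedMap(new LinkedHashMap<String, DemoSourceAdapter>());

    void register(String name) {
        sources.put(name, new DemoSourceAdapter(name));
    }

    DemoSourceAdapter get(String name) {
        return sources.get(name);
    }

    List<String> snapshotAll() {
        List<DemoSourceAdapter> copy;
        // Traversing a view of a synchronizedMap must be guarded by the map itself.
        synchronized (sources) {
            copy = new ArrayList<>(sources.values());
        }
        // The map lock is released here; each adapter is locked one at a time.
        List<String> out = new ArrayList<>();
        for (DemoSourceAdapter adapter : copy) {
            out.add(adapter.snapshot());
        }
        return out;
    }
}
```

          In this sketch the snapshot thread never holds the map lock and an adapter lock at the same time, so no lock ordering between the two exists.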

          Kihwal Lee made changes -
          Assignee Kihwal Lee [ kihwal ]
          Kihwal Lee made changes -
          Field Original Value New Value
          Description The metrics serving thread and the periodic snapshot thread can deadlock.
          It happened a few times on one of namenode we have. When it happens RPC works but the web ui and hftp stop working. I haven't look at the trunk too closely, but it might happen there too.
          The metrics serving thread and the periodic snapshot thread can deadlock.
          It happened a few times on one of namenodes we have. When it happens RPC works but the web ui and hftp stop working. I haven't look at the trunk too closely, but it might happen there too.
          Kihwal Lee added a comment -

          "1822485214@qtp-1598533502-1267":
            at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:141)
            - waiting to lock <0x00002aabae0cdb18> (a org.apache.hadoop.metrics2.impl.MetricsSourceAdapter)
            at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBeanInfo(DefaultMBeanServerInterceptor.java:1375)
            at com.sun.jmx.mbeanserver.JmxMBeanServer.getMBeanInfo(JmxMBeanServer.java:880)
            at org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:183)
            at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:159)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
            at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
            at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
            at com.yahoo.hadoop.HadoopBouncerFilter.doFilter(HadoopBouncerFilter.java:60)
            at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
            at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:818)
            at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
            at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
            at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
            at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
            at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
            at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
            at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
            at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
            at org.mortbay.jetty.Server.handle(Server.java:326)
            at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
            at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
            at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
            at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
            at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
            at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
            at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
          "902074768@qtp-1598533502-432":
            at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$5.getMetrics(MetricsSystemImpl.java:477)
            - waiting to lock <0x00002aabae06f408> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
            at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:169)
            at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:149)
            at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:141)
            - locked <0x00002aabae0cdb18> (a org.apache.hadoop.metrics2.impl.MetricsSourceAdapter)
            at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBeanInfo(DefaultMBeanServerInterceptor.java:1375)
            at com.sun.jmx.mbeanserver.JmxMBeanServer.getMBeanInfo(JmxMBeanServer.java:880)
            at org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:183)
            at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:159)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
            at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
            at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
            at com.yahoo.hadoop.HadoopBouncerFilter.doFilter(HadoopBouncerFilter.java:60)
            at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
            at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:818)
            at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
            at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
            at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
            at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
            at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
            at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
            at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
            at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
            at org.mortbay.jetty.Server.handle(Server.java:326)
            at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
            at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
            at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
            at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
            at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
            at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
            at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
          "Timer for 'NameNode' metrics system":
            at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:164)
            - waiting to lock <0x00002aabae0cdb18> (a org.apache.hadoop.metrics2.impl.MetricsSourceAdapter)
            at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:336)
            at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:327)
            - locked <0x00002aabae06f408> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
            at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:309)
            - locked <0x00002aabae06f408> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
            at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:296)
            at java.util.TimerThread.mainLoop(Timer.java:512)
            at java.util.TimerThread.run(Timer.java:462)

          Found 1 deadlock.

          --------------------------------
          There is no problem for normal metrics sources, which lock only the source adapter object while doing getMetrics(). But the system source locks MetricsSystemImpl in getMetrics().
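
          The lock-order inversion visible in the dump can be reproduced in miniature with the Java sketch below. It is a simplified illustration with two threads rather than the three in the dump, and DeadlockSketch, lockInOrder and the thread names are hypothetical. One thread takes a "system" monitor and then an "adapter" monitor, the other takes them in the opposite order, and the standard ThreadMXBean.findMonitorDeadlockedThreads() call reports the cycle. The threads are daemons so that the JVM can still exit while they remain blocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockSketch {
    /** Provokes a two-lock deadlock and returns how many threads the JVM reports. */
    static int provokeAndCount() {
        final Object system = new Object();   // stands in for the metrics system lock
        final Object adapter = new Object();  // stands in for the source adapter lock
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Snapshot-like thread: system first, then adapter.
        Thread timer = new Thread(
            () -> lockInOrder(system, adapter, bothHoldFirst), "sketch-timer");
        // JMX-like thread: adapter first, then system.
        Thread jmx = new Thread(
            () -> lockInOrder(adapter, system, bothHoldFirst), "sketch-jmx");
        timer.setDaemon(true);
        jmx.setDaemon(true);
        timer.start();
        jmx.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await(); // wait until both threads hold their first lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // never reached: the other thread holds 'second' and waits for 'first'
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + provokeAndCount());
    }
}
```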

          Kihwal Lee created issue -

            People

            • Assignee:
              Kihwal Lee
              Reporter:
              Kihwal Lee
            • Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue


                Development