Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: metrics
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Lock cycle detected by jcarder between MetricsSystemImpl and DefaultMetricsSystem

      1. metrics-shutdown.png
        59 kB
        Todd Lipcon
      2. metrics-deadlock.png
        49 kB
        Todd Lipcon
      3. hadoop-7529-v2.patch
        2 kB
        Luke Lu
      4. hadoop-7529-v1.patch
        2 kB
        Luke Lu

        Activity

        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #741 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/741/)
        HADOOP-7529. Fix lock cycles in metrics system. (llu)

        llu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157187
        Files :

        • /hadoop/common/trunk/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/DefaultMetricsSystem.java
        Luke Lu added a comment -

        Committed v2 to trunk. Thanks Todd for reviewing!

        Todd Lipcon added a comment -

        +1, patch looks good to me. Thanks Luke!

        Luke Lu added a comment -

        Yep, the wiki is here: http://wiki.apache.org/hadoop/HowToUseJCarder

        Thanks. I was hoping for a simpler process/script to reproduce these graphs though I agree with you on the wiki that it's important to avoid false positives to reduce noise. These two cycles are already borderline and practically harmless, as unlike start/stop, init/shutdown doesn't overlap registration.

        I'm working on getting access to the new build infrastructure so I can set up hudson jobs to run jcarder on a regular basis

        This will be a time saver. +1 and Thanks!

        Todd Lipcon added a comment -

        Yep, the wiki is here: http://wiki.apache.org/hadoop/HowToUseJCarder
        Recently I've been using the more experimental "lockclasses" branch from my github, which supports analyzing readwrite locks, but the rest of the instructions should be the same. I'm working on getting access to the new build infrastructure so I can set up hudson jobs to run jcarder on a regular basis

        Luke Lu added a comment -

        v2 patch fixes the cycle in shutdown as well.

        Todd, do you have a wiki page on how to run the jcarder on hadoop?

        Todd Lipcon added a comment -

        Actually, I take it back. It fixes the first cycle, but there's a second one, attached here.

        Todd Lipcon added a comment -

        +1, I verified that with this patch, the jcarder cycle disappears. And the patch looks reasonable, though I don't know this area of the code well.

        Luke Lu added a comment -

        Though it's unlikely to happen in practice (metrics system initialization usually doesn't overlap with metrics source registration), it's probably worth fixing for posterity. The v1 patch makes the metrics impl in DefaultMetricsSystem an AtomicReference.

        Luke Lu added a comment -

        Though it's unlikely to happen in practice (metrics system initialization usually doesn't overlap with metrics source registration), it's probably worth fixing. The v1 patch makes the metrics impl in DefaultMetricsSystem an AtomicReference.
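
        For readers unfamiliar with the technique, here is a minimal sketch of how holding an implementation in an AtomicReference can avoid a lock cycle. It is not the actual patch: the class names SketchMetricsSystem, SketchDefaultMetrics and AtomicHolderDemo are hypothetical, and the real DefaultMetricsSystem and MetricsSystemImpl code may differ. The idea is that the holder never takes its own monitor, so it cannot hold one lock while calling into an impl that takes another.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a metrics system implementation; not the Hadoop class. */
class SketchMetricsSystem {
    private final String prefix;

    SketchMetricsSystem(String prefix) {
        this.prefix = prefix;
    }

    // Takes this object's monitor while registering, as an implementation might.
    synchronized String register(String sourceName) {
        return prefix + "." + sourceName;
    }
}

/** Hypothetical singleton holder illustrating the AtomicReference idea. */
enum SketchDefaultMetrics {
    INSTANCE;

    // The reference is read and swapped atomically, so no method here needs
    // to be synchronized and no holder lock is held while the impl's lock is taken.
    private final AtomicReference<SketchMetricsSystem> impl =
        new AtomicReference<>(new SketchMetricsSystem("default"));

    SketchMetricsSystem get() {
        return impl.get();
    }

    // Swaps in a new impl and returns the previous one.
    SketchMetricsSystem set(SketchMetricsSystem next) {
        return impl.getAndSet(next);
    }
}

class AtomicHolderDemo {
    public static void main(String[] args) {
        SketchMetricsSystem previous =
            SketchDefaultMetrics.INSTANCE.set(new SketchMetricsSystem("test"));
        System.out.println(previous.register("jvm"));
        System.out.println(SketchDefaultMetrics.INSTANCE.get().register("jvm"));
    }
}
```

        With a synchronized holder, one thread could hold the holder's lock and wait for the impl's lock while another thread does the reverse; with the atomic holder only the impl's lock is ever acquired in this sketch, so that ordering conflict cannot arise.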


          People

          • Assignee:
            Luke Lu
            Reporter:
            Todd Lipcon
          • Votes:
            0
            Watchers:
            1
