Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.23.0
    • Component/s: metrics
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Lock cycle detected by jcarder between MetricsSystemImpl and DefaultMetricsSystem
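To illustrate what a lock cycle of this kind means, here is a minimal sketch of a lock-order graph with a cycle check. It is not jcarder's implementation: `LockOrderGraph` and `LockCycleDemo` are hypothetical names, the recorder is single-threaded for brevity, and which code path takes which lock first is an assumption (the issue only states that the cycle is between the two classes).

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, single-threaded lock-order recorder (not jcarder's real design).
final class LockOrderGraph {
    // Edge a -> b means "b was acquired while a was held".
    private final Map<String, Set<String>> edges = new HashMap<>();
    private final Deque<String> held = new ArrayDeque<>();

    // Record an acquisition and add an edge from every lock currently held.
    void acquire(String lock) {
        for (String h : held) {
            if (!h.equals(lock)) {
                edges.computeIfAbsent(h, k -> new HashSet<>()).add(lock);
            }
        }
        held.push(lock);
    }

    // Release the most recently acquired lock.
    void release() {
        held.pop();
    }

    // A cycle exists if some lock can reach itself by following edges.
    boolean hasCycle() {
        for (String start : edges.keySet()) {
            if (reaches(start, start, new HashSet<>())) {
                return true;
            }
        }
        return false;
    }

    private boolean reaches(String node, String target, Set<String> seen) {
        for (String next : edges.getOrDefault(node, Collections.emptySet())) {
            if (next.equals(target)) {
                return true;
            }
            if (seen.add(next) && reaches(next, target, seen)) {
                return true;
            }
        }
        return false;
    }
}

final class LockCycleDemo {
    static LockOrderGraph record() {
        LockOrderGraph g = new LockOrderGraph();
        // Assumed path 1: facade lock first, then implementation lock.
        g.acquire("DefaultMetricsSystem");
        g.acquire("MetricsSystemImpl");
        g.release();
        g.release();
        // Assumed path 2: implementation lock first, then facade lock.
        g.acquire("MetricsSystemImpl");
        g.acquire("DefaultMetricsSystem");
        g.release();
        g.release();
        return g;
    }

    public static void main(String[] args) {
        System.out.println("cycle detected: " + record().hasCycle());
    }
}
```

The two paths never need to run concurrently for the cycle to show up in the graph, which is why a tool can report a potential deadlock that is unlikely to occur in practice.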

      1. metrics-deadlock.png
        49 kB
        Todd Lipcon
      2. hadoop-7529-v1.patch
        2 kB
        Luke Lu
      3. metrics-shutdown.png
        59 kB
        Todd Lipcon
      4. hadoop-7529-v2.patch
        2 kB
        Luke Lu

        Activity

        Todd Lipcon created issue -
        Todd Lipcon made changes -
        Field Original Value New Value
        Attachment metrics-deadlock.png [ 12489764 ]
        Luke Lu added a comment -

        Though it's unlikely to happen in practice (metrics system initialization usually doesn't overlap with metrics source registration), it's probably worth fixing. The v1 patch makes the metrics impl in DefaultMetricsSystem an AtomicReference.
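The idea of holding the implementation in an AtomicReference can be sketched as follows. This is not the contents of hadoop-7529-v1.patch: `Impl`, `Facade`, and their methods are hypothetical names, and the sketch only shows why reading or swapping the reference needs no monitor on the facade.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the metrics system implementation.
final class Impl {
    final String prefix;

    Impl(String prefix) {
        this.prefix = prefix;
    }
}

// Hypothetical facade, loosely modeled on the role of DefaultMetricsSystem.
final class Facade {
    private final AtomicReference<Impl> impl =
        new AtomicReference<>(new Impl("initial"));

    // Reads take no monitor, so a caller that already holds the
    // implementation's lock cannot close a lock-order cycle here.
    Impl current() {
        return impl.get();
    }

    // Atomically installs a new implementation and returns the previous one.
    Impl replace(Impl next) {
        return impl.getAndSet(next);
    }
}
```

With no `synchronized` methods on the facade, the facade contributes no lock of its own to the lock-order graph for these accessors.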

        Luke Lu made changes -
        Attachment hadoop-7529-v1.patch [ 12489786 ]
        Luke Lu made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Luke Lu made changes -
        Assignee Luke Lu [ vicaya ]
        Luke Lu added a comment -

        Though it's unlikely to happen in practice (metrics system initialization usually doesn't overlap with metrics source registration), it's probably worth fixing for posterity. The v1 patch makes the metrics impl in DefaultMetricsSystem an AtomicReference.

        Todd Lipcon added a comment -

        +1, I verified that with this patch, the jcarder cycle disappears. And the patch looks reasonable, though I don't know this area of the code well.

        Todd Lipcon added a comment -

        Actually, I take it back. It fixes the first cycle, but there's a second one, attached here.

        Todd Lipcon made changes -
        Attachment metrics-shutdown.png [ 12490150 ]
        Luke Lu made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Luke Lu added a comment -

        The v2 patch fixes the cycle in shutdown as well.

        Todd, do you have a wiki page on how to run jcarder on Hadoop?

        Luke Lu made changes -
        Attachment hadoop-7529-v2.patch [ 12490174 ]
        Luke Lu made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Todd Lipcon added a comment -

        Yep, the wiki is here: http://wiki.apache.org/hadoop/HowToUseJCarder
        Recently I've been using the more experimental "lockclasses" branch from my GitHub, which supports analyzing read-write locks, but the rest of the instructions should be the same. I'm working on getting access to the new build infrastructure so I can set up Hudson jobs to run jcarder on a regular basis.

        Luke Lu added a comment -

        "Yep, the wiki is here: http://wiki.apache.org/hadoop/HowToUseJCarder"

        Thanks. I was hoping for a simpler process or script to reproduce these graphs, though I agree with your point on the wiki that it's important to avoid false positives to reduce noise. These two cycles are already borderline and practically harmless because, unlike start/stop, init/shutdown doesn't overlap with registration.

        "I'm working on getting access to the new build infrastructure so I can set up hudson jobs to run jcarder on a regular basis"

        This will be a time saver. +1 and thanks!

        Todd Lipcon added a comment -

        +1, patch looks good to me. Thanks Luke!

        Luke Lu added a comment -

        Committed v2 to trunk. Thanks Todd for reviewing!

        Luke Lu made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #741 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/741/)
        HADOOP-7529. Fix lock cycles in metrics system. (llu)

        llu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157187
        Files :

        • /hadoop/common/trunk/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/DefaultMetricsSystem.java
        Arun C Murthy made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition                   Time In Source Status  Execution Times  Last Executer  Last Execution Date
        Patch Available -> Open      2d 20h 35m             1                Luke Lu        11/Aug/11 22:23
        Open -> Patch Available      2h 14m                 2                Luke Lu        11/Aug/11 22:26
        Patch Available -> Resolved  19h 33m                1                Luke Lu        12/Aug/11 17:59
        Resolved -> Closed           94d 7h 50m             1                Arun C Murthy  15/Nov/11 00:50

          People

          • Assignee:
            Luke Lu
            Reporter:
            Todd Lipcon
          • Votes:
            0
            Watchers:
            1
