  1. HBase
  2. HBASE-14420 Zombie Stomping Session
  3. HBASE-14772

Improve zombie detector; be more discerning

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: test
    • Labels: None

    Description

      Currently, any surefire process with the hbase flag is treated as a potential zombie. Our zombie check takes a reading and, if it finds candidate zombies, waits 30 seconds and then takes another reading. If a concurrent build is going on, the zombie detector comes up positive on both readings even though the adjacent test run may be making progress; i.e. the cast of surefire processes may have changed between readings, but our detector only sees that hbase surefire processes are present.

      Here is an example:

      Suspicious java process found - waiting 30s to see if there are just slow to stop
      There appear to be 5 zombie tests, they should have been killed by surefire but survived
      12823 surefirebooter852180186418035480.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
      7653 surefirebooter8579074445899448699.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
      12614 surefirebooter136529596936417090.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
      7836 surefirebooter3217047564606450448.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
      13566 surefirebooter2084039411151963494.jar -enableassertions -Dhbase.test -Xmx2800m -XX:MaxPermSize=256m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true
      ************ BEGIN zombies jstack extract
      ************ END  zombies jstack extract
      

      5 is the number of forked processes we allow when doing medium and large tests, so an adjacent build will always show up as '5 zombies'.

      The detector needs to discern whether the list of processes changes between readings; only processes that are present in both readings should count as zombies.
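      The comparison between readings can be sketched as follows. This is a minimal illustration, not the code from the attached patches: it assumes each reading is a text listing in the "PID command line" shape shown in the example above, and the function names parse_reading and persistent_zombies are hypothetical.

```python
from __future__ import annotations


def parse_reading(listing: str) -> dict[int, str]:
    """Map PID -> command line for hbase surefire forks in one reading.

    Assumes one process per line in the form "<pid> <command line>",
    as in the example output above.
    """
    procs: dict[int, str] = {}
    for line in listing.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip banners, blank lines, and anything unparseable
        pid, cmd = int(parts[0]), parts[1]
        if "surefirebooter" in cmd and "-Dhbase.test" in cmd:
            procs[pid] = cmd
    return procs


def persistent_zombies(first: dict[int, str], second: dict[int, str]) -> dict[int, str]:
    """Keep only processes seen in both readings with the same command line.

    Comparing the command line as well as the PID guards against a PID
    being reused by a new fork between the two readings.
    """
    return {pid: cmd for pid, cmd in second.items() if first.get(pid) == cmd}
```

      With this check, an adjacent build whose forks turn over between readings no longer counts as a set of zombies; only a process that survives the whole wait does.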

      Can I also add a tag per build run that all forked processes pick up, so I can look at the current build's progeny only?
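      One way the per-build tag could work is sketched below. Everything here is an assumption for illustration: the system property name hbase.build.id is hypothetical, as are the function names, and the sketch presumes the tag reaches every forked JVM's command line (for example through the surefire argLine setting).

```python
from __future__ import annotations


def build_tag(build_id: str) -> str:
    """Return the JVM argument that marks a fork as belonging to a build.

    The property name 'hbase.build.id' is a made-up placeholder.
    """
    return f"-Dhbase.build.id={build_id}"


def own_progeny(procs: dict[int, str], build_id: str) -> dict[int, str]:
    """Filter a PID -> command line map down to this build's forks.

    Matches the tag as a whole argument so that build id '42' does not
    also match build id '421'.
    """
    tag = build_tag(build_id)
    return {pid: cmd for pid, cmd in procs.items() if tag in cmd.split()}
```

      Filtering on the tag before counting means forks from a concurrent build on the same machine are never considered as zombie candidates in the first place.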

      Attachments

        1. zombie.patch
          6 kB
          Michael Stack
        2. zombiev2.patch
          6 kB
          Michael Stack
        3. 14772v3.patch
          4 kB
          Michael Stack

          People

            Assignee: Michael Stack (stack)
            Reporter: Michael Stack (stack)
            Votes: 0
            Watchers: 3
