  1. UIMA
  2. UIMA-3685

DUCC's rogue process detector not reporting JPs parented by init (1)

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0-Ducc
    • Fix Version/s: 1.1.0-Ducc
    • Component/s: DUCC
    • Labels: None

    Description

      It has been observed that a job process (JP) launched by DUCC hung while writing out its core dump because the quota had been exceeded. The process was still alive, blocked in write().

      The core dump caused the change in process ownership. The OS changed the owner from <user> to init(1). The process still had its cgroup intact as it was still running.

      While looking for rogue processes, the rogue process detector checks whether a process belongs to a cgroup. If it does, the detector assumes that the process is valid and not rogue.

      The detector should not check whether the process belongs to a cgroup when determining whether it is rogue. Any process that does not have ducc as its ancestor should be treated as rogue and reported as such for subsequent cleanup. The exception is processes belonging to users with reservations on the node.
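
      The proposed rule can be illustrated with a minimal sketch. This is not DUCC's actual implementation (DUCC is written in Java); the names Proc, has_ancestor, find_rogues, reserved_users and ignored_users are hypothetical, and the process table is assumed to have been collected already (for example from ps or /proc). Skipping system accounts such as root is also an assumption added so that init itself is not reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    pid: int
    ppid: int
    user: str
    in_cgroup: bool = False  # recorded, but deliberately ignored by the check


def has_ancestor(pid, procs, ancestor_pid):
    """Follow parent links upward from pid; True if ancestor_pid is reached."""
    seen = set()
    while pid in procs and pid not in seen:
        if pid == ancestor_pid:
            return True
        seen.add(pid)
        pid = procs[pid].ppid
    return False


def find_rogues(procs, ducc_pid, reserved_users, ignored_users=frozenset({"root"})):
    """Return sorted pids that do not descend from the ducc process.

    Users holding a reservation on the node are exempt. ignored_users is a
    hypothetical filter for system accounts, not something the issue specifies.
    """
    rogues = []
    for p in procs.values():
        if p.user in reserved_users or p.user in ignored_users:
            continue
        if not has_ancestor(p.pid, procs, ducc_pid):
            rogues.append(p.pid)
    return sorted(rogues)


# Hypothetical process table for one node.
procs = {
    1: Proc(1, 0, "root"),                        # init
    100: Proc(100, 1, "ducc"),                    # ducc process on the node
    200: Proc(200, 100, "alice", in_cgroup=True), # healthy JP, child of ducc
    300: Proc(300, 1, "bob", in_cgroup=True),     # hung JP, now parented by init
    400: Proc(400, 1, "carol"),                   # user with a reservation
}

print(find_rogues(procs, ducc_pid=100, reserved_users={"carol"}))  # [300]
```

      Process 300 is still in its cgroup, yet it is reported because the decision rests only on ancestry and the reservation exemption.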

          People

            Assignee: Jaroslaw Cwiklik (cwiklik)
            Reporter: Jaroslaw Cwiklik (cwiklik)