  1. Tajo
  2. TAJO-1507

Resource leak when a worker does not respond with a KILLED message

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: Resource Manager, Worker
    • Labels:
      None

      Description

      Terminology

      • QM - query master
      • TM - tajo master

      The query kill mechanism is as follows:

      • Client sends kill command to TM.
      • TM forwards the kill command to QM.
      • QM disseminates the kill command to all workers.
      • The corresponding workers kill their tasks and respond with KILLED to QM.

      However, some workers cannot send the KILLED message to QM because of a node failure, a temporary network problem, or a worker restart. In this case, TM cannot reclaim the allocated resources even after the workers return to normal.
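
      The leak described above can be illustrated with a small, self-contained model. This is only a sketch and not Tajo code: the class names (Worker, TajoMaster, QueryMaster), the slot counts, and the assumption that resources are released only when a KILLED reply arrives are all hypothetical, chosen to mirror the four steps listed in this issue.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KillLeakSketch {

    /** Hypothetical stand-in for a worker; `reachable` models node and network health. */
    static final class Worker {
        final String id;
        boolean reachable = true;

        Worker(String id) { this.id = id; }

        /** Returns true only if the KILLED reply actually reaches the QM. */
        boolean kill() { return reachable; }
    }

    /** Hypothetical TM-side bookkeeping: resource slots allocated per worker. */
    static final class TajoMaster {
        private final Map<String, Integer> allocated = new HashMap<>();

        void allocate(String workerId, int slots) {
            allocated.merge(workerId, slots, Integer::sum);
        }

        void release(String workerId) { allocated.remove(workerId); }

        int allocatedSlots() {
            return allocated.values().stream().mapToInt(Integer::intValue).sum();
        }
    }

    /** Hypothetical QM: disseminates the kill and releases only on a KILLED reply. */
    static final class QueryMaster {
        private final TajoMaster tm;
        private final List<Worker> workers;

        QueryMaster(TajoMaster tm, List<Worker> workers) {
            this.tm = tm;
            this.workers = workers;
        }

        void killQuery() {
            for (Worker w : workers) {
                if (w.kill()) {
                    tm.release(w.id);
                }
                // Assumption: with no reply, nothing else releases this worker's slots.
            }
        }
    }

    /** Runs one kill round and returns the slots still allocated afterwards. */
    static int leakedSlots(boolean secondWorkerReachable) {
        TajoMaster tm = new TajoMaster();
        Worker w1 = new Worker("worker-1");
        Worker w2 = new Worker("worker-2");
        tm.allocate(w1.id, 2);
        tm.allocate(w2.id, 3);

        w2.reachable = secondWorkerReachable;
        new QueryMaster(tm, List.of(w1, w2)).killQuery();

        // The worker recovers, but the kill is never retried, so nothing changes.
        w2.reachable = true;
        return tm.allocatedSlots();
    }

    public static void main(String[] args) {
        System.out.println("all workers reply:    " + leakedSlots(true));  // prints 0
        System.out.println("one worker is silent: " + leakedSlots(false)); // prints 3
    }
}
```

      In this model the three slots of the silent worker stay allocated after it recovers, because the release is tied to a reply that was never delivered.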

            People

            • Assignee:
              Unassigned
            • Reporter:
              Hyunsik Choi (hyunsik)