  1. Aurora
  2. AURORA-698

aurora executor _shutdown deadline calls should be daemonized


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: Executor
    • Labels: None
    • Sprint: Twitter Aurora Q2'15 Sprint 3, Twitter Aurora Q2'15 Sprint 6
    • Story Points: 1

    Description

      In the Aurora executor's _shutdown method, we have two deadline() calls:

        def _shutdown(self, status_result):
          runner_status = self._runner.status
      
          try:
            deadline(self._runner.stop, timeout=self.STOP_TIMEOUT)
          except Timeout:
            log.error('Failed to stop runner within deadline.')
      
          try:
            deadline(self._chained_checker.stop, timeout=self.STOP_TIMEOUT)
          except Timeout:
            log.error('Failed to stop all checkers within deadline.')
      
          # If the runner was alive when _shutdown was called, defer to the status_result,
          # otherwise the runner's terminal state is the preferred state.
          exit_status = runner_status or status_result
      
          self.send_update(
              self._driver,
              self._task_id,
              exit_status.status,
              status_result.reason)
      
          self.terminated.set()
          defer(self._driver.stop, delay=self.PERSISTENCE_WAIT)
      

      However, if runner.stop fails with a Timeout exception, the AnonymousThread spawned by deadline() is not daemonized, so it keeps the executor process from exiting. As a result, the cgroup is not torn down, and if runner.stop actually failed, the task process can stay alive even though TASK_KILLED was delivered.
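
      To illustrate why daemonizing matters, here is a minimal sketch of a deadline-style helper built only on the standard library. It is not the deadline() implementation the executor uses; run_with_deadline, its daemon parameter, and the local Timeout class are hypothetical names chosen for this example. The point is that a non-daemon helper thread stuck in a stop call keeps the interpreter alive after the timeout fires, while a daemon thread does not.

```python
import threading
import time


class Timeout(Exception):
    """Stand-in for the Timeout raised when a call misses its deadline."""


def run_with_deadline(fn, timeout, daemon=True):
    """Run fn in a helper thread and wait at most `timeout` seconds for it.

    Hypothetical helper for illustration only. With daemon=True, a helper
    thread that never returns cannot block interpreter exit.
    """
    result = {}

    def target():
        result['value'] = fn()

    worker = threading.Thread(target=target)
    worker.daemon = daemon  # must be set before start()
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        # The call is still running; give up waiting and report the miss.
        raise Timeout('call did not finish within %s seconds' % timeout)
    return result.get('value')


def stuck_stop():
    # Simulates a runner.stop that hangs well past the deadline.
    time.sleep(5)


try:
    run_with_deadline(stuck_stop, timeout=0.05)
except Timeout:
    print('Failed to stop runner within deadline.')
```

      With daemon=False in the call above, the process would linger until stuck_stop returned, which mirrors the executor failing to exit. Note that this sketch does not propagate exceptions raised inside fn.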

          People

            wickman Brian Wickman