Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-7052

RM SchedulingMonitor gives no indication why the spawned thread crashed.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • None
    • 2.9.0, 3.0.0-beta1, 2.8.2
    • yarn
    • None
    • Reviewed

    Description

      In YARN-7051, we ran into a case where the preemption monitor thread hung with no indication of why.

      The preemption monitor is started by the SchedulingExecutorService from SchedulingMonitor#serviceStart. Once an uncaught throwable happens, nothing ever gets the result of the future, the thread running the preemption monitor never dies, and it never gets rescheduled.

      If HadoopExecutor were used, it would at least provide a HadoopScheduledThreadPoolExecutor that logs the exception if one happens.

      Attachments

        1. YARN-7052.001.patch
          1 kB
          Eric Payne

        Activity

          People

            epayne Eric Payne
            epayne Eric Payne
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: