Hadoop YARN / YARN-7052

RM SchedulingMonitor gives no indication why the spawned thread crashed.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-beta1, 2.8.2
    • Component/s: yarn
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      In YARN-7051, we ran into a case where the preemption monitor thread hung with no indication of why.

      The preemption monitor is started by the ScheduledExecutorService created in SchedulingMonitor#serviceStart. If the monitor throws an uncaught Throwable, nothing ever retrieves the result of the returned future, so the failure is never reported; the thread that was running the preemption monitor never dies, and the monitor is never rescheduled.
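
The failure mode described above can be reproduced with the plain JDK, without any Hadoop classes. The following is a minimal sketch, not the SchedulingMonitor code: the class name SilentMonitorFailure, the method countRuns, and the simulated exception are hypothetical names chosen for illustration. It schedules a periodic task that always throws and shows that the task runs once, is never run again, and that the exception only surfaces if something calls get() on the future.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it stands in for the preemption monitor setup.
public class SilentMonitorFailure {

    /** Schedules a task that always throws and returns how often it ran. */
    static int countRuns(long observeMillis) throws Exception {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        try {
            ScheduledFuture<?> handle = ses.scheduleAtFixedRate(() -> {
                runs.incrementAndGet();
                // Stand-in for an uncaught throwable inside the monitor.
                throw new IllegalStateException("simulated monitor failure");
            }, 0, 10, TimeUnit.MILLISECONDS);

            // Nothing is printed or logged while we wait, and the task is
            // not run again even though its period is 10 ms.
            Thread.sleep(observeMillis);

            try {
                // Only an explicit get() reveals what happened.
                handle.get(1, TimeUnit.SECONDS);
            } catch (ExecutionException e) {
                System.out.println("Only visible via get(): " + e.getCause());
            }
        } finally {
            ses.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("runs = " + countRuns(200));
    }
}
```

In this sketch the counter stays at 1 because scheduleAtFixedRate suppresses all subsequent executions once one execution throws, which matches the "never gets rescheduled" symptom in this report.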

      If HadoopExecutors were used instead, it would at least provide a HadoopScheduledThreadPoolExecutor, which logs the exception if one occurs.
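
To illustrate what "logs the exception" could look like, here is a hedged JDK-only sketch of a scheduled executor that reports task failures from afterExecute. It is written in the spirit of the suggestion above and is not the actual HadoopScheduledThreadPoolExecutor source; the class name LoggingScheduledExecutor, the seen list, and the helper runFailingTask are hypothetical, and System.err stands in for a real logger.

```java
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical class: a scheduled executor that surfaces task failures.
public class LoggingScheduledExecutor extends ScheduledThreadPoolExecutor {

    // Failures observed so far; kept only so the sketch can be checked.
    final List<Throwable> seen = new CopyOnWriteArrayList<>();

    public LoggingScheduledExecutor(int corePoolSize) {
        super(corePoolSize);
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        // Scheduled tasks are wrapped in a Future, so the throwable is
        // stored inside it instead of being passed in as t. A periodic
        // task is only "done" once it has failed or been cancelled.
        if (t == null && r instanceof Future<?> && ((Future<?>) r).isDone()) {
            try {
                ((Future<?>) r).get();
            } catch (ExecutionException e) {
                t = e.getCause();
            } catch (CancellationException e) {
                // Cancelled tasks are not treated as failures here.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        if (t != null) {
            seen.add(t);
            System.err.println("Scheduled task failed: " + t);
        }
    }

    /** Schedules an always-failing task and returns the reported failures. */
    static List<Throwable> runFailingTask() throws InterruptedException {
        LoggingScheduledExecutor ses = new LoggingScheduledExecutor(1);
        try {
            ses.scheduleAtFixedRate(() -> {
                throw new IllegalStateException("simulated monitor failure");
            }, 0, 10, TimeUnit.MILLISECONDS);
            // Poll for up to about 2 seconds for the failure to be reported.
            for (int i = 0; i < 200 && ses.seen.isEmpty(); i++) {
                Thread.sleep(10);
            }
        } finally {
            ses.shutdownNow();
        }
        return ses.seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reported failures: " + runFailingTask().size());
    }
}
```

The task is still not rescheduled after it fails, so an approach like this only addresses the missing diagnostic, which is the scope of this issue.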

            People

            • Assignee: Eric Payne (eepayne)
            • Reporter: Eric Payne (eepayne)