Hadoop YARN / YARN-270

RM scheduler event handler thread gets behind

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.23.5
    • Fix Version/s: None
    • Component/s: resourcemanager
    • Labels: None

    Description

      We had a couple of incidents on a 2800-node cluster where the RM scheduler event handler thread fell behind in processing events and the RM became basically unusable. It was still processing apps, but it took a long time (1 hour 45 minutes) to accept new apps. This happened twice within 5 days.

      We are using the capacity scheduler and at the time had between 400 and 500 applications running. Another 250 apps were in the SUBMITTED state in the RM, but the scheduler had not yet processed them to move them into the pending state. We had about 15 queues, none of them hierarchical. We also had plenty of space left on the cluster.
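
      The symptom described above (events accepted but handled only after a long delay) is what one would expect if events arrive faster than a single handler thread can process them. This report does not identify a root cause, so the following is only a toy model of that failure mode and not YARN code. The function name `simulate_single_handler` and all rates are hypothetical and are not measurements from the cluster.

```python
from collections import deque


def simulate_single_handler(arrivals_per_tick, handled_per_tick, ticks):
    """Toy model of one handler thread draining a FIFO event queue.

    All parameters are hypothetical; none is a measured value from the
    cluster in this report. Returns a list with one
    (backlog, oldest_event_age_in_ticks) pair per simulated tick.
    """
    queue = deque()  # each entry is the tick at which that event was enqueued
    history = []
    for tick in range(ticks):
        # Producers (for example heartbeats or app submissions) enqueue events.
        queue.extend([tick] * arrivals_per_tick)
        # The single handler dequeues at most handled_per_tick events per tick.
        for _ in range(min(handled_per_tick, len(queue))):
            queue.popleft()
        oldest_age = tick - queue[0] if queue else 0
        history.append((len(queue), oldest_age))
    return history


if __name__ == "__main__":
    keeping_up = simulate_single_handler(90, 100, 60)
    falling_behind = simulate_single_handler(110, 100, 60)
    print("keeping up, final (backlog, oldest age):    ", keeping_up[-1])
    print("falling behind, final (backlog, oldest age):", falling_behind[-1])
```

      In this model, once the arrival rate exceeds the handling rate the backlog grows every tick and never drains, so a newly submitted event waits behind everything already queued. That would match the long delay before new apps were accepted, but it remains an assumption about this incident.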

            People

              Assignee: Thomas Graves (tgraves)
              Reporter: Thomas Graves (tgraves)
              Votes: 0
              Watchers: 22