  1. Hadoop YARN
  2. YARN-5964

Lower the granularity of locks in FairScheduler


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: fairscheduler
    • Labels: None
    • Environment: CentOS-7.1

    • This issue is a duplicate. Please refer to YARN-3091.

    Description

      When too many applications are running, we found that clients could not submit applications and that the callqueuelength of port 8032 was high. I captured a jstack of the ResourceManager while the callqueuelength was high. I found that the "IPC Server handler xxx on 8032" threads were waiting for the object lock of FairScheduler, while nodeUpdate was holding that lock. The long processing time under the lock may be why clients cannot submit applications.
      Here I do not address the problem that clients cannot submit applications; I only evaluate the performance of the FairScheduler. Too many methods require the object lock, so the granularity of the lock is too coarse. For example, nodeUpdate and getAppWeight both need to hold the same object lock. This is unreasonable and inefficient. I recommend replacing the current lock with finer-grained locks.
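
The idea of finer-grained locking can be illustrated with a minimal sketch. This is not the FairScheduler code: the class `SchedulerSketch`, its fields, and the method signatures below are hypothetical and only borrow the names nodeUpdate and getAppWeight from the description. It assumes that node state and application weights can be guarded independently, which may not hold for the real scheduler.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a scheduler; NOT the real FairScheduler. */
class SchedulerSketch {
    // Guards node state only (assumption: node state is independent of weights).
    private final ReentrantLock nodeLock = new ReentrantLock();
    // Guards application weights; many readers may hold the read lock at once.
    private final ReentrantReadWriteLock weightLock = new ReentrantReadWriteLock();

    private final Map<String, Integer> availableMb = new HashMap<>();
    private final Map<String, Double> appWeights = new HashMap<>();

    /** Heartbeat path: takes only the node lock, not a scheduler-wide lock. */
    void nodeUpdate(String nodeId, int freeMb) {
        nodeLock.lock();
        try {
            availableMb.put(nodeId, freeMb);
        } finally {
            nodeLock.unlock();
        }
    }

    /** Reads node state under the node lock; returns 0 for unknown nodes. */
    int getAvailableMb(String nodeId) {
        nodeLock.lock();
        try {
            return availableMb.getOrDefault(nodeId, 0);
        } finally {
            nodeLock.unlock();
        }
    }

    /** Write path for weights: exclusive, but does not touch the node lock. */
    void setAppWeight(String appId, double weight) {
        weightLock.writeLock().lock();
        try {
            appWeights.put(appId, weight);
        } finally {
            weightLock.writeLock().unlock();
        }
    }

    /** Read path: shared read lock, so it never waits on nodeUpdate. */
    double getAppWeight(String appId) {
        weightLock.readLock().lock();
        try {
            return appWeights.getOrDefault(appId, 1.0);
        } finally {
            weightLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        SchedulerSketch s = new SchedulerSketch();
        s.setAppWeight("app_1", 2.0);
        s.nodeUpdate("node_1", 4096);
        System.out.println(s.getAppWeight("app_1"));
    }
}
```

In this sketch a slow nodeUpdate blocks only other callers that need the node lock, while callers of getAppWeight proceed under the separate read lock. With a single object lock (every method declared `synchronized`), both calls would queue behind each other, which is the contention described above.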



          People

            Assignee: Unassigned
            Reporter: Chenyu Zheng (zhengchenyu)
            Votes: 0
            Watchers: 8

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 2m
                Remaining: 2m
                Logged: Not Specified
