  1. Hadoop YARN
  2. YARN-5964

Lower the granularity of locks in FairScheduler


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: fairscheduler
    • Labels:
      None
    • Environment:

      CentOS-7.1

    • Target Version/s:
    • Release Note:
      This issue is a duplicate of YARN-3091; please follow that issue instead.

      Description

      When too many applications are running, we found that clients could not submit applications and the call queue length on port 8032 grew very high. I captured a jstack of the ResourceManager while the call queue length was high and found that the "IPC Server handler xxx on 8032" threads were all waiting for the FairScheduler object lock, which nodeUpdate was holding. The long processing time inside the scheduler likely explains why clients could not submit applications.
      Here I do not address the submission failure itself, only the performance of the FairScheduler. Too many methods require the scheduler's object lock, so the lock granularity is too coarse. For example, nodeUpdate and getAppWeight contend for the same object lock, which is unreasonable and inefficient. I recommend replacing the current coarse lock with finer-grained locks.
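      One way the finer-grained locking suggested above could look is a read/write lock, so that read-only calls such as getAppWeight no longer serialize behind long-running nodeUpdate work. The sketch below is a hypothetical illustration using java.util.concurrent.locks.ReentrantReadWriteLock, not the actual FairScheduler code; the class and field names are invented for the example.

      ```java
      import java.util.HashMap;
      import java.util.Map;
      import java.util.concurrent.locks.ReentrantReadWriteLock;

      // Hypothetical sketch: instead of one coarse object lock (synchronized
      // methods), state is guarded by a ReentrantReadWriteLock so that many
      // readers can proceed concurrently while writers get exclusive access.
      class FineGrainedScheduler {
          private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
          private final Map<String, Double> appWeights = new HashMap<>();

          // Write path (e.g. a node heartbeat): mutates scheduler state,
          // so it takes the exclusive write lock.
          void nodeUpdate(String appId, double weight) {
              lock.writeLock().lock();
              try {
                  appWeights.put(appId, weight);
              } finally {
                  lock.writeLock().unlock();
              }
          }

          // Read path: any number of reader threads may hold the read lock
          // at the same time, so reads no longer queue behind each other.
          double getAppWeight(String appId) {
              lock.readLock().lock();
              try {
                  return appWeights.getOrDefault(appId, 1.0);
              } finally {
                  lock.readLock().unlock();
              }
          }
      }
      ```

      With synchronized methods, a slow nodeUpdate blocks every getAppWeight caller; with the read/write split, only writers exclude readers, which is the granularity reduction the report asks for. (YARN-3091, which this issue duplicates, pursued the same direction.)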


    People

    • Assignee:
      Unassigned
    • Reporter:
      zhengchenyu zhengchenyu
    • Votes:
      0
    • Watchers:
      8

    Dates

    • Created:
    • Updated:
    • Resolved:

    Time Tracking

    • Estimated: 2m
    • Remaining: 2m
    • Logged: Not Specified