  1. Hadoop YARN
  2. YARN-3091 [Umbrella] Improve and fix locks of RM scheduler
  3. YARN-4416

Deadlock due to synchronised get Methods in AbstractCSQueue




      While debugging in Eclipse, I needed to find the name of a queue, but every time I tried to inspect the queue the debugger hung. The stack trace showed a deadlock, but on analysis it turned out to be caused only by queue.toString() being invoked during debugging, because AbstractCSQueue.getAbsoluteUsedCapacity was synchronized.
      Hence we need to ensure the following:

      1. queueCapacity and resource-usage have their own read/write locks, so synchronization is not required.
      2. numContainers is volatile, so synchronization is not required.
      3. A read/write lock could be added to the Ordering Policy. Read operations don't need to be synchronized, so getNumApplications doesn't need to be synchronized.
        (The first two will be handled in this JIRA; the third will be handled in YARN-4443.)
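
      The first two points can be illustrated with a minimal sketch. It is not the actual AbstractCSQueue code or the attached patch: the class SketchQueue, its fields, and the methods setAbsoluteUsedCapacity and allocateContainer are hypothetical names chosen for illustration. It assumes the capacity state is already guarded by its own read/write lock and that the container count is only modified under the queue's monitor, so the getters, and therefore toString(), never need that monitor.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a scheduler queue; NOT the real AbstractCSQueue.
class SketchQueue {
    private final String queueName;

    // Capacity state is guarded by its own read/write lock, so reading it
    // does not require the queue's monitor (point 1).
    private final ReentrantReadWriteLock capacityLock = new ReentrantReadWriteLock();
    private float absoluteUsedCapacity;

    // volatile makes the latest written value visible to readers without
    // synchronization (point 2).
    private volatile int numContainers;

    SketchQueue(String queueName) {
        this.queueName = queueName;
    }

    // Unsynchronized getter: takes only the capacity read lock.
    float getAbsoluteUsedCapacity() {
        capacityLock.readLock().lock();
        try {
            return absoluteUsedCapacity;
        } finally {
            capacityLock.readLock().unlock();
        }
    }

    void setAbsoluteUsedCapacity(float value) {
        capacityLock.writeLock().lock();
        try {
            absoluteUsedCapacity = value;
        } finally {
            capacityLock.writeLock().unlock();
        }
    }

    // Unsynchronized getter: a plain volatile read.
    int getNumContainers() {
        return numContainers;
    }

    // The increment is a read-modify-write, so writers still serialize on
    // the queue's monitor; only the readers go lock-free.
    synchronized void allocateContainer() {
        numContainers++;
    }

    // toString() calls only unsynchronized getters, so it does not block
    // when another thread holds the queue's monitor.
    @Override
    public String toString() {
        return queueName
            + ": absoluteUsedCapacity=" + getAbsoluteUsedCapacity()
            + ", numContainers=" + getNumContainers();
    }
}
```

      With getters shaped like this, a debugger evaluating queue.toString() would not have to wait for the queue's monitor. It can still wait briefly on the capacity read lock while a writer holds the write lock.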


        1. deadlock.log
          161 kB
          Naganarasimha G R
        2. YARN-4416.v1.001.patch
          12 kB
          Naganarasimha G R
        3. YARN-4416.v1.002.patch
          13 kB
          Naganarasimha G R
        4. YARN-4416.v2.001.patch
          5 kB
          Naganarasimha G R
        5. YARN-4416.v2.002.patch
          5 kB
          Naganarasimha G R
        6. YARN-4416.v2.003.patch
          5 kB
          Naganarasimha G R



            • Assignee:
              Naganarasimha G R
            • Votes:
              0
            • Watchers:
              7


              • Created: