The existing YARN RM scheduler has some issues with how it uses locks. For example:
- Many unnecessary synchronized locks. We have recently seen several cases where overly frequent access to the scheduler makes it hang. This could be addressed by using read/write locks. Affected components include the scheduler, CS queues, and apps.
- Some fields are not properly locked (e.g. clusterResource).
We can address them together in this ticket.
(See the comments below for more details.)
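
As a rough illustration of the read/write lock idea, here is a minimal sketch, not the actual scheduler code. The class name SchedulerStateSketch and its methods are hypothetical, and clusterResource is simplified to a single memory value in MB. Reads take the shared read lock so that frequent readers do not block each other; updates take the exclusive write lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for scheduler state; NOT the real YARN scheduler class.
class SchedulerStateSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final ReentrantReadWriteLock.ReadLock readLock = lock.readLock();
    private final ReentrantReadWriteLock.WriteLock writeLock = lock.writeLock();

    // Assumption: the cluster resource is reduced to memory in MB for brevity.
    private long clusterMemoryMb;

    SchedulerStateSketch(long initialMemoryMb) {
        this.clusterMemoryMb = initialMemoryMb;
    }

    // Read path: many threads may hold the read lock at the same time.
    long getClusterMemoryMb() {
        readLock.lock();
        try {
            return clusterMemoryMb;
        } finally {
            readLock.unlock();
        }
    }

    // Write path: exclusive, blocks readers and other writers while updating.
    void addNodeMemory(long memoryMb) {
        writeLock.lock();
        try {
            clusterMemoryMb += memoryMb;
        } finally {
            writeLock.unlock();
        }
    }
}
```

Compared with marking both methods synchronized, concurrent readers no longer serialize on a single monitor, and the field is never read or written without holding a lock.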