Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
HADOOP-9640 added the RPC Fair Call Queue and HADOOP-10597 added RPC backoff to ensure fair usage of HDFS namenode resources. YARN, the Hadoop cluster resource manager, currently manages CPU and memory resources for jobs and tasks, but it does not directly manage storage resources such as HDFS namenode and datanode usage. As a result, a high-priority YARN job may send too many RPC requests to the HDFS namenode and get demoted into low-priority call queues, because there is no reservation or coordination between the two systems.
To better support multi-tenancy use cases like the one above, we propose to manage RPC server resource usage via a coupon mechanism integrated with YARN. The idea is to allow YARN to request an HDFS storage resource coupon (e.g., for namenode RPC calls or datanode I/O bandwidth) from the namenode on behalf of the job at submission time. Once the coupon is granted, the tasks include the coupon identifier in the RPC header of their subsequent calls. The HDFS namenode RPC scheduler maintains the state of the coupon usage according to the scheduler policy (fairness or priority), so that the RPC priority matches the YARN scheduling priority.
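To make the coupon flow concrete ahead of the detailed proposal, here is a minimal sketch of the bookkeeping the namenode side might do. It is not Hadoop code: `CouponSketch`, `CouponRegistry`, `grant`, `priorityFor`, and `LOWEST_PRIORITY` are all hypothetical names. The budget is counted in RPC calls only, and falling back to the lowest-priority queue once a coupon is missing or exhausted is an assumption; the actual policy (fairness or priority) is still to be defined.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only; every name here is hypothetical, not a Hadoop API.
class CouponSketch {

    /** A granted budget of namenode RPC calls tied to a call-queue priority. */
    static final class Coupon {
        final String id;
        final int priorityLevel;          // 0 = highest-priority call queue
        final AtomicLong remainingCalls;  // usage state kept by the scheduler

        Coupon(String id, int priorityLevel, long rpcCallBudget) {
            this.id = id;
            this.priorityLevel = priorityLevel;
            this.remainingCalls = new AtomicLong(rpcCallBudget);
        }
    }

    /** Namenode-side state: grants coupons and maps each RPC to a priority. */
    static final class CouponRegistry {
        // Assumption: calls without a valid coupon go to the lowest queue.
        static final int LOWEST_PRIORITY = 3;

        private final Map<String, Coupon> coupons = new ConcurrentHashMap<>();

        /** Called on behalf of a job at submission time; returns the coupon id. */
        String grant(int priorityLevel, long rpcCallBudget) {
            String id = UUID.randomUUID().toString();
            coupons.put(id, new Coupon(id, priorityLevel, rpcCallBudget));
            return id;
        }

        /** Called per RPC with the coupon id read from the RPC header. */
        int priorityFor(String couponId) {
            Coupon coupon = (couponId == null) ? null : coupons.get(couponId);
            if (coupon == null) {
                return LOWEST_PRIORITY;
            }
            while (true) {
                long remaining = coupon.remainingCalls.get();
                if (remaining <= 0) {
                    return LOWEST_PRIORITY;  // budget exhausted (assumed policy)
                }
                // Consume one call from the budget atomically.
                if (coupon.remainingCalls.compareAndSet(remaining, remaining - 1)) {
                    return coupon.priorityLevel;
                }
            }
        }
    }

    public static void main(String[] args) {
        CouponRegistry registry = new CouponRegistry();
        String couponId = registry.grant(0, 2);  // high priority, 2 RPC calls
        for (int i = 0; i < 3; i++) {
            System.out.println("call " + i + " -> priority "
                    + registry.priorityFor(couponId));
        }
    }
}
```

In this sketch the first two calls carrying the coupon are served at the granted priority and the third falls back to the lowest queue. A real design would also need coupon expiry, renewal, and datanode I/O bandwidth accounting, none of which are modeled here.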
I will post a proposal with more detail shortly.