Hadoop Common / HADOOP-13128

Manage Hadoop RPC resource usage via resource coupon

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      HADOOP-9640 added the RPC Fair Call Queue and HADOOP-10597 added RPC backoff to ensure fair usage of HDFS namenode resources. YARN, the Hadoop cluster resource manager, currently manages CPU and memory resources for jobs/tasks but does not directly manage storage resources such as HDFS namenode and datanode usage. As a result, a high-priority YARN job may send too many RPC requests to the HDFS namenode and get demoted into low-priority call queues due to the lack of reservation/coordination.

      To better support multi-tenancy use cases like the one above, we propose to manage RPC server resource usage via a coupon mechanism integrated with YARN. The idea is to allow YARN to request an HDFS storage resource coupon (e.g., namenode RPC calls, datanode I/O bandwidth) from the namenode on behalf of the job at submission time. Once the coupon is granted, the tasks include the coupon identifier in the RPC header of subsequent calls. The HDFS namenode RPC scheduler maintains the state of coupon usage based on the scheduler policy (fairness or priority) to match the RPC priority with the YARN scheduling priority.
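The coupon flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed design: the names `Coupon`, `CouponLedger`, `grant`, and `priority_for` are hypothetical, a coupon is modeled as a simple budget of namenode RPC calls, and the fallback priority stands in for whatever level the existing scheduler would otherwise assign.

```python
from dataclasses import dataclass


@dataclass
class Coupon:
    """Hypothetical coupon: a budget of namenode RPC calls granted to one job."""
    coupon_id: str
    granted_calls: int
    used_calls: int = 0

    @property
    def remaining(self) -> int:
        return self.granted_calls - self.used_calls


class CouponLedger:
    """Hypothetical server-side state that tracks coupon usage."""

    def __init__(self, reserved_priority=0, fallback_priority=3):
        self._coupons = {}
        # Assumption: lower number means higher priority.
        self._reserved_priority = reserved_priority
        self._fallback_priority = fallback_priority

    def grant(self, coupon_id, calls):
        """Record a coupon requested on behalf of a job at submission time."""
        coupon = Coupon(coupon_id, calls)
        self._coupons[coupon_id] = coupon
        return coupon

    def priority_for(self, coupon_id):
        """Charge one call to the coupon named in the RPC header.

        Calls with no coupon, an unknown coupon, or an exhausted coupon
        get the fallback priority instead of the reserved one.
        """
        coupon = self._coupons.get(coupon_id)
        if coupon is None or coupon.remaining <= 0:
            return self._fallback_priority
        coupon.used_calls += 1
        return self._reserved_priority
```

In this sketch a job granted two calls would see its first two RPCs scheduled at the reserved priority and any later call treated like an ordinary request.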

      I will post a proposal with more detail shortly.

        Activity

        xyao Xiaoyu Yao added a comment -

        Attach a draft proposal for discussion.

        xyao Xiaoyu Yao added a comment -

        Create HADOOP-13128 branch for resource token related development.

        shv Konstantin Shvachko added a comment -

        Will this also affect WebHDFS clients, or is it limited to RPCs only?
        In my experience, HTTP clients can be just as "aggressive" as RPC ones.

        ywskycn Wei Yan added a comment -

        Xiaoyu Yao, thanks for sharing the design. We have an issue very similar to the one discussed in the doc, and the resource coupon is a very good idea. Our Hadoop cluster is shared among multiple services/jobs/queries, and some services/jobs (ETL/ingestion) may send too many RPC calls to the NN. Under the current implementation, these jobs can easily be backed off and deprioritized because they run under the same service account, and it's not straightforward to distribute these calls across multiple service accounts. Also, some of these jobs get guaranteed YARN resources but are still sometimes delayed due to RPC starvation.

        Instead of using the resource coupon idea to manage RPC resources, we're looking into a more static approach (the number of the above-mentioned services/jobs is very small, fewer than 10) and trying to allocate a dedicated RPC share to certain service users. Along with the existing FairCallQueue setup (e.g., 10 queues with different priorities), we would add some special queues, one for each special user. Each special user gets a guaranteed RPC share (e.g., 10%, which can be aligned with its YARN resource share), and this percentage can be converted to a weight used in WeightedRoundRobinMultiplexer. A quick example: we have 4 default queues with default weights (8, 4, 2, 1) and two special service users (user1 with a 10% share and user2 with a 15% share). The default weights sum to 15 and cover the remaining 75% of capacity, so we end up with 6 queues: 4 default queues (with weights 8, 4, 2, 1) and 2 special queues (user1Queue weighted 15*10%/75%=2 and user2Queue weighted 15*15%/75%=3).
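The share-to-weight conversion in this example can be written out as a small sketch. The function name `special_queue_weights` and its parameters are hypothetical, and shares are assumed to be whole-number percentages; exact fractions are used so no rounding is hidden.

```python
from fractions import Fraction


def special_queue_weights(default_weights, shares_percent):
    """Convert guaranteed RPC shares (in percent) into queue weights.

    The default queues keep their weights and jointly cover whatever
    percentage is not reserved, so each special queue gets
    sum(default_weights) * share / (100 - total_reserved).
    """
    total_default = sum(default_weights)
    reserved = sum(shares_percent.values())
    if not 0 <= reserved < 100:
        raise ValueError("reserved shares must total less than 100%")
    remaining = 100 - reserved
    return {
        user: Fraction(total_default * pct, remaining)
        for user, pct in shares_percent.items()
    }


# The example from the comment: weights (8, 4, 2, 1), user1 10%, user2 15%.
weights = special_queue_weights([8, 4, 2, 1], {"user1": 10, "user2": 15})
print(weights["user1"], weights["user2"])  # prints "2 3"
```

With a total weight of 15 + 2 + 3 = 20, user1 ends up with 2/20 = 10% and user2 with 3/20 = 15%, matching the requested shares. Shares that do not yield whole numbers would need rounding or a common scaling factor before being used as integer weights.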

        For each incoming RPC call, we'll add one additional check. If the call comes from a special user, it is put into the dedicated queue reserved for that user; all other calls follow the current count decay mechanism and are put into the default queues.
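A minimal sketch of that extra check, assuming the default queues occupy indices 0..3 and the special queues follow them. `choose_queue`, `special_queue_index`, and `decay_priority` are hypothetical names; `decay_priority` is a stand-in callback for the existing count decay scheduler, not its actual interface.

```python
def choose_queue(user, special_queue_index, decay_priority):
    """Return the index of the queue an incoming call should be placed in.

    special_queue_index maps each special user to its dedicated queue;
    every other caller is ranked by the decay-based callback.
    """
    if user in special_queue_index:
        return special_queue_index[user]
    return decay_priority(user)


# Assumed layout: default queues at 0..3, user1Queue at 4, user2Queue at 5.
special = {"user1": 4, "user2": 5}
```

When `special` is empty, every call goes through `decay_priority`, which corresponds to the default behavior with no special users configured.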

        On the handler side, each handler fetches new calls from the queue whose index is provided by WeightedRoundRobinMultiplexer.
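The handler-side selection can be illustrated with a simplified weighted round robin. This is a stand-in written for illustration, not Hadoop's WeightedRoundRobinMultiplexer class; the class name `WeightedRoundRobin` and the method `next_index` are hypothetical, and the sketch ignores empty queues.

```python
class WeightedRoundRobin:
    """Yield queue indices so that queue i is picked weights[i] times per cycle."""

    def __init__(self, weights):
        if not weights or any(w <= 0 for w in weights):
            raise ValueError("weights must be positive")
        self._weights = list(weights)
        self._index = 0   # queue currently being served
        self._served = 0  # picks made from that queue in this turn

    def next_index(self):
        current = self._index
        self._served += 1
        # Move on once the current queue has used up its weight.
        if self._served >= self._weights[self._index]:
            self._index = (self._index + 1) % len(self._weights)
            self._served = 0
        return current


# Four default queues plus user1Queue (weight 2) and user2Queue (weight 3).
mux = WeightedRoundRobin([8, 4, 2, 1, 2, 3])
cycle = [mux.next_index() for _ in range(20)]
```

Over one full cycle of 20 picks, index 4 appears twice and index 5 three times, which gives the two special queues their 10% and 15% shares.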

        By default, there are no special users, and all RPC requests follow the existing FairCallQueue implementation.

        We would like to hear comments on this approach, and would also like to know whether other approaches are available.


          People

          • Assignee:
            xyao Xiaoyu Yao
            Reporter:
            xyao Xiaoyu Yao
          • Votes:
            0
            Watchers:
            25
