Hadoop Common / HADOOP-2864

Improve the Scalability and Robustness of IPC

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.16.0
    • Fix Version/s: None
    • Component/s: ipc
    • Labels:
      None

      Description

      This jira is intended to enhance IPC's scalability and robustness.

      Currently an IPC server can easily hang because of a disk failure or garbage collection, and while it is hung it cannot respond to clients promptly. This has caused many dropped calls and delayed responses, so many running applications fail on timeout. Conversely, if busy clients send a lot of requests to the server in a short period of time, or too many clients communicate with the server simultaneously, the server may be swamped by requests and become unresponsive.

      The proposed changes aim to

      1. Provide better client/server coordination.
        • The server should be able to throttle clients during a burst of requests.
        • A slow client should not prevent the server from serving other clients.
        • A temporarily hung server should not cause catastrophic failures in clients.
      2. Detect remote-side failures. Both client and server should detect when the remote side fails. Examples of failures include: (1) the remote host has crashed; (2) the remote host has crashed and then rebooted; (3) the remote process has crashed or been shut down by an operator.
      3. Fairness. Each client should be able to make progress.
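
One way to picture how the throttling and fairness goals above could fit together is sketched below. It is only an illustration under stated assumptions, not the design from the attached document and not Hadoop's actual IPC code: the class name FairCallQueue, its methods, and the per-client limit are all hypothetical. Each client gets its own bounded queue (a full queue means the client is throttled), and the server drains the queues in round-robin order so that a bursty client cannot starve the others.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: per-client bounded queues drained in round-robin order. */
class FairCallQueue {
    private final int maxPendingPerClient;
    // Insertion-ordered map: client id -> that client's pending calls.
    // Invariant: every queue stored in the map is non-empty.
    private final Map<String, Deque<String>> pending = new LinkedHashMap<>();

    FairCallQueue(int maxPendingPerClient) {
        if (maxPendingPerClient < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        this.maxPendingPerClient = maxPendingPerClient;
    }

    /** Returns false (the caller should throttle the client) when its queue is full. */
    synchronized boolean offer(String clientId, String call) {
        Deque<String> queue = pending.computeIfAbsent(clientId, k -> new ArrayDeque<>());
        if (queue.size() >= maxPendingPerClient) {
            return false;
        }
        queue.addLast(call);
        return true;
    }

    /** Takes one call from the client at the head of the rotation, or null if idle. */
    synchronized String poll() {
        Iterator<Map.Entry<String, Deque<String>>> it = pending.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<String, Deque<String>> head = it.next();
        String clientId = head.getKey();
        Deque<String> queue = head.getValue();
        it.remove();
        String call = queue.pollFirst();
        if (!queue.isEmpty()) {
            pending.put(clientId, queue); // move this client to the back of the rotation
        }
        return call;
    }
}
```

In this sketch a client that has reached its limit is simply refused; a real server might instead stop reading from that client's connection until its queue drains, which is a design choice the issue description leaves open.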

        Issue Links

          Activity

          Hairong Kuang added a comment -

          Design document is attached.


            People

            • Assignee:
              Hairong Kuang
              Reporter:
              Hairong Kuang
            • Votes:
              0
              Watchers:
              4
