[HADOOP-2864] Improve the Scalability and Robustness of IPC - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.16.0
Fix Version/s: None
Component/s: ipc
Labels: None

Description

This JIRA is intended to improve the scalability and robustness of IPC.

Currently an IPC server can easily hang because of a disk failure or garbage collection, during which it cannot respond to clients promptly. This has caused many dropped calls and delayed responses, so many running applications fail on timeout. Conversely, if busy clients send many requests to the server in a short period of time, or too many clients communicate with the server simultaneously, the server may be swamped by requests and become unresponsive.

The proposed changes aim to:

- Provide better client/server coordination:
  - The server should be able to throttle clients during a burst of requests.
  - A slow client should not prevent the server from serving other clients.
  - A temporarily hung server should not cause catastrophic failures in clients.
- Detect remote-side failures on both the client and the server. Examples of failures include: (1) the remote host has crashed; (2) the remote host has crashed and then rebooted; (3) the remote process has crashed or been shut down by an operator.
- Ensure fairness: each client should be able to make progress.
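
The throttling and fairness goals above can be illustrated with a small sketch. It is not taken from the attached design document or from Hadoop's IPC code: the class `RoundRobinCallQueue`, its `offer` and `poll` methods, and the per-client limit are hypothetical names chosen for illustration, and calls are modelled as plain strings. The idea is that each client gets a bounded queue (a full queue is the throttle signal) and the server drains the queues in round-robin order so that one busy client cannot starve the others.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of per-client throttling with round-robin service. */
class RoundRobinCallQueue {
    private final int perClientLimit;
    // Insertion-ordered map: the first entry is the next client to be served.
    private final Map<String, Deque<String>> pending = new LinkedHashMap<>();

    RoundRobinCallQueue(int perClientLimit) {
        if (perClientLimit < 1) {
            throw new IllegalArgumentException("perClientLimit must be >= 1");
        }
        this.perClientLimit = perClientLimit;
    }

    /** Queues a call; returns false if this client already has perClientLimit calls pending. */
    synchronized boolean offer(String clientId, String call) {
        Deque<String> q = pending.computeIfAbsent(clientId, k -> new ArrayDeque<>());
        if (q.size() >= perClientLimit) {
            return false; // throttle: the client should back off and retry later
        }
        q.addLast(call);
        return true;
    }

    /** Returns the oldest call of the next client in turn, or null if nothing is pending. */
    synchronized String poll() {
        Iterator<Map.Entry<String, Deque<String>>> it = pending.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<String, Deque<String>> head = it.next();
        it.remove();
        String call = head.getValue().pollFirst();
        if (!head.getValue().isEmpty()) {
            // Move the client to the back so other clients are served first.
            pending.put(head.getKey(), head.getValue());
        }
        return call;
    }
}
```

With a limit of 2, a client that submits three calls in a row has the third rejected, and calls from a second client are interleaved with the first client's backlog rather than waiting behind it. A real server would also need blocking handlers and a way to tell the client to slow down, both of which this sketch leaves out.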

Attachments

RPCScalabilityDesignWeb.pdf (85 kB), attached by Hairong Kuang, 20/Feb/08 22:48

Issue Links

depends upon

- HADOOP-2870 Datanode.shutdown() and Namenode.stop() should close all rpc connections (Closed)

incorporates

- HADOOP-2909 Improve IPC idle connection management (Closed)
- HADOOP-2975 IPC server should not allocate a buffer for each request (Resolved)
- HADOOP-2188 RPC should send a ping rather than use client timeouts (Closed)
- HADOOP-2910 Throttle IPC Client/Server during bursts of requests or server slowdown (Closed)

People

Assignee: Hairong Kuang
Reporter: Hairong Kuang
Votes: 0
Watchers: 5

Dates

Created: 20/Feb/08 22:46
Updated: 17/Jul/14 21:00
Resolved: 17/Jul/14 21:00