[HADOOP-14033] Reduce fair call queue lock contention - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.7.0
Fix Version/s: 2.9.0, 3.0.0-alpha4, 2.8.2
Component/s: ipc
Labels: None
Target Version/s: 2.8.1
Hadoop Flags: Reviewed

Description

Under heavy load, the call queue may run dry even though clients experience high latency.

The FairCallQueue (FCQ) requires producers and consumers to synchronize via a shared lock. Polling consumers hold the lock while scanning all sub-queues, so consumers are serialized even though the sub-queues are thread-safe blocking queues. The effect is that other producers and consumers frequently park.
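
To illustrate the contention pattern described here, the following is a minimal sketch, not the actual FairCallQueue code. The class name `LockedMultiQueue`, its methods, and the strict-priority scan order are all assumptions made for illustration. It shows consumers holding one shared lock for the whole scan, and producers needing that same lock just to signal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration of a multi-level queue guarded by one shared lock. */
class LockedMultiQueue<E> {
    private final List<BlockingQueue<E>> queues = new ArrayList<>();
    // ReentrantLock is non-fair by default, so waiting threads may be barged past.
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();

    LockedMultiQueue(int levels, int capacityPerLevel) {
        for (int i = 0; i < levels; i++) {
            queues.add(new LinkedBlockingQueue<>(capacityPerLevel));
        }
    }

    /** Producer: the sub-queue insert is thread-safe, but signalling needs the shared lock. */
    boolean offer(int level, E e) {
        boolean added = queues.get(level).offer(e);
        if (added) {
            lock.lock(); // competes with every consumer currently scanning
            try {
                notEmpty.signal();
            } finally {
                lock.unlock();
            }
        }
        return added;
    }

    /** Consumer: holds the shared lock for the entire scan, serializing all consumers. */
    E take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (true) {
                E e = scan();
                if (e != null) {
                    return e;
                }
                notEmpty.await(); // releases the lock while parked
            }
        } finally {
            lock.unlock();
        }
    }

    /** Non-blocking variant; still takes the shared lock to scan. */
    E poll() {
        lock.lock();
        try {
            return scan();
        } finally {
            lock.unlock();
        }
    }

    // Assumption: strict priority order, lowest index first.
    private E scan() {
        for (BlockingQueue<E> q : queues) {
            E e = q.poll();
            if (e != null) {
                return e;
            }
        }
        return null;
    }
}
```

In this sketch, every `offer`, `take`, and `poll` passes through `lock`, so many handler threads polling at once can keep a smaller number of producers waiting on it.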

The lock is unfair, so producers and consumers attempt to barge in on it. The outnumbered producers tend to remain blocked for an extended time. As load increases and the queues fill, the barging consumers drain the queues faster than the producers can fill them.

Because of starvation on the ingress, server metrics give an illusion of healthy throughput, response time, and call queue length. Often, the worse the load gets, the better the server looks.
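
One general way to avoid a shared lock between producers and consumers is to track the number of available elements with a counting semaphore and let the thread-safe sub-queues handle the rest. The sketch below is offered under that assumption; it is not presented as the content of the attached patch. The class name `SemaphoreMultiQueue` and the strict-priority scan order are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

/** Hypothetical illustration of a multi-level queue coordinated by a semaphore. */
class SemaphoreMultiQueue<E> {
    private final List<BlockingQueue<E>> queues = new ArrayList<>();
    // One permit per element currently stored across all sub-queues.
    private final Semaphore available = new Semaphore(0);

    SemaphoreMultiQueue(int levels, int capacityPerLevel) {
        for (int i = 0; i < levels; i++) {
            queues.add(new LinkedBlockingQueue<>(capacityPerLevel));
        }
    }

    /** Producer: insert first, then publish a permit. No shared lock is taken. */
    boolean offer(int level, E e) {
        boolean added = queues.get(level).offer(e);
        if (added) {
            available.release();
        }
        return added;
    }

    /** Consumer: block until a permit exists, then scan concurrently with other consumers. */
    E take() throws InterruptedException {
        available.acquire();
        return removeNext();
    }

    /** Non-blocking variant: returns null when no permit is available. */
    E poll() {
        return available.tryAcquire() ? removeNext() : null;
    }

    // A held permit means at least one element is reserved for this caller,
    // but a concurrent consumer may grab the one seen first, so retry the scan.
    // Assumption: strict priority order, lowest index first.
    private E removeNext() {
        while (true) {
            for (BlockingQueue<E> q : queues) {
                E e = q.poll();
                if (e != null) {
                    return e;
                }
            }
        }
    }
}
```

Here consumers scan the sub-queues in parallel, and a producer never waits behind a scanning consumer; the only shared state beyond the sub-queues is the permit count.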

Attachments

HADOOP-14033.patch
30/Jan/17 18:07
8 kB
Daryn Sharp

Issue Links

breaks

HADOOP-14912 FairCallQueue may defer servicing calls

Resolved

People

Assignee: Daryn Sharp

Reporter: Daryn Sharp

Votes: 0

Watchers: 10

Dates

Created: 27/Jan/17 22:08

Updated: 27/Sep/17 21:18

Resolved: 09/Feb/17 22:21