[HDFS-12533] NNThroughputBenchmark threads get stuck on UGI.getCurrentUser() - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

NameNode#getRemoteUser() first tries to fetch the user from the current RPC call, which is not a synchronized operation. Only if there is no RPC call does it fall back to UserGroupInformation#getCurrentUser(), which is synchronized. This keeps RPC operations (the bulk of the load) efficient, since they never contend on that lock.
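The lookup order described above can be sketched without any Hadoop dependencies. This is only an illustration under stated assumptions: `RemoteUserSketch`, `RPC_USER`, `slowPathCalls`, and the user-name strings are hypothetical stand-ins, not Hadoop APIs, and the real methods return a UserGroupInformation rather than a String.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RemoteUserSketch {
    // Hypothetical stand-in for the per-call RPC context; null when no RPC call is in progress.
    static final ThreadLocal<String> RPC_USER = new ThreadLocal<>();

    // Counts how many lookups fell through to the synchronized path.
    static final AtomicInteger slowPathCalls = new AtomicInteger();

    // Stand-in for the synchronized UserGroupInformation#getCurrentUser():
    // every caller serializes on the class lock.
    static synchronized String getCurrentUser() {
        slowPathCalls.incrementAndGet();
        return "process-user";
    }

    // Mirrors the described order: unsynchronized RPC lookup first,
    // synchronized fallback only when there is no RPC call.
    static String getRemoteUser() {
        String rpcUser = RPC_USER.get();
        return (rpcUser != null) ? rpcUser : getCurrentUser();
    }

    public static void main(String[] args) {
        RPC_USER.set("rpc-user");             // simulate being inside an RPC call
        System.out.println(getRemoteUser());  // fast path, no lock taken

        RPC_USER.remove();                    // simulate a direct, non-RPC call
        System.out.println(getRemoteUser());  // falls back to the synchronized method
        System.out.println(slowPathCalls.get());
    }
}
```

When no thread has an RPC context set, every lookup takes the synchronized fallback, so all threads share a single lock.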

In NNThroughputBenchmark, however, there is no RPC call because we bypass that layer, so every call takes the synchronized fallback, and with a high thread count many of the threads get stuck. At one point I attached a profiler and found that quite a few threads had been waiting on #getCurrentUser() for 2 minutes (!). When I took this call away, I saw some improvement in the throughput numbers. To more closely emulate a real NN, we should fix this.

Issue Links

is related to: HADOOP-9747 Reduce unnecessary UGI synchronization (Resolved)

People

Assignee: Unassigned

Reporter: Erik Krogen

Votes: 0

Watchers: 4

Dates

Created: 22/Sep/17 21:54

Updated: 08/Jan/19 00:32