[HDFS-5276] FileSystem.Statistics got performance issue on multi-thread read/write. - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.4-alpha
Fix Version/s: 2.3.0
Component/s: None
Labels: None

Target Version/s:

Description

FileSystem.Statistics is a singleton for each FS scheme, so every read or write on HDFS leads to an AtomicLong.getAndAdd() call on the same shared counter. AtomicLong does not perform well under heavy multi-threaded contention (say, more than 30 threads), so this can cause a serious performance issue. In our Spark test profile, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead().
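To illustrate the kind of contention described above and one common way around it, here is a minimal sketch that keeps one counter cell per thread and sums the cells only when the total is read. It is not the code from any patch attached to this issue. The class `PerThreadByteCounter`, its nested `Cell` type, and its method names are hypothetical and are not part of Hadoop's API.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical illustration only; this is not Hadoop's FileSystem.Statistics. */
class PerThreadByteCounter {

    /** One cell per thread. Only the owning thread writes; any thread may read. */
    private static final class Cell {
        volatile long bytesRead;
    }

    // Every cell ever created, so a reader can sum across all threads.
    private final ConcurrentLinkedQueue<Cell> cells = new ConcurrentLinkedQueue<>();

    // Lazily creates and registers a cell the first time a thread increments.
    private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
        Cell c = new Cell();
        cells.add(c);
        return c;
    });

    /** Hot path: touches only the calling thread's cell, so no CAS loop is needed. */
    public void incrementBytesRead(long n) {
        Cell c = local.get();
        c.bytesRead = c.bytesRead + n; // single writer per cell
    }

    /** Slow path: sums every thread's cell. The result is a point-in-time estimate. */
    public long getBytesRead() {
        long sum = 0;
        for (Cell c : cells) {
            sum += c.bytesRead;
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        PerThreadByteCounter counter = new PerThreadByteCounter();
        Thread[] readers = new Thread[32]; // same thread count as the report
        for (int i = 0; i < readers.length; i++) {
            readers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    counter.incrementBytesRead(1);
                }
            });
            readers[i].start();
        }
        for (Thread t : readers) {
            t.join();
        }
        System.out.println(counter.getBytesRead());
    }
}
```

The trade-off in this sketch is that increments become cheap while reads become more expensive, which suits statistics that are updated constantly but inspected rarely. Cells belonging to threads that have exited are never reclaimed here; a production version would need to fold them into a shared total. On Java 8 and later, `java.util.concurrent.atomic.LongAdder` offers a similar striped-counter behavior out of the box.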

Attachments

- HDFSStatisticTest.java, 30/Sep/13 03:14, 2 kB, Chengxiang Li
- hdfs-test.PNG, 30/Sep/13 03:14, 122 kB, Chengxiang Li
- jstack-trace.PNG, 30/Sep/13 03:14, 118 kB, Chengxiang Li
- DisableFSReadWriteBytesStat.patch, 30/Sep/13 08:26, 3 kB, Binglin Chang
- ThreadLocalStat.patch, 01/Oct/13 16:13, 3 kB, Binglin Chang
- HDFS-5276.001.patch, 10/Oct/13 19:19, 14 kB, Colin McCabe
- HDFS-5276.002.patch, 10/Oct/13 21:13, 14 kB, Colin McCabe
- TestFileSystemStatistics.java, 12/Oct/13 08:27, 2 kB, Binglin Chang
- HDFS-5276.003.patch, 15/Oct/13 01:05, 15 kB, Colin McCabe

Issue Links

is related to

HADOOP-13435 Add thread local mechanism for aggregating file system storage stats (Patch Available)

HADOOP-5318 Poor IO Performance due to AtomicLong operations (Resolved)

People

Assignee: Colin McCabe

Reporter: Chengxiang Li

Votes: 0

Watchers: 9

Dates

Created: 29/Sep/13 03:11

Updated: 28/Jul/16 19:53

Resolved: 19/Oct/13 00:19