Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 2.0.4-alpha
- None
- None
Description
FileSystem.Statistics is a singleton for each FS scheme, so every read or write on HDFS leads to a call to AtomicLong.getAndAdd(). AtomicLong does not perform well when many threads update it concurrently (say, more than 30 threads), so this can cause a serious performance problem. In our Spark test profile, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead().
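The contention pattern described above can be reproduced outside Hadoop with a rough sketch like the following. It is not Hadoop code and not the fix that was applied: the class name BytesReadCounters, the helper runThreads, and the 4096-byte increment are all hypothetical, and java.util.concurrent.atomic.LongAdder (Java 8+) is used only as a stand-in for any scheme that spreads updates across per-thread cells instead of a single shared AtomicLong. Timings will vary by machine and JVM, so treat the printed numbers as indicative only.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class BytesReadCounters {
    // Single shared counter: every thread does an atomic update on the same location.
    static final AtomicLong sharedBytesRead = new AtomicLong();

    // Striped counter: threads mostly update separate cells, which are summed on read.
    static final LongAdder stripedBytesRead = new LongAdder();

    // Runs `op` opsPerThread times on each of `threads` threads; returns elapsed nanoseconds.
    static long runThreads(int threads, int opsPerThread, Runnable op) {
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < opsPerThread; j++) {
                    op.run();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int threads = 32;          // thread count taken from the report above
        int opsPerThread = 1_000_000; // arbitrary, chosen for illustration
        long bytesPerOp = 4096;    // hypothetical read size

        long atomicNs = runThreads(threads, opsPerThread,
                () -> sharedBytesRead.getAndAdd(bytesPerOp));
        long adderNs = runThreads(threads, opsPerThread,
                () -> stripedBytesRead.add(bytesPerOp));

        System.out.println("AtomicLong total=" + sharedBytesRead.get()
                + " elapsedMs=" + atomicNs / 1_000_000);
        System.out.println("LongAdder  total=" + stripedBytesRead.sum()
                + " elapsedMs=" + adderNs / 1_000_000);
    }
}
```

Both counters end with the same total; the difference, if any, shows up only in elapsed time under contention. A striped or thread-local counter trades a slightly more expensive read (summing the cells) for cheaper writes, which suits statistics that are updated on every I/O call but read rarely.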
Attachments
Issue Links
- is related to
  - HADOOP-13435 Add thread local mechanism for aggregating file system storage stats (Patch Available)
  - HADOOP-5318 Poor IO Performance due to AtomicLong operations (Resolved)