Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.20.1
Fix Version/s: None
Component/s: None
Description
While profiling the tasktracker with the Sort benchmark, it was observed that threads block on LocalDirAllocator.getLocalPathToRead() when fetching the index file and the temporary map output file.
Because LocalDirAllocator is tied to the ServletContext, only one instance is available per tasktracker httpserver. Given the jobid and mapid, LocalDirAllocator retrieves the index file path and the temporary map output file path. getLocalPathToRead() is internally synchronized, so concurrent requests serialize on that single instance.
Introducing an LRUCache for this lookup reduces the contention heavily (LRUCache with key = jobid + mapid and value = PATH to the file). The size of the LRUCache can be varied based on the environment. I observed a throughput improvement on the order of 4-7% with the introduction of the LRUCache.
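
A minimal sketch of the kind of cache described above, not the actual patch. It assumes a plain java.util.LinkedHashMap in access order is acceptable; the class name PathLruCache, the method getPath, and the slowLookup callback (standing in for the call to LocalDirAllocator.getLocalPathToRead()) are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical LRU cache mapping (jobid, mapid) to a resolved local path. */
final class PathLruCache {
    private final Map<String, String> cache;

    PathLruCache(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        Map<String, String> lru = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Evict the least recently used entry once the cap is exceeded.
                return size() > maxEntries;
            }
        };
        // The map lock is held only for the get/put itself, not for the slow lookup.
        this.cache = Collections.synchronizedMap(lru);
    }

    /**
     * Returns the cached path for jobid + mapid, calling slowLookup only on a miss.
     * slowLookup is a stand-in (assumption) for the synchronized allocator call.
     */
    String getPath(String jobId, String mapId,
                   BiFunction<String, String, String> slowLookup) {
        String key = jobId + "/" + mapId;
        String path = cache.get(key);
        if (path == null) {
            path = slowLookup.apply(jobId, mapId);
            if (path != null) {
                cache.put(key, path);
            }
        }
        return path;
    }

    int size() {
        return cache.size();
    }
}
```

In this sketch two threads that miss on the same key may both perform the slow lookup; that is harmless as long as the lookup is idempotent. A real change would also need to decide how cached paths are invalidated once a job's local files are removed.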