Details
-
Improvement
-
Status: Patch Available
-
Major
-
Resolution: Unresolved
-
3.0.0-alpha4
-
None
Description
MRAppMaster uses TaskAttemptImpl::resolveHosts to determine the dataLocalHosts for each task when the location of data split is IP, which will call a lot of times ( taskNum * dfsReplication) of function InetAddress::getByName and most of the funcition calls are redundant. When the job has a great number of tasks and the speed of DNS resolution is not fast enough, it will take a lot of time at this stage before the job running.