Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
0.20.205.0, 0.23.1
None
Description
The getDU method should not include the size of the directory itself. The Javadoc for java.io.File.length() says the return value is unspecified for a directory, and on Linux with the Sun JVM it returns 4096 for the directory's inode. Clearly this isn't useful.
getDU also calls itself recursively. If the directory contains a symbolic link that forms a cycle, getDU keeps spinning in that cycle. In our case, we saw this in the org.apache.hadoop.mapred.JobLocalizer.downloadPrivateCacheObjects call. The hang prevented other tasks on the same node from committing, which made the TaskTracker (TT) effectively useless, because the JobTracker (JT) thinks it already has enough tasks running.
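To illustrate the behavior being asked for, here is a minimal sketch, not the committed patch. It assumes Java 7+ and the java.nio.file API rather than the java.io.File calls the existing method uses; the class name DiskUsageSketch and the Path-based signature are hypothetical. Directories contribute nothing to the total, and symbolic links are never followed, so a link cycle cannot trap the recursion.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

// Hypothetical stand-in for the real getDU; names are for illustration only.
public class DiskUsageSketch {

    /**
     * Sums the sizes of regular files under root.
     * Directories themselves add nothing, and symbolic links are skipped.
     */
    public static long getDU(Path root) throws IOException {
        // Never follow a symbolic link: this is what breaks link cycles.
        if (Files.isSymbolicLink(root)) {
            return 0L;
        }
        if (Files.isRegularFile(root, LinkOption.NOFOLLOW_LINKS)) {
            return Files.size(root);
        }
        if (!Files.isDirectory(root, LinkOption.NOFOLLOW_LINKS)) {
            return 0L; // missing path or special file
        }
        long total = 0L; // the directory's own size is deliberately not counted
        try (DirectoryStream<Path> children = Files.newDirectoryStream(root)) {
            for (Path child : children) {
                total += getDU(child);
            }
        }
        return total;
    }
}
```

With this shape, a directory holding a 5-byte file and a subdirectory that holds a 7-byte file plus a link back to the parent reports 12 bytes, and the call returns instead of spinning.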