Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
0.20.1
-
None
-
None
-
Incompatible change, Reviewed
-
Directories specified in mapred.local.dir that cannot be created now cause the TaskTracker to fail to start.
Description
We are seeing that when we restart a TaskTracker, it tries to recursively delete all the files in the distributed cache. It invokes FileUtil.fullyDelete(), which is very slow. This means that the TaskTracker cannot join the cluster for an extended period of time (up to 2 hours for us). The problem is acute when the distributed cache holds a few thousand files.
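The linked issues point toward asynchronous deletion as the remedy. A minimal sketch of that general idea, not the actual patch: rename the directory into a holding area (a cheap metadata operation on the same filesystem) so startup can proceed, then delete the renamed tree on a background thread. The class name `AsyncDirCleaner`, the method `moveAndDeleteLater`, and the `toBeDeleted` holding directory are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: moves a directory aside, then deletes it in the background. */
public class AsyncDirCleaner {
    // Hypothetical name for the holding area inside each local dir.
    static final String TRASH_SUBDIR = "toBeDeleted";

    // Single daemon thread so pending deletions never block JVM shutdown.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "async-dir-cleaner");
        t.setDaemon(true);
        return t;
    });

    /**
     * Renames localDir/name into the holding area and schedules its deletion.
     * Returns as soon as the rename is done; the Future completes when the
     * background delete finishes.
     */
    public Future<?> moveAndDeleteLater(Path localDir, String name) throws IOException {
        Path target = localDir.resolve(name);
        Path trash = localDir.resolve(TRASH_SUBDIR);
        Files.createDirectories(trash);
        // A unique suffix avoids collisions between successive restarts.
        Path doomed = trash.resolve(target.getFileName() + "-" + System.nanoTime());
        Files.move(target, doomed, StandardCopyOption.ATOMIC_MOVE);
        return executor.submit(() -> {
            deleteRecursively(doomed);
            return null;
        });
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException e)
                    throws IOException {
                if (e != null) {
                    throw e;
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public void shutdown() {
        executor.shutdown();
    }
}
```

With this shape, the caller pays only for a rename at startup, and the original path is free for reuse immediately. The trade-off is that the disk space is reclaimed later, and leftovers in the holding area from a previous crash would still need to be cleaned up by the same background mechanism.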
Attachments
Issue Links
- blocks
  - MAPREDUCE-1303 Merge org.apache.hadoop.mapred.CleanupQueue with MRAsyncDiskService (Open)
  - MAPREDUCE-1302 TrackerDistributedCacheManager can delete file asynchronously (Closed)
- breaks
  - MAPREDUCE-4481 User Log Retention across TT restarts (Reopened)
- is blocked by
  - HADOOP-6433 Add AsyncDiskService that is used in both hdfs and mapreduce (Closed)
- is related to
  - HDFS-611 Heartbeats times from Datanodes increase when there are plenty of blocks to delete (Closed)
- relates to
  - MAPREDUCE-2049 JT and TT should prune invalid local dirs on startup (Resolved)
  - MAPREDUCE-1382 MRAsyncDiscService should tolerate missing local.dir (Closed)