Details
Bug
Status: Closed
Major
Resolution: Fixed
2.6.0
None
Reviewed
Description
When NodeManager (NM) work-preserving restart is enabled, we see several NullPointerExceptions (NPEs) during recovery. They appear to correspond to sub-directories that need to be deleted. I wonder whether the null paths here indicate incorrect tracking of these resources and a potential leak. This JIRA is to investigate and fix whatever is required.
Logs show:
2015-05-18 07:06:10,225 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : null
2015-05-18 07:06:10,224 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService
java.lang.NullPointerException
    at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274)
    at org.apache.hadoop.fs.FileContext.delete(FileContext.java:755)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:458)
    at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293)
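
The log above shows a deletion being attempted on a null path, followed by an NPE inside the delete call. The following is a minimal sketch of a defensive guard, not the fix applied for this issue: it uses only java.nio.file rather than Hadoop's FileContext, and the class name NullSafeDeletionSketch and the method deleteAll are hypothetical names chosen for illustration. It assumes that a null entry in a recovered deletion list should be logged and skipped so the remaining paths are still cleaned up. Whether skipping is appropriate, or whether the null indicates a tracking bug that needs fixing upstream, is exactly the question this JIRA raises.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the NodeManager code base.
final class NullSafeDeletionSketch {

    /**
     * Deletes each non-null path (a file or an empty directory) and returns
     * the paths that were actually removed. Null entries are logged and
     * skipped instead of being dereferenced.
     */
    static List<Path> deleteAll(List<Path> recoveredPaths) {
        List<Path> deleted = new ArrayList<>();
        for (Path path : recoveredPaths) {
            if (path == null) {
                // Assumption: a null entry is reported and skipped, so one bad
                // entry does not abort the rest of the cleanup.
                System.err.println("Skipping null path in recovered deletion list");
                continue;
            }
            try {
                if (Files.deleteIfExists(path)) {
                    deleted.add(path);
                }
            } catch (IOException e) {
                // Log and continue, so a single failure does not stop the loop.
                System.err.println("Failed to delete " + path + ": " + e);
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nm-recovery-sketch");
        // One null entry and one real directory: only the directory is deleted.
        List<Path> deleted = deleteAll(Arrays.asList(null, dir));
        System.out.println("Deleted " + deleted.size() + " path(s)");
    }
}
```

A guard like this only hides the symptom. If the null comes from state that was stored or recovered incorrectly, the directories that should have been deleted may still be left behind, which is the potential leak described above.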