[YARN-2600] if the container is killed during localization outstanding public cache localization tasks should be cancelled - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.4.0
Fix Version/s: None
Component/s: nodemanager
Labels: None

Description

We came across a situation (partly related to HDFS-7005) in which a large number of public cache localization tasks were queued in the public localizer thread pool, but the container was killed during localization because it exceeded the timeout.

The problem is that every queued work item is still serviced by the resource localization service even though the container that requested it is gone. That work is wasteful, and it may further delay localization for other containers.

It would be good if we could cancel the pending localization tasks when the container is killed.
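To make the request concrete, here is a minimal sketch of the idea using only `java.util.concurrent`. It is not the NodeManager's actual code: the class `PendingLocalizationTracker`, its methods, and the container ID strings are hypothetical names chosen for illustration. The sketch records the `Future` of each queued task per container and cancels them when the container is killed. It also assumes each task belongs to exactly one container; since public cache resources can be shared, a real change would have to avoid cancelling a download that another live container is still waiting on.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Hadoop: tracks queued localization
// tasks per container so they can be cancelled together.
final class PendingLocalizationTracker {
    private final ExecutorService pool;
    // Container ID -> futures of the tasks submitted on its behalf.
    // A real implementation would also prune futures that have completed.
    private final Map<String, List<Future<?>>> pending = new HashMap<>();

    PendingLocalizationTracker(ExecutorService pool) {
        this.pool = pool;
    }

    // Queue a download task and remember which container asked for it.
    synchronized Future<?> submit(String containerId, Runnable download) {
        Future<?> future = pool.submit(download);
        pending.computeIfAbsent(containerId, k -> new ArrayList<>()).add(future);
        return future;
    }

    // Called when a container is killed. Returns how many tasks were
    // successfully cancelled. cancel(false) keeps tasks that have not
    // started from ever running; a task that is already running is not
    // interrupted, although its future is still marked as cancelled.
    synchronized int cancelPending(String containerId) {
        List<Future<?>> futures = pending.remove(containerId);
        if (futures == null) {
            return 0;
        }
        int cancelled = 0;
        for (Future<?> future : futures) {
            if (future.cancel(false)) {
                cancelled++;
            }
        }
        return cancelled;
    }
}
```

With a tracker like this, the kill path would call `cancelPending(containerId)` once, and the thread pool would skip the cancelled entries when it reaches them instead of downloading resources nobody needs.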

People

Assignee: Unassigned

Reporter: Sangjin Lee

Votes: 0

Watchers: 4

Dates

Created: 24/Sep/14 23:55

Updated: 24/Sep/14 23:55