Hadoop HDFS / HDFS-9409

DataNode shutdown does not guarantee full shutdown of all threads due to race condition.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: datanode
    • Labels: None

      Description

      DataNode#shutdown is documented to return "only after shutdown is complete". Even after completion of this method, it's possible that threads started by the DataNode are still running. Race conditions in the shutdown sequence may cause it to skip stopping and joining the BPServiceActor threads.

      This is likely not a big problem in normal operations, because these are daemon threads that won't block overall process exit. It is more of a problem for tests, because it makes it impossible to write reliable assertions that these threads exited cleanly. For large test suites, it can also cause an accumulation of unneeded threads, which might harm test performance.
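
      As a rough illustration of the kind of assertion referred to above, the sketch below scans the JVM's live threads for a name prefix after shutdown and fails if any match. The "BP-" prefix and the bare AssertionError are illustrative assumptions, not the actual BPServiceActor thread name or an existing Hadoop test helper.

      {code:java}
      import java.util.Map;

      public class LingeringThreadCheck {
        // Fail if any live thread's name starts with the given prefix.
        static void assertNoThreadsMatching(String namePrefix) {
          for (Map.Entry<Thread, StackTraceElement[]> e
              : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (t.isAlive() && t.getName().startsWith(namePrefix)) {
              throw new AssertionError(
                  "Thread still running after shutdown: " + t.getName());
            }
          }
        }

        public static void main(String[] args) {
          // In a real test this would run after DataNode#shutdown returns.
          assertNoThreadsMatching("BP-");
          System.out.println("No matching threads found.");
        }
      }
      {code}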

        Activity

        cnauroth Chris Nauroth added a comment -

        DataNode#shutdown calls BlockPoolManager#getAllNamenodeThreads to get every BPOfferService. Then, later in shutdown, these are passed to BlockPoolManager#shutDownAll, which eventually stops and joins each BPServiceActor thread. There are a few problems:

        1. BlockPoolManager#getAllNamenodeThreads returns an unmodifiable wrapper over its underlying list, so callers can't mutate the list, but it's still the same shared backing list. Later during shutdown, the BPServiceActor is told that it can exit its main loop. Part of that is a call on the BPServiceActor thread to BlockPoolManager#remove. This effectively removes it from the backing list returned by BlockPoolManager#getAllNamenodeThreads, so it will appear to vanish from the list before the call to BlockPoolManager#shutDownAll.
        2. Even if point 1 is fixed by changing BlockPoolManager#getAllNamenodeThreads to return a deep copy, there is a similar problem in that BPOfferService#shutdownActor will remove the actor from its internal tracking list.

        Because of these 2 problems, DataNode#shutdown might no longer have a reference to the BPServiceActor threads when it tries to stop and join on them. Therefore, those threads might still be alive even after completion of DataNode#shutdown. I noticed this while trying to write a test that asserts a particular thread has exited after DataNode shutdown.
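
        A minimal standalone sketch (plain JDK collections, not the actual BlockPoolManager code) of why the unmodifiable wrapper in point 1 does not protect shutdown: the wrapper reads through to the same backing list, so a concurrent remove makes the entry vanish from the caller's "snapshot".

        {code:java}
        import java.util.ArrayList;
        import java.util.Collections;
        import java.util.List;

        public class UnmodifiableViewSketch {
          public static void main(String[] args) {
            List<String> backing = new ArrayList<>();
            backing.add("actor-1");
            backing.add("actor-2");

            // Analogous to getAllNamenodeThreads returning an unmodifiable view
            // over the manager's internal list: read-only, but not a copy.
            List<String> view = Collections.unmodifiableList(backing);
            // Defensive copy taken up front, i.e. the fix suggested for point 1.
            List<String> copy = new ArrayList<>(backing);

            // Analogous to the actor removing itself via BlockPoolManager#remove
            // while it exits its main loop.
            backing.remove("actor-2");

            // The entry has vanished from the read-through view, so a later
            // shutDownAll-style loop over it would never stop or join actor-2 ...
            System.out.println("View after removal: " + view);  // [actor-1]
            // ... while the copy taken up front still holds the reference.
            System.out.println("Copy after removal: " + copy);  // [actor-1, actor-2]
          }
        }
        {code}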

        kihwal Kihwal Lee added a comment -

        BlockScanner behaves the same way. It removes a thread from its tracking data structure after a timed join. This is probably less problematic because the timeout in each join() is 5 minutes. Although that wait may be fine for unit tests, waiting up to 5 minutes per volume scanner thread is unreasonable when the datanode needs to be restarted quickly. We internally reduced it to 100ms and rolling upgrades work a lot better.

        Perhaps we need a flag in the datanode that tells whether it should wait until everything has terminated, or shut down quickly without waiting a long time for daemon threads to terminate. The shutdown code of each subsystem would then check this flag and behave accordingly. We would turn it on for unit testing and off for production. How does that sound?
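
        A minimal sketch of the proposed flag, assuming a purely hypothetical boolean switch and illustrative timeouts rather than any existing Hadoop configuration key: the same join helper waits a long time when a strict shutdown is requested (unit tests) and only briefly otherwise (production restarts).

        {code:java}
        public class ShutdownJoinSketch {
          // Illustrative values only; not real Hadoop defaults.
          static final long STRICT_JOIN_MILLIS = 5 * 60 * 1000L; // "wait for everything" mode
          static final long FAST_JOIN_MILLIS = 100L;             // quick-restart mode

          // Join a worker thread, choosing the timeout from the proposed flag.
          static void joinQuietly(Thread t, boolean strictShutdown) {
            try {
              t.join(strictShutdown ? STRICT_JOIN_MILLIS : FAST_JOIN_MILLIS);
            } catch (InterruptedException ie) {
              Thread.currentThread().interrupt();
            }
          }

          public static void main(String[] args) {
            Thread daemon = new Thread(() -> {
              try { Thread.sleep(1000); } catch (InterruptedException ignored) { }
            });
            daemon.setDaemon(true);
            daemon.start();

            // strictShutdown=false mimics production: return quickly even if the
            // daemon thread has not exited yet.
            joinQuietly(daemon, false);
            System.out.println("Still alive after fast join: " + daemon.isAlive());
          }
        }
        {code}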

        cnauroth Chris Nauroth added a comment -

        Using a hidden configuration flag for this sounds appropriate to me. I agree that there is no need for a strict long wait on all threads in production operations if correctness doesn't depend on it.


          People

          • Assignee: Unassigned
          • Reporter: cnauroth Chris Nauroth
          • Votes: 0
          • Watchers: 6
