Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
DataNode#shutdown is documented to return "only after shutdown is complete". However, even after this method returns, threads started by the DataNode may still be running: race conditions in the shutdown sequence can cause it to skip stopping and joining the BPServiceActor threads.
This is likely not a big problem in normal operations, because these are daemon threads that won't block overall process exit. It is more of a problem for tests, because it makes it impossible to write reliable assertions that these threads exited cleanly. For large test suites, it can also cause an accumulation of unneeded threads, which might harm test performance.
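The kind of test assertion described above might look like the following minimal sketch. It does not use any Hadoop classes: `Service` is a hypothetical stand-in for a component that owns a daemon worker thread, and the `demo-actor-` thread-name prefix is an assumption made for illustration, not the actual BPServiceActor thread naming. The helper `liveThreadsWithPrefix` relies only on the standard `Thread.getAllStackTraces()` call to enumerate live threads.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ShutdownThreadCheck {

    /** Hypothetical stand-in for a service that owns one daemon worker thread. */
    static class Service {
        private final Thread worker;
        private volatile boolean running = true;

        Service(String threadName) {
            worker = new Thread(() -> {
                while (running) {
                    try {
                        Thread.sleep(50);
                    } catch (InterruptedException e) {
                        return; // interrupted by shutdown(): exit the loop
                    }
                }
            }, threadName);
            worker.setDaemon(true); // daemon threads do not block process exit
            worker.start();
        }

        /** Stops the worker and returns only after it has terminated. */
        void shutdown() throws InterruptedException {
            running = false;
            worker.interrupt();
            worker.join();
        }
    }

    /** Returns the sorted names of all live threads whose name starts with the prefix. */
    static List<String> liveThreadsWithPrefix(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
            .filter(Thread::isAlive)
            .map(Thread::getName)
            .filter(name -> name.startsWith(prefix))
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) throws InterruptedException {
        Service service = new Service("demo-actor-1");
        System.out.println("before shutdown: " + liveThreadsWithPrefix("demo-actor-"));
        service.shutdown();
        System.out.println("after shutdown:  " + liveThreadsWithPrefix("demo-actor-"));
    }
}
```

A check of this kind is only reliable if shutdown always joins the threads it started. If the join can be skipped, as described in this issue, the second listing may or may not be empty depending on timing.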
Issue Links
- relates to HDFS-15618 Improve datanode shutdown latency (Resolved)