Details
- Type: Bug
- Status: Closed
- Priority: Blocker
- Resolution: Fixed
- 0.3.0
- None
- None
Description
A data node loses its connection to the name node and then calls offerService() again.
The HADOOP-270 changes force it to start dataXceiveServer, which is already started, so the call
throws IllegalThreadStateException. This repeats in a loop, and the heartbeat section is never reached.
As a result the data node never rejoins the cluster, while from the outside it looks as if it is still running.
This is another reason why we see missing data but do not see failed data nodes.
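
The failure described above comes down to a standard Java rule: a Thread object may be started at most once, and a second start() throws IllegalThreadStateException. The sketch below reproduces that behaviour in isolation and shows one possible guard. It is not the actual DataNode code or the committed fix; the class OfferServiceSketch and the methods secondStartThrows and startOnce are hypothetical names used only for illustration.

```java
// Illustrative sketch only; not taken from the Hadoop source.
class OfferServiceSketch {

    /** Starts the same Thread object twice and reports whether the second start() throws. */
    static boolean secondStartThrows() {
        Thread server = new Thread(() -> { /* stands in for the dataXceiveServer loop */ });
        server.start();
        try {
            server.start(); // a Thread can be started only once, even after it has finished
            return false;
        } catch (IllegalThreadStateException e) {
            return true;
        }
    }

    /**
     * Hypothetical guard: start the thread only if it has never been started.
     * Assumes a single caller, so the check-then-start is not racing with another starter.
     */
    static boolean startOnce(Thread t) {
        if (t.getState() == Thread.State.NEW) {
            t.start();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(secondStartThrows()); // prints "true"

        Thread server = new Thread(() -> { });
        System.out.println(startOnce(server));   // prints "true": first call starts it
        System.out.println(startOnce(server));   // prints "false": second call is a no-op
    }
}
```

With a guard of this kind, a repeated pass through the service loop would skip the restart and could go on to the heartbeat section instead of throwing on every iteration.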