Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Description
Currently the data node does not distinguish between critical and non-critical exceptions.
Any exception is treated as a signal to sleep and then retry. See
org.apache.hadoop.dfs.DataNode.run()
This happens because RPC always throws the same RemoteException.
In some cases (such as UnregisteredDatanodeException or IncorrectVersionException) the data
node should shut down rather than retry.
This logic naturally belongs in
org.apache.hadoop.dfs.DataNode.offerService()
but it can be reasonably implemented (without examining the RemoteException.className
field) once HADOOP-266 (2) is fixed.
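
For illustration, the behavior asked for above might look roughly like the sketch below. It is not the actual DataNode code: the class DataNodeRetrySketch, the Service interface, the Outcome enum, and the nested RemoteException stand-in are all hypothetical, and the sketch matches on the simple class name carried in className, which is exactly the workaround this issue would prefer to avoid once the RPC layer can deliver typed exceptions.

```java
import java.io.IOException;
import java.util.Set;

public class DataNodeRetrySketch {

    /** Hypothetical stand-in for the RPC RemoteException; only className is modeled. */
    static class RemoteException extends IOException {
        final String className;

        RemoteException(String className, String message) {
            super(message);
            this.className = className;
        }
    }

    /** Simple names of the two exceptions this issue names as critical. */
    static final Set<String> CRITICAL = Set.of(
            "UnregisteredDatanodeException",
            "IncorrectVersionException");

    /** Hypothetical abstraction of one pass of the service loop. */
    interface Service {
        void offerService() throws IOException;
    }

    enum Outcome { COMPLETED, SHUT_DOWN, GAVE_UP }

    /** True if the remote class name (with any package prefix removed) is in CRITICAL. */
    static boolean isCritical(RemoteException e) {
        String name = e.className;
        String simple = name.substring(name.lastIndexOf('.') + 1);
        return CRITICAL.contains(simple);
    }

    /** Retries on non-critical failures, stops immediately on critical ones. */
    static Outcome run(Service service, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                service.offerService();
                return Outcome.COMPLETED;
            } catch (RemoteException e) {
                if (isCritical(e)) {
                    return Outcome.SHUT_DOWN; // retrying cannot help
                }
                // non-critical remote failure: fall through to sleep and retry
            } catch (IOException e) {
                // local I/O failure: also treated as retryable in this sketch
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Outcome.GAVE_UP;
            }
        }
        return Outcome.GAVE_UP;
    }

    public static void main(String[] args) {
        Outcome outcome = run(() -> {
            throw new RemoteException("org.example.IncorrectVersionException", "version mismatch");
        }, 5, 10L);
        System.out.println(outcome);
    }
}
```

The bounded maxAttempts exists only to keep the sketch terminating; a real service loop would presumably retry for as long as the node should keep running.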