Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 3.2.1
Description
DatanodeAdminManager$Monitor continuously reports a node as being in an invalid state:
2022-07-21 06:54:38,562 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager (DatanodeAdminMonitor-0): DatanodeAdminMonitor caught exception when processing node 1.2.3.4:9866.
java.lang.IllegalStateException: Node 1.2.3.4:9866 is in an invalid state! Invalid state: In Service 0 blocks are on this dn.
    at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:601)
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:504)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
A node enters an invalid state when stopDecommission sets it to In Service but fails to remove it from the pendingNodes queue (HDFS-16675). The state is corrected only when the user triggers startDecommission. Until then, there is no need to keep the invalid-state node in the queue, because startDecommission will add it back anyway.
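
The idea of dropping an invalid-state node from tracking, instead of failing on it in every monitor cycle, can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual DatanodeAdminManager code or the committed patch: the class PendingNodesSketch, the method dropInvalidNodes, the reduced AdminState enum, and the second node address 5.6.7.8:9866 are all hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Hypothetical, simplified model of the monitor's tracked-node queue.
// None of these types are the real Hadoop classes.
class PendingNodesSketch {

    // Reduced set of admin states, assumed for this sketch.
    enum AdminState { IN_SERVICE, DECOMMISSION_INPROGRESS, ENTERING_MAINTENANCE }

    static final class Node {
        final String address;
        final AdminState state;

        Node(String address, AdminState state) {
            this.address = address;
            this.state = state;
        }
    }

    // One monitor pass: a node that is neither decommissioning nor entering
    // maintenance has no reason to be tracked, so it is removed from the
    // queue instead of raising an exception on every cycle.
    // Returns the addresses of the nodes that were dropped.
    static List<String> dropInvalidNodes(Queue<Node> tracked) {
        List<String> dropped = new ArrayList<>();
        Iterator<Node> it = tracked.iterator();
        while (it.hasNext()) {
            Node node = it.next();
            boolean valid = node.state == AdminState.DECOMMISSION_INPROGRESS
                    || node.state == AdminState.ENTERING_MAINTENANCE;
            if (!valid) {
                it.remove(); // stop tracking; a later startDecommission would re-add it
                dropped.add(node.address);
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        Queue<Node> tracked = new ArrayDeque<>();
        tracked.add(new Node("1.2.3.4:9866", AdminState.IN_SERVICE));
        tracked.add(new Node("5.6.7.8:9866", AdminState.DECOMMISSION_INPROGRESS));

        System.out.println("Dropped: " + dropInvalidNodes(tracked));
        System.out.println("Still tracked: " + tracked.size());
    }
}
```

In this sketch the In Service node is removed on the first pass, so later passes no longer encounter it, while the node that is still decommissioning stays in the queue.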