Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.1.4
Fix Version/s: None
Component/s: None
Labels: Patch
Description
The DataNode DiskChecker treats every IOException as a failed disk check. But some IOExceptions, such as "Too many open files" or "unable to create new native thread", say nothing about the volume's health state.
{code}
2021-01-11 19:17:10,751 | WARN | Thread-121065 | Removing failed volume /srv/BigData/hadoop/data17/dn/current: | FsVolumeList.java:247
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /srv/BigData/hadoop/data17/dn/current/BP-197188276-xxxxxxxx-1525514126952/current/finalized
	at org.apache.hadoop.util.DiskChecker.checkAccessByRWFile(DiskChecker.java:235)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.checkDirs(BlockPoolSlice.java:346)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.checkDirs(FsVolumeImpl.java:938)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.checkDirs(FsVolumeList.java:245)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.checkDataDir(FsDatasetImpl.java:2234)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:3537)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.access$900(DataNode.java:254)
	at org.apache.hadoop.hdfs.server.datanode.DataNode$8.run(DataNode.java:3571)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Too many open files
	at java.io.UnixFileSystem.createFileExclusively(Native Method)
	at java.io.File.createNewFile(File.java:1012)
	at org.apache.hadoop.util.DiskChecker.checkAccessByRWFile(DiskChecker.java:232)
	... 8 more
{code}
We should handle these IOExceptions more precisely and only mark a volume as failed when the exception actually indicates a disk problem.
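A minimal sketch of one possible approach (not the attached patch; the class name, helper method, and message list below are hypothetical): walk the exception's cause chain and treat known resource-exhaustion messages as non-disk failures, so the volume is not removed for them.

{code:java}
import java.io.IOException;

/**
 * Hypothetical sketch: classify an IOException thrown during a disk
 * check as either a genuine volume error or a process-level
 * resource-exhaustion condition unrelated to disk health.
 */
public class DiskCheckExceptionClassifier {

  /** Messages that indicate resource exhaustion, not a bad disk (assumed list). */
  private static final String[] NON_DISK_MESSAGES = {
      "Too many open files",                 // fd exhaustion (EMFILE)
      "unable to create new native thread"   // thread-creation failure
  };

  /**
   * Returns true if the exception likely indicates a real volume
   * failure; false if it looks like resource exhaustion, in which
   * case the caller should not remove the volume.
   */
  public static boolean indicatesVolumeFailure(IOException e) {
    // Inspect the whole cause chain: DiskChecker wraps the root
    // IOException (e.g. "Too many open files") in a DiskErrorException.
    for (Throwable t = e; t != null; t = t.getCause()) {
      String msg = t.getMessage();
      if (msg == null) {
        continue;
      }
      for (String nonDisk : NON_DISK_MESSAGES) {
        if (msg.contains(nonDisk)) {
          return false; // resource exhaustion, volume is likely healthy
        }
      }
    }
    return true; // default: assume the volume really failed
  }

  public static void main(String[] args) {
    // Mirrors the log above: fd exhaustion wrapped in a directory-check error.
    IOException fdExhaustion = new IOException(
        "Directory is not writable: /data/dn/current",
        new IOException("Too many open files"));
    IOException realDiskError = new IOException("Input/output error");

    System.out.println(indicatesVolumeFailure(fdExhaustion)); // false
    System.out.println(indicatesVolumeFailure(realDiskError)); // true
  }
}
{code}

With a check like this in the checkDirs path, the volume in the log above would be kept and the check retried later, instead of being removed because the process ran out of file descriptors.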