Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Cannot Reproduce
Description
We get hundreds of WARN-level exceptions per day indicating errors while trying to read local blocks. Each time this occurs, I have checked the local box's dfs.data.dir and the block is not present. Here is a relevant snippet from the logs regarding the missing block. It looks like the DataNode deletes the block and then tries to read it again later.
NOTE: the block belongs to the jar file, since up to 8 hosts report this exception for a single block and our data replication factor is only 3.