Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
DataNode currently ignores files it does not recognize. As a result, many files can remain in a DataNode's storage without ever being noticed or deleted. Such files can be left behind by bugs or by misconfiguration. For example, during an upgrade from 0.17, the DN left many metadata files that were not named in the correct format for 0.18 (HADOOP-4663).
The proposal here is simply to make the DN print a warning for each unknown file at startup. This at least gives a way to list all the unknown files and, equally importantly, forces a notion of "known" and "unknown" files in the storage.
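A rough sketch of what such a startup scan might look like is below. It is not DataNode code. The class `UnknownFileScanner`, its methods, and the three name patterns (block files as `blk_<id>`, metadata files as `blk_<id>_<genstamp>.meta`, nested directories as `subdir<N>`) are assumptions made for illustration, and warnings go to standard error rather than the DataNode's own logger.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: lists files in a storage directory that match no known name pattern. */
public class UnknownFileScanner {
    // Assumed naming rules, for illustration only; the real rules belong to the DataNode.
    private static final Pattern BLOCK_FILE = Pattern.compile("blk_-?\\d+");
    private static final Pattern META_FILE = Pattern.compile("blk_-?\\d+_\\d+\\.meta");
    private static final Pattern SUBDIR = Pattern.compile("subdir\\d+");

    /** Returns true if the file name matches one of the assumed "known" patterns. */
    static boolean isKnownFile(String name) {
        return BLOCK_FILE.matcher(name).matches() || META_FILE.matcher(name).matches();
    }

    /** Walks dir (descending only into known subdirectories) and returns unknown entries, sorted. */
    static List<Path> findUnknown(Path dir) throws IOException {
        List<Path> unknown = new ArrayList<>();
        collect(dir, unknown);
        Collections.sort(unknown);
        return unknown;
    }

    private static void collect(Path dir, List<Path> unknown) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (Files.isDirectory(entry)) {
                    if (SUBDIR.matcher(name).matches()) {
                        collect(entry, unknown);   // known directory: keep scanning inside it
                    } else {
                        unknown.add(entry);        // unknown directory: report it, do not descend
                    }
                } else if (!isKnownFile(name)) {
                    unknown.add(entry);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: UnknownFileScanner <storage-dir>");
            return;
        }
        // One warning per unknown file, as the proposal suggests.
        for (Path p : findUnknown(Paths.get(args[0]))) {
            System.err.println("WARN: unknown file in DataNode storage: " + p);
        }
    }
}
```

Keeping the "known" patterns in one place is the point of the sketch: whatever the real rules are, the scan only works once they are written down explicitly, which is the notion of "known" and "unknown" files the proposal asks for.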