2 improvements are suggested for implementation of methods org.apache.hadoop.fs.FileUtil.fullyDelete(File) and org.apache.hadoop.fs.FileUtil.fullyDeleteContents(File):
1) We should grant +rwx permissions the target directories before trying to delete them.
The mentioned methods fail to delete directories that don't have read or execute permissions.
Actual problem appears if an hdfs-related test is timed out (with a short timeout like tens of seconds), and the forked test process is killed, some directories are left on disk that are not readable and/or executable. This prevents next tests from being executed properly because these directories cannot be deleted with FileUtil#fullyDelete(), so many subsequent tests fail. So, its recommended to grant the read, write, and execute permissions the directories whose content is to be deleted.
2) Generic reliability improvement: we shouldn't rely upon File#delete() return value, use File#exists() instead.
FileUtil#fullyDelete() uses return value of method java.io.File#delete(), but this is not reliable because File#delete() returns true only if the file was deleted as a result of the #delete() method invocation. E.g. in the following code
if the file f was deleted by another thread or process between calls "1" and "2", this fragment will return "false", while the file f does not exist upon the method return.
So, better to write