Bikas, thanks for the feedback.
I thought that static blocks are executed once per class loader. So I am not sure why this one would be executed per inner class object creation.
It's true that static initialization happens at class load time, so once per JVM/class loader. The problem here is that the test is submitting a MapReduce job with a mapper class defined as a nested class of the test class. Then, each map task runs inside its own JVM/class loader. Therefore, each separate JVM loads the class separately and executes the static block separately. All of these JVMs were trying to create a MiniDFSCluster with the same configuration, so they were all colliding on the same directories for namenode metadata and datanode blocks.
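As a minimal sketch of the behavior described above (the class and method names here are hypothetical, not taken from the patch): within a single JVM and class loader, the static block of a nested class runs once no matter how many instances are created. Each map task JVM would run this same initialization again on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

class StaticInitDemo {
    // Counts how many times the nested class's static block has run in this JVM.
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    // Hypothetical stand-in for a mapper declared as a nested class of a test class.
    static class NestedMapper {
        static {
            // Any shared setup placed here (for example, starting a mini cluster)
            // runs once per class loader, and therefore once per task JVM.
            INIT_COUNT.incrementAndGet();
        }
    }

    // Creates n instances and reports how many times the static block has run.
    static int createMappers(int n) {
        for (int i = 0; i < n; i++) {
            new NestedMapper();
        }
        return INIT_COUNT.get();
    }

    public static void main(String[] args) {
        // Prints "static block runs: 1" even though three instances are created.
        System.out.println("static block runs: " + createMappers(3));
    }
}
```

Running this in N separate JVMs would execute the static block N times in total, which is why identical directory settings collided across the map tasks.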
I would be wary of simply doubling the test timeouts.
The biggest driver of timeout changes has been differences in developer environments. Test timeouts have been problematic for people like me who primarily develop on Mac or Linux but now want to contribute to Windows compatibility. We end up needing to run tests on under-powered VMs. Timeout values are generally an arbitrary choice by the original author of the test, based on the performance of his or her own machine at the time. There has been some discussion of parameterizing JUnit to scale the timeouts up or down to suit the development hardware. For now, I don't have a better solution than increasing the timeouts.
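To make the timeout-scaling idea concrete, here is a rough sketch of one way it could work. This is only an assumption about a possible design: the property name `test.timeout.scale` and the `ScaledTimeout` helper are hypothetical and do not exist in Hadoop or JUnit.

```java
final class ScaledTimeout {
    // Hypothetical system property; not an existing Hadoop or JUnit setting.
    static final String SCALE_PROPERTY = "test.timeout.scale";

    private ScaledTimeout() {
    }

    // Multiplies a base timeout by the factor given in the system property.
    // Falls back to a factor of 1.0 when the property is unset or unusable.
    static long scale(long baseMillis) {
        String raw = System.getProperty(SCALE_PROPERTY, "1.0");
        double factor;
        try {
            factor = Double.parseDouble(raw);
        } catch (NumberFormatException e) {
            factor = 1.0;
        }
        // Rejects zero, negative, NaN, and infinite factors.
        if (!(factor > 0) || Double.isInfinite(factor)) {
            factor = 1.0;
        }
        return Math.round(baseMillis * factor);
    }
}
```

A developer on a slow VM could then pass something like `-Dtest.timeout.scale=2.5` to the test JVM, and a test would use `ScaledTimeout.scale(30000)` wherever it currently hard-codes 30000.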
This can probably be done once instead of multiple times, right? I am assuming this is a slow filesystem operation.
This logic iterates over multiple distinct localized resources, potentially a mix of files and directories, so we need to check isDirectory for each one.
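A simplified sketch of why the check cannot be hoisted out of the loop: each resource is a different path, so the result for one says nothing about the next. The `ResourceEntries` class is hypothetical, and appending a trailing slash to directories is only an illustrative assumption about what the per-resource result might be used for, not a description of the patch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class ResourceEntries {
    private ResourceEntries() {
    }

    // Builds one entry per resource. The isDirectory check is repeated for
    // every element because each element is a distinct filesystem path.
    static List<String> toEntries(List<File> resources) {
        List<String> entries = new ArrayList<>();
        for (File resource : resources) {
            String entry = resource.getPath();
            if (resource.isDirectory()) {
                // Illustrative only: mark directories with a trailing slash.
                entry = entry + "/";
            }
            entries.add(entry);
        }
        return entries;
    }
}
```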
By the way, there doesn't seem to be a test for explicitly adding local resources to the classpath in this patch, right?
No, this case is already covered by the existing test TestMRJobs#testDistributedCache. The test fails before this patch and passes after this patch.
Finally, this will have to be split into common, MR, and YARN JIRAs and patches, though we will need a combined patch to get a successful Jenkins run. We can attach the combined patch to the common JIRA because that will be committed first.
This is done. The related JIRAs are:
HADOOP-9488, MAPREDUCE-4987, and YARN-593.