Here's a patch that uses a lockless approach to the problem. The variables that were being synchronized were treated as read-only, except for the dirNumLastAccessed field. The patch therefore wraps the relevant variables in a context object, and when the conf is updated the context reference is atomically swapped out for a new context object. Each method that needs the context makes a local copy of the reference and only uses the context through that local reference, so it always sees a self-consistent context. The tradeoff is that the context a method is looking at may be stale, but it will never be a mix of old and new values. The dirNumLastAccessed field is treated in a similar fashion: it is read once into a local variable, and the local variable is then used for iteration purposes.
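To make the idea concrete, here is a minimal sketch of the context-swap pattern described above. It is not the patch itself: the class and method names (LocklessAllocatorSketch, Context, confChanged, nextDir) are made up for illustration, the directories are plain Strings split on commas, and holding dirNumLastAccessed in an AtomicInteger is an assumption about how the single read might be done.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical names throughout; a sketch of the pattern, not the actual patch.
class LocklessAllocatorSketch {

    /** Immutable snapshot of everything derived from one configuration value. */
    static final class Context {
        final String confValue;   // the configuration string this snapshot was built from
        final List<String> dirs;  // directories parsed from confValue

        Context(String confValue) {
            this.confValue = confValue;
            this.dirs = Collections.unmodifiableList(Arrays.asList(confValue.split(",")));
        }
    }

    private final AtomicReference<Context> current;

    // Assumption: an AtomicInteger stands in for the dirNumLastAccessed field.
    private final AtomicInteger dirNumLastAccessed = new AtomicInteger(0);

    public LocklessAllocatorSketch(String confValue) {
        this.current = new AtomicReference<>(new Context(confValue));
    }

    /** Publishes a fresh context if the configuration changed; readers are never blocked. */
    public void confChanged(String newConfValue) {
        Context ctx = current.get();
        if (!ctx.confValue.equals(newConfValue)) {
            // Concurrent readers keep using whichever snapshot they already copied.
            current.set(new Context(newConfValue));
        }
    }

    /** Round-robin pick that reads each piece of shared state exactly once. */
    public String nextDir() {
        Context ctx = current.get();            // local copy of the context reference
        int start = dirNumLastAccessed.get();   // local copy of the counter
        // The counter may have been written against a larger directory list,
        // so reduce it modulo the size of the snapshot we actually hold.
        int index = Math.floorMod(start, ctx.dirs.size());
        // Racy on purpose: a lost update only affects round-robin fairness.
        dirNumLastAccessed.set((index + 1) % ctx.dirs.size());
        return ctx.dirs.get(index);
    }
}
```

Because Context is immutable and only reached through the local ctx variable, a concurrent confChanged() can never leave nextDir() looking at a half-updated directory list. The worst case is that it picks from the previous list one more time.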
Since this is primarily a performance optimization, the patch also changes the local directories to be stored as Path objects rather than Strings, so we don't need to create so many Paths from scratch inside the methods.
I ran some performance numbers on multithreaded accesses with this change and was a bit surprised to see it was significantly slower than the old version. However, that's because the change in HADOOP-9652 causes fs.exists() to fork-and-exec the stat command, and before this change that was done serially across threads. With this change it effectively becomes a mini fork-bomb, forking stat in parallel like crazy. When I reverted the HADOOP-9652 change locally, the patched version was about 2.5x faster than the original version with 8 threads across 12 local directories.
The speedup is nice, but I'm not sure this needs to be a Blocker for the 2.3.0 release. If the filesystem is the real bottleneck (e.g., accessing one of the drives is always really, really slow, or fs.exists is using fork-and-exec stat), then this change will only marginally help in most use cases. Eventually all of the serving threads are going to be hung up waiting for the filesystem, which is only a bit better (or sometimes worse, as in the fork-and-exec case) than waiting serially. The real throughput of a server using the allocator may not be significantly improved.