In my opinion, cached directories have one big advantage over freshly instantiated ones:
They are forced to use the same locking mechanism. If somebody creates a directory using one LockFactory, writes to the index, and in a parallel thread uses another locking mechanism with a separate dir instance, he corrupts his index. From that point of view, having only one directory instance per resource is a good thing (it does not help across different JVM processes, of course).
But, I don't think this is a strong enough reason for Lucene to be
doing such magic under-the-hood, going forward. This magic leads to
other problems (like
> Ie, it's only if you use the new FSDir.open() API that you get the new behavior. I intentionally went and fixed tests to use FSDir.open so that we stress the new functionality, which then led us to discover tests making invalid assumptions, which we should then fix.
This is correct. For unit testing, I have now found that it is much simpler to check whether all tests also work on other platforms if you set the FSDir system property when running the tests. With open() this is currently not possible.
Before committing we should confirm all tests pass if we temporarily
hardwire open to return each of the 3 FSDir impls.
But I don't think this is reason enough to leave the global system
property in place for real usage of Lucene.
Maybe I should re-enable the caching, but let getDirectory() still use the new behaviour if the system property is not set. We could then just remove the caching in 3.0, but keep getDirectory() alive. I am not sure.
But you've still unnecessarily broken back-compat with that. By
making a new method (open), which does neither the magic singleton
caching nor the global system prop, back-compat users are guaranteed
to see no changes.
In my opinion, this is no more serious a bw-change than a small behaviour change that can be documented in CHANGES.txt. We have more serious ones.
I would strongly tend to remove the cache entirely and write a warning into CHANGES.txt.
In any case, I do not really think anybody has implemented their own subclass of FSDir. The bigger bw-change in the current patch is that the protected no-arg ctors no longer exist and are no longer used.
Why take that chance unnecessarily? What are we gaining by changing
getDirectory so much in place, vs switching to a new (open) API? It's
entirely possible apps have subclassed FSDir, rely on the singleton
cache and rely on the global system prop. Making a new API, and
deprecating in favor of that, won't affect back-compat users at all.
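
To make the two options concrete, here is a minimal sketch of the difference between a cached factory method and a plain one. Everything in it is hypothetical: `Dir`, `DirFactorySketch`, and the string used for the lock factory are stand-ins invented for illustration, not Lucene's FSDirectory or LockFactory, and the "first caller wins" handling of a conflicting lock factory is a simplification chosen for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class DirFactorySketch {
    /** Hypothetical stand-in for a directory; not Lucene's FSDirectory. */
    static class Dir {
        final String path;
        final String lockFactory; // stand-in for a LockFactory instance

        Dir(String path, String lockFactory) {
            this.path = path;
            this.lockFactory = lockFactory;
        }
    }

    // One shared instance per path (the "magic singleton caching").
    private static final Map<String, Dir> CACHE = new HashMap<>();

    /** Old style: cached, so every caller for a path shares one instance. */
    static synchronized Dir getDirectory(String path, String lockFactory) {
        Dir d = CACHE.get(path);
        if (d == null) {
            d = new Dir(path, lockFactory);
            CACHE.put(path, d);
        }
        // Simplification: the first caller's lock factory wins.
        return d;
    }

    /** New style: always a fresh instance, no cache and no global state. */
    static Dir open(String path, String lockFactory) {
        return new Dir(path, lockFactory);
    }

    public static void main(String[] args) {
        Dir a = getDirectory("/idx", "native");
        Dir b = getDirectory("/idx", "simple");
        System.out.println(a == b);        // true: same cached instance
        System.out.println(b.lockFactory); // native: second request ignored

        Dir c = open("/idx", "native");
        Dir d = open("/idx", "simple");
        System.out.println(c == d);        // false: independent instances
    }
}
```

With the cached method, both callers are forced onto one locking mechanism; with the uncached one, keeping the lock factories consistent becomes the caller's responsibility. Existing callers of the old method see no change either way, which is the back-compat argument for adding a separate method.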