I actually started working on that a long time ago and had a patch which I never posted because it wasn't ready. Back then I was thinking about how the search and taxonomy indexes can be kept in sync. I.e. in your patch, opening a pair on a Directory assumes that the pair is "valid", which may not be the case at all. For instance, when you commit such a pair, you need to first commit TW, and only then IW. That way, TW always contains ordinals that are >= what IW knows about, in which case they are "valid". So if the commit to TW succeeds and the commit to IW fails, you don't end up in an inconsistent state. However, if an app makes a mistake and commits them in a different order, the pair may not be valid. I'm willing to live with documentation that says "you should commit the pair like so...".
But if the app's commit logic is fine, yet it opened IW and TW with OpenMode.CREATE, then the taxonomy will include ordinals that may be completely unrelated to what IW stores, right? For that reason, I created (in the un-posted patch) a taxonomy timestamp class (which we can treat as a version or something similar) which is written to both TW's and IW's commitData, and the manager checks for it at initialization to ensure the two actually match.
The taxonomy writer records an index.epoch property in its internal IW's commitData, which keeps track of how many times the taxonomy has been recreated (opened w/ OpenMode.CREATE, or replaceTaxonomy). TaxoReader checks it in openIfChanged and returns a new instance if the epoch has changed. I think I had issues recording that in the IW commitData as well, because you need to first commit IW, but the epoch on TW commitData is unknown until it is committed... and pulling it from DirTaxoWriter's member is dangerous because in between you might get a replaceTaxo call which increments the epoch.
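A minimal sketch of the initialization check described above, assuming both commits expose their commitData as a plain Map<String, String>. The EpochCheck class and sameEpoch method are hypothetical names for illustration, not anything from the patch; only the index.epoch key comes from the discussion. It does nothing about the race with replaceTaxo, it only shows the comparison the manager would make.

```java
import java.util.Map;

/** Hypothetical helper: compares the epoch recorded in two commitData maps. */
final class EpochCheck {
    /** Property name taken from the discussion above. */
    static final String EPOCH_KEY = "index.epoch";

    private EpochCheck() {}

    /**
     * Returns true only if both commits carry an epoch and the values match.
     * A missing epoch on the search index side is treated as a mismatch.
     */
    static boolean sameEpoch(Map<String, String> indexCommitData,
                             Map<String, String> taxoCommitData) {
        String indexEpoch = indexCommitData.get(EPOCH_KEY);
        String taxoEpoch = taxoCommitData.get(EPOCH_KEY);
        return indexEpoch != null && indexEpoch.equals(taxoEpoch);
    }
}
```

A manager could call sameEpoch once when it opens the pair and refuse to proceed on a mismatch, which would catch the "opened with OpenMode.CREATE" case.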
It's not that simple to keep these two in sync, so I put it aside until I have time to get back to it. Thanks for re-initiating!
What do you think about this? This recreate thing is delicate. Since apps can call IW.deleteAll() as well as TW.replaceTaxonomy, I'd like us to offer a solution that works in all cases, rather than say this manager doesn't work if these methods were called.
I think that we also need an object that manages the IW and TW pair, so that a faceted search app calls commit() on it, and it handles the delicate commit order + whatever metadata we need to commit to keep these two in sync. I wonder if it should be this manager? Then it can always take an IW and TW pair, and offer both the acquire/release logic as well as commit.
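To make the idea concrete, here is a rough sketch of such a pair object, assuming the taxonomy must be committed before the search index. Committable and IndexTaxonomyPair are hypothetical names made up for the example; in a real implementation the two fields would be the actual IW and TW, and the commit would also write whatever metadata is needed.

```java
import java.io.IOException;

/** Hypothetical stand-in for anything with a commit(), e.g. IW or TW. */
interface Committable {
    void commit() throws IOException;
}

/** Hypothetical holder that always commits the taxonomy before the search index. */
final class IndexTaxonomyPair {
    private final Committable indexWriter;
    private final Committable taxoWriter;

    IndexTaxonomyPair(Committable indexWriter, Committable taxoWriter) {
        this.indexWriter = indexWriter;
        this.taxoWriter = taxoWriter;
    }

    /**
     * Commits TW first, then IW. If the IW commit throws, the taxonomy only
     * holds extra ordinals that no committed document references yet.
     */
    synchronized void commit() throws IOException {
        taxoWriter.commit();
        indexWriter.commit();
    }
}
```

With this, the app never sees the two writers' commit() methods directly, so it cannot get the order wrong.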
About the patch:
- Why does the test use newFSDirectory? I didn't read through, but can't it work with RAMDirectory too?
- Manager.decRef(): I think you should call searcher.reader.incRef() if taxoReader.decRef() fails?
- It's odd that acquire() throws IOE ... I realize it's because of the decRef call in tryIncRef. I don't know if it's critical, but if it is, you may want to throw RuntimeEx?
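For the decRef() point, something along these lines is what I mean, as a sketch only. RefCounted and PairRefs are hypothetical stand-ins for the two readers and the manager's method. Note the rollback only helps if the first decRef did not drop the count to zero; if it already closed the search reader, incRef cannot bring it back.

```java
import java.io.IOException;

/** Hypothetical stand-in for a ref-counted reader. */
interface RefCounted {
    void incRef();
    void decRef() throws IOException;
}

/** Hypothetical helper that releases one reference on each reader of a pair. */
final class PairRefs {
    private PairRefs() {}

    /**
     * Decrements the search reader, then the taxonomy reader. If the taxonomy
     * decRef fails, the search reader's reference is restored so the two
     * counts stay paired, and the original exception is rethrown.
     */
    static void decRef(RefCounted searchReader, RefCounted taxoReader) throws IOException {
        searchReader.decRef();
        try {
            taxoReader.decRef();
        } catch (IOException | RuntimeException e) {
            searchReader.incRef(); // undo the first decRef
            throw e;
        }
    }
}
```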