|
Michael Busch made changes - 22/Aug/07 11:21 PM
This is good stuff, Hoss. I like the DirectoryIndexReader idea.
Sounds like a very clean and logical separation. I'll work through the code to understand all consequences. What do you think about this alternative approach:
The advantage here is that only one class (MultiSegmentReader) is If we go the DirectoryIndexReader way (where SegmentReader and So I'm not sure which approach is the better one. I'm hoping to get some i rarely remember a week later what i was thinking when i wrote something, but i suspect that when i suggested the DirectoryIndexReader i was assuming it would have everything directory/lock related thta currently exists in the IndexReader base class (including the directoryOwner boolean) ... in cases where there is a single Segment in a directory, there will be SegmentReader with directoryOwner==true ... in the multi segment cases, the MultiSegmentReader will have directoryOwner==true, and it's sub SegmentReaders will all have directoryOwner==false. ...
...the key point of DirectoryIndexReader being that any subclass can own a directory (and automaticly inherits all the code for dealing with locks properly when it needs/wants to) but doesn't always have to own the directory. meanwhile MultiReader (and ParallelIndexReader and FilteredIndexReader) make no attempt at owning a directory, and inherit no code for doing so (or for dealing with the locking of such non existent directories) I don't really know enough about the performance characteristics of SegmentReader vs a MultiSegmentReader of one segment to have a sense of how possible/practical it would be to eliminate the need for SegmentReader and replace it completely with MultiSegmentReader ... one hitch might be that SegmentReader.get is public, and in order to keep supporting it, SegmentReader still needs to have/inherit the same segment info and directory owning/locking code that we want to move out of IndexReader (so just putting it MultiSegmentReader won't fly unless we kill that public method) Here is the patch with the DirectoryIndexReader approach.
It moves SegmentInfos and Directory from IndexReader into MultiSegmentReader and SegmentReader extend DirectoryIndexReader. I added the method acquireWriteLock() to IndexReader that does IndexReader is very abstract now and almost all logic moved into All unit tests pass with this patch.
Michael Busch made changes - 13/Sep/07 06:54 AM
> one hitch might be that SegmentReader.get is public, and in order to keep
> supporting it, SegmentReader still needs to have/inherit the same segment info > and directory owning/locking code that we want to move out of IndexReader (so > just putting it MultiSegmentReader won't fly unless we kill that public method). OK, I implemented the DirectoryIndexReader approach. Also because I'm not sure I'd like to commit this rather soon. A review of the patch would be highly Michael: I've been meaning to look at this, but haven't had the time ... your recent update has goaded me
just to clarify: the patch you added on September 12th is your latest patch right? ... it's not clear from you comment on the 17th if you intended to attach an update and something went wrong. I ask because i'm haivng trouble applying the patch from the 12th ... i must be tired because i can't understand why, it doesn't look like the files have changed since you posted the patch, so i'm not sure what it's complaining about ... visually everything seems to match up... hossman@coaster:~/lucene/lucene$ svn update I got the patch to apply cleanly (see mailing list for details) On the whole it looks really good, i'm attaching an updated version with some minor improvements (mainly javadocs), but first a few questions...
here's the list of tweaks I made...
Hoss Man made changes - 18/Sep/07 08:18 AM
> I got the patch to apply cleanly (see mailing list for details)
Thanks, Hoss! I'm using TortoiseSVN, I have to check how to set those > * just to clarify: IndexReader(Directory) is only around for Yes, correct. > * IndexReader() says it can be removed once the other constructor is Yes, just a suggested simplification. Keeping the constructor wouldn't hurt > * since one of the goals is that IndexReaders which don't own their Sounds good, will do... > * the way TestMultiReader is setup with the "mode" is a bit confusing Yes, that's cleaner, will make that change as well. > here's the list of tweaks I made... The improvements look good to me. In addition to Hoss' changes this patch:
I'm planning to commit this in a day or so if nobody objects.
Michael Busch made changes - 18/Sep/07 08:24 PM
Michael Busch made changes - 20/Sep/07 07:30 AM
Michael Busch made changes - 25/Jan/08 03:24 AM
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
I haven't worked through all the implications, but perhaps the most logical refactoring would be...
...as abstract as possible given that we can't actually make methods abstract
...new class, encapsulated all the segmentInfos and locking logic currently in
IndexReader (can definitely be made abstract if helpful)
...(side note that i really haven't thought through completley: should
MultiReader extend FilterIndexReader?)
there would likely be some utlity functionality that could be reused between MultiReader and MultiSegmentReader ... possible as static methods in IndexReader (or a new util class)