Lucene - Core
  1. Lucene - Core
  2. LUCENE-2858

Separate SegmentReaders (and other atomic readers) from composite IndexReaders

    Details

    • Type: Task Task
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      With current trunk, whenever you open an IndexReader on a directory you get back a DirectoryReader which is a composite reader. The interface of IndexReader has now lots of methods that simply throw UOE (in fact more than 50% of all methods that are commonly used ones are unuseable now). This confuses users and makes the API hard to understand.

      This issue should split "atomic readers" from "reader collections" with a separate API. After that, you are no longer able, to get TermsEnum without wrapping from those composite readers. We currently have helper classes for wrapping (SlowMultiReaderWrapper - please rename, the name is really ugly; or Multi*), those should be retrofitted to implement the correct classes (SlowMultiReaderWrapper would be an atomic reader but takes a composite reader as ctor param, maybe it could also simply take a List<AtomicReader>). In my opinion, maybe composite readers could implement some collection APIs and also have the ReaderUtil method directly built in (possibly as a "view" in the util.Collection sense). In general composite readers do not really need to look like the previous IndexReaders, they could simply be a "collection" of SegmentReaders with some functionality like reopen.

      On the other side, atomic readers do not need reopen logic anymore? When a segment changes, you need a new atomic reader? - maybe because of deletions thats not the best idea, but we should investigate. Maybe make the whole reopen logic simplier to use (ast least on the collection reader level).

      We should decide about good names, i have no preference at the moment.

      1. LUCENE-2858-FixSlowEnsureOpen.patch
        1 kB
        Uwe Schindler
      2. LUCENE-2858-FCinsanity.patch
        10 kB
        Uwe Schindler
      3. LUCENE-2858.patch
        724 kB
        Uwe Schindler
      4. LUCENE-2858.patch
        724 kB
        Uwe Schindler

        Issue Links

          Activity

          Hide
          Michael McCandless added a comment -

          +1

          We very much need this now – it should be strongly (statically) typed, whether you are using an AtomicIndexReader vs CompositeIndexReader.

          And I agree CompositeIndexReader should feel more like a Collection than what IR is today.

          Show
          Michael McCandless added a comment - +1 We very much need this now – it should be strongly (statically) typed, whether you are using an AtomicIndexReader vs CompositeIndexReader. And I agree CompositeIndexReader should feel more like a Collection than what IR is today.
          Hide
          Robert Muir added a comment -

          (SlowMultiReaderWrapper - please rename, the name is really ugly; or Multi*)

          My problem with Multi* is that this makes things sound cool and powerful.
          If they are really slow transition mechanisms, I prefer Slow* as it will actually discourage their use.

          But I agree with this issue in general and it would be great for it to be cleaned up.

          Show
          Robert Muir added a comment - (SlowMultiReaderWrapper - please rename, the name is really ugly; or Multi*) My problem with Multi* is that this makes things sound cool and powerful. If they are really slow transition mechanisms, I prefer Slow* as it will actually discourage their use. But I agree with this issue in general and it would be great for it to be cleaned up.
          Hide
          Earwin Burrfoot added a comment -

          On the other side, atomic readers do not need reopen logic anymore? When a segment changes, you need a new atomic reader?

          There is a freakload of places that "upgrade" SegmentReader in various ways, with deletions guilty only for the part of the cases. I'll try getting back to LUCENE-2355 at the end of the week.

          Show
          Earwin Burrfoot added a comment - On the other side, atomic readers do not need reopen logic anymore? When a segment changes, you need a new atomic reader? There is a freakload of places that "upgrade" SegmentReader in various ways, with deletions guilty only for the part of the cases. I'll try getting back to LUCENE-2355 at the end of the week.
          Hide
          Uwe Schindler added a comment - - edited

          Any comments about removing write access from IndexReaders? I think setNorms() will be removed soon, but how about the others like deleteDocument()? I would propose to also make all IndexReaders simply readers not writers?

          Show
          Uwe Schindler added a comment - - edited Any comments about removing write access from IndexReaders? I think setNorms() will be removed soon, but how about the others like deleteDocument()? I would propose to also make all IndexReaders simply readers not writers?
          Hide
          Robert Muir added a comment -

          I think setNorms() will be removed soon

          Why do you think this?

          On the norms cleanup issue, i only removed setNorm(float), because its completely useless.
          All it did was call Similarity.getDefault().encode(float) + setNorm(byte).

          Show
          Robert Muir added a comment - I think setNorms() will be removed soon Why do you think this? On the norms cleanup issue, i only removed setNorm(float), because its completely useless. All it did was call Similarity.getDefault().encode(float) + setNorm(byte).
          Hide
          Uwe Schindler added a comment -

          I was talking about replacing norms by CSF, maybe It's just not soon.

          Show
          Uwe Schindler added a comment - I was talking about replacing norms by CSF, maybe It's just not soon.
          Hide
          Robert Muir added a comment -

          Ah, ok. sorry i was confused. Still, i think we would need this method (somewhere) even
          with CSF, so that people can change the norms and they instantly take effect for searches.

          Show
          Robert Muir added a comment - Ah, ok. sorry i was confused. Still, i think we would need this method (somewhere) even with CSF, so that people can change the norms and they instantly take effect for searches.
          Hide
          Earwin Burrfoot added a comment -

          Any comments about removing write access from IndexReaders? I think setNorms() will be removed soon, but how about the others like deleteDocument()? I would propose to also make all IndexReaders simply readers not writers?

          Voting with all my extremities - yes!!

          Show
          Earwin Burrfoot added a comment - Any comments about removing write access from IndexReaders? I think setNorms() will be removed soon, but how about the others like deleteDocument()? I would propose to also make all IndexReaders simply readers not writers? Voting with all my extremities - yes!!
          Hide
          Earwin Burrfoot added a comment -

          Still, i think we would need this method (somewhere) even with CSF, so that people can change the norms and they instantly take effect for searches.

          This still puzzles me. I can strain my imagination, and get people who just need to change norms without reindexing.
          But doing this and requiring instant turnaround? Kid me not

          Show
          Earwin Burrfoot added a comment - Still, i think we would need this method (somewhere) even with CSF, so that people can change the norms and they instantly take effect for searches. This still puzzles me. I can strain my imagination, and get people who just need to change norms without reindexing. But doing this and requiring instant turnaround? Kid me not
          Hide
          Michael McCandless added a comment -

          I don't think we should remove setNorm/deleteDocuments, even from the composite reader class.

          Deleting docs from IR has advantages over deleting from IW: the change is "live" to any searches running on that IR; you get an immediate count of how many docs were deleted; you can delete by docID.

          setNorm is also useful in that it can be use to boost docs (globally), live, if that reader is being used for searching. When/if we cutover norms -> doc values we'll have to decide what to do about setNorm...

          At a higher level, for this "strong typing of atomic vs composite IRs", we shouldn't try to change functionality – let's just do a rote refactoring, such that methods that now throw UOE on IR are moved to the atomic reader only.

          Separately we can think about whether existing functions should be dropped...

          Show
          Michael McCandless added a comment - I don't think we should remove setNorm/deleteDocuments, even from the composite reader class. Deleting docs from IR has advantages over deleting from IW: the change is "live" to any searches running on that IR; you get an immediate count of how many docs were deleted; you can delete by docID. setNorm is also useful in that it can be use to boost docs (globally), live, if that reader is being used for searching. When/if we cutover norms -> doc values we'll have to decide what to do about setNorm... At a higher level, for this "strong typing of atomic vs composite IRs", we shouldn't try to change functionality – let's just do a rote refactoring, such that methods that now throw UOE on IR are moved to the atomic reader only. Separately we can think about whether existing functions should be dropped...
          Hide
          Marvin Humphrey added a comment -

          > Deleting docs from IR has advantages over deleting from IW: the change is
          > "live" to any searches running on that IR; you get an immediate count of how
          > many docs were deleted; you can delete by docID.

          Alternate plan:

          • Move responsibility for deletions to a pluggable DeletionsReader
            subcomponent of SegmentReader.
          • Have the default DeletionsReader be read-only.
          • People who need the esoteric functionality described above can use a
            subclass of DeletionsReader.
          Show
          Marvin Humphrey added a comment - > Deleting docs from IR has advantages over deleting from IW: the change is > "live" to any searches running on that IR; you get an immediate count of how > many docs were deleted; you can delete by docID. Alternate plan: Move responsibility for deletions to a pluggable DeletionsReader subcomponent of SegmentReader. Have the default DeletionsReader be read-only. People who need the esoteric functionality described above can use a subclass of DeletionsReader.
          Hide
          Earwin Burrfoot added a comment -

          APIs have to be there still. All that commity, segment-deletery, mutabley stuff (that spans both atomic and composite readers).
          So, while your plan is viable, it won't remove that much cruft.

          Show
          Earwin Burrfoot added a comment - APIs have to be there still. All that commity, segment-deletery, mutabley stuff (that spans both atomic and composite readers). So, while your plan is viable, it won't remove that much cruft.
          Hide
          Uwe Schindler added a comment - - edited

          I try to do this now, Robert will help.

          Most crap is removed form IndexReader, so this should be easy now. We should use AtomicIndexReader and CompositeIndexReader.

          • IndexReader base class only contains the methods that can actually be implemented in every reader.
          • The current BaseMultiReader (pkg-private) would be the later CompositeIndexReader
          • AtomicIndexReader is also abstract, SegmentReader and SlowMultiReaderWrapper would implement it.
          • The static open methods should maybe in a separate class or the abstract IndexReader base with the limited API, but they would return CompositeIndexReader
          • FilterIndexReader could be split in two classes, but a FilterMultiReader makes no sense FilterIndexReader will be FilterAtomicIndexReader (also the superclass of SlowMultiReaderWrapper)
          Show
          Uwe Schindler added a comment - - edited I try to do this now, Robert will help. Most crap is removed form IndexReader, so this should be easy now. We should use AtomicIndexReader and CompositeIndexReader. IndexReader base class only contains the methods that can actually be implemented in every reader. The current BaseMultiReader (pkg-private) would be the later CompositeIndexReader AtomicIndexReader is also abstract, SegmentReader and SlowMultiReaderWrapper would implement it. The static open methods should maybe in a separate class or the abstract IndexReader base with the limited API, but they would return CompositeIndexReader FilterIndexReader could be split in two classes, but a FilterMultiReader makes no sense FilterIndexReader will be FilterAtomicIndexReader (also the superclass of SlowMultiReaderWrapper)
          Hide
          Simon Willnauer added a comment -

          uwe are you gonna work on this?

          Show
          Simon Willnauer added a comment - uwe are you gonna work on this?
          Hide
          Uwe Schindler added a comment -

          I will hopefully start this weekend. I just have a schedule currently that has no slots left for that.

          Robert and I wanted to start a branch soon, maybe I simply do that as first step soon!

          Show
          Uwe Schindler added a comment - I will hopefully start this weekend. I just have a schedule currently that has no slots left for that. Robert and I wanted to start a branch soon, maybe I simply do that as first step soon!
          Hide
          Uwe Schindler added a comment -

          Simon: Just to inform you, I am working on this. Currently I have a heavy broken checkout that does no longer compile at all Working, working, working... It's a mess!

          Once I have something intially compiling for core (not tests), I will create a branch!

          Show
          Uwe Schindler added a comment - Simon: Just to inform you, I am working on this. Currently I have a heavy broken checkout that does no longer compile at all Working, working, working... It's a mess! Once I have something intially compiling for core (not tests), I will create a branch!
          Hide
          Uwe Schindler added a comment - - edited

          I created the branch at https://svn.apache.org/repos/asf/lucene/dev/branches/lucene2858 and committed my first steps:

          • Add CompositeIndexReader and AtomicIndexReader
          • Moved methods around, still not yet finished (see below)
          • DirectoryReader is public now and is returned by IR.open() and IW.getReader()

          TODO:

          • IR.openIfChanged makes no sense for any reader other than DirectoryReader, let's move it also there
          • isCurrent and getVersion() is also useless for atomic readers and composite readers except DR
          • The strange generics in ReaderContext caused by the final field will go away, when changing reader field to aaccessor method returning the correct type (by return type overloading).

          Comments welcome and also heavy committing.

          Show
          Uwe Schindler added a comment - - edited I created the branch at https://svn.apache.org/repos/asf/lucene/dev/branches/lucene2858 and committed my first steps: Add CompositeIndexReader and AtomicIndexReader Moved methods around, still not yet finished (see below) DirectoryReader is public now and is returned by IR.open() and IW.getReader() TODO: IR.openIfChanged makes no sense for any reader other than DirectoryReader, let's move it also there isCurrent and getVersion() is also useless for atomic readers and composite readers except DR The strange generics in ReaderContext caused by the final field will go away, when changing reader field to aaccessor method returning the correct type (by return type overloading). Comments welcome and also heavy committing.
          Hide
          Uwe Schindler added a comment -

          Some TODOs fixed:

          • The strange generics in ReaderContext caused by the final field will go away, when changing reader field to aaccessor method returning the correct type (by return type overloading).
          • Some refactoring done on docFreq()
          • Moved the ReaderContext impls to the corresponding abstract reader class: CompositeReaderContext -> CompositeIndexReader, AtomicReaderContext -> AtomicIndexReader
          Show
          Uwe Schindler added a comment - Some TODOs fixed: The strange generics in ReaderContext caused by the final field will go away, when changing reader field to aaccessor method returning the correct type (by return type overloading). Some refactoring done on docFreq() Moved the ReaderContext impls to the corresponding abstract reader class: CompositeReaderContext -> CompositeIndexReader, AtomicReaderContext -> AtomicIndexReader
          Hide
          Uwe Schindler added a comment -

          More TODOs:

          • Rename SlowMultiReaderWrapper -> SlowCompositeReaderWrapper (as its not only for MultiReaders, also other CompositeIndexReaders like DirectoryReader. Name is not in line with our current naming
          • All version/commit/reopen stuff should go into DirectoryReader, its useless for the abstract IR interfaces
          Show
          Uwe Schindler added a comment - More TODOs: Rename SlowMultiReaderWrapper -> SlowCompositeReaderWrapper (as its not only for MultiReaders, also other CompositeIndexReaders like DirectoryReader. Name is not in line with our current naming All version/commit/reopen stuff should go into DirectoryReader, its useless for the abstract IR interfaces
          Hide
          Uwe Schindler added a comment -

          I now fixed the branch's test-framework and all remaining TODOs about the API.

          Now the horrible stupid slave-work to port all test starts. I assume the API is now fixed, as nobody complained after one week.

          Show
          Uwe Schindler added a comment - I now fixed the branch's test-framework and all remaining TODOs about the API. Now the horrible stupid slave-work to port all test starts. I assume the API is now fixed, as nobody complained after one week.
          Hide
          Yonik Seeley added a comment -

          SlowMultiReaderWrapper - please rename, the name is really ugly; or Multi*

          +1, the Slow* is misleading as it makes it seem like there's a faster way you should be doing it.
          CompositeReaderWrapper should be fine. And no, it doesn't sound too "cool" for the hypothetical developers who use that as a criteria when coding

          Other possibilities include AtomicReaderEmulator, AtomicEmulatorReader, CompositeAsAtomicReader, etc

          Show
          Yonik Seeley added a comment - SlowMultiReaderWrapper - please rename, the name is really ugly; or Multi* +1, the Slow* is misleading as it makes it seem like there's a faster way you should be doing it. CompositeReaderWrapper should be fine. And no, it doesn't sound too "cool" for the hypothetical developers who use that as a criteria when coding Other possibilities include AtomicReaderEmulator, AtomicEmulatorReader, CompositeAsAtomicReader, etc
          Hide
          Robert Muir added a comment -

          Can we please do some eclipse-renames like:

          AtomicIndexReader -> AtomicReader
          AtomicIndexReader.AtomicReaderContext -> AtomicReader.Context

          The verbosity of the api is killing me

          Show
          Robert Muir added a comment - Can we please do some eclipse-renames like: AtomicIndexReader -> AtomicReader AtomicIndexReader.AtomicReaderContext -> AtomicReader.Context The verbosity of the api is killing me
          Hide
          Michael McCandless added a comment -

          +1 for those names.

          Show
          Michael McCandless added a comment - +1 for those names.
          Hide
          Uwe Schindler added a comment -

          Jaja, will fix this...

          Show
          Uwe Schindler added a comment - Jaja, will fix this...
          Hide
          Uwe Schindler added a comment -

          I renamed the enclosing classes and also removed the public ctors from ReaderContexts (to prevent stupid things already reported on mailing lists).

          The renameing of ReaderContexts all to the same name Context, but with different enclosing class is a refactoring, Eclipse cannot do (it creates invalid code). It seems only NetBeans can do this, I will try to find a solution. The problem is that Eclipse always tries to import the inner class, what causes conflicts.

          Finally, e.g. the method getDocIdSet should look like getDocIdSet(AtomicReader.Context,...) [only importing AtomicReader], but Eclipse always tries to use Context [and import oal.AtomicReader.Context]. At the end we should have abstract IndexReader.Context, AtomicReader.Context, CompositeReader.Context.

          Will go to bed now.

          Show
          Uwe Schindler added a comment - I renamed the enclosing classes and also removed the public ctors from ReaderContexts (to prevent stupid things already reported on mailing lists). The renameing of ReaderContexts all to the same name Context, but with different enclosing class is a refactoring, Eclipse cannot do (it creates invalid code). It seems only NetBeans can do this, I will try to find a solution. The problem is that Eclipse always tries to import the inner class, what causes conflicts. Finally, e.g. the method getDocIdSet should look like getDocIdSet(AtomicReader.Context,...) [only importing AtomicReader] , but Eclipse always tries to use Context [and import oal.AtomicReader.Context] . At the end we should have abstract IndexReader.Context, AtomicReader.Context, CompositeReader.Context. Will go to bed now.
          Hide
          Uwe Schindler added a comment -

          After sleeping one more night about it, the easiest and most simple way to remove the stupid verbosity in imports:

          I will move the ReaderContexts out of their enclosing class and make a top-level class without ctor out of it. Then import statements look nice. The RCs are now so important for search, so they should be on its own.

          Show
          Uwe Schindler added a comment - After sleeping one more night about it, the easiest and most simple way to remove the stupid verbosity in imports: I will move the ReaderContexts out of their enclosing class and make a top-level class without ctor out of it. Then import statements look nice. The RCs are now so important for search, so they should be on its own.
          Hide
          Uwe Schindler added a comment - - edited

          Here the final patch. Thanks for the good work and lots of help from Robert and Mike.

          As this patch contains heavy changes, we will commit it as soon as possible so work can go on. It would be nice, if you would not commit anything until this is done.

          There are some minor issues open:

          • DirectoryReader is final and has only static factory methods. It is not possible to subclass it in any way. The problem is mainly Solr, as Solr accesses directory(), IndexCommits,... and therefore cannot work on abstract IndexReader anymore. This should be changed, by e.g. handling reopening in the IRFactory, also versions, commits,... Currently its not possible to implement any other IRFactory that returns something else.
          • The PayloadProcessorProvider has a broken API, this should be fixed. The current patch mimics the old behaviour, but not 100%.
          • ParallelReader is now atomic. We should add a sugar wrapper method to allow synchronized composite readers (with same segment sizes) to be aligned with MultiReaders or wrapped by Slow.

          The remaining issues will be fixed in later issues!

          Show
          Uwe Schindler added a comment - - edited Here the final patch. Thanks for the good work and lots of help from Robert and Mike. As this patch contains heavy changes, we will commit it as soon as possible so work can go on. It would be nice, if you would not commit anything until this is done. There are some minor issues open: DirectoryReader is final and has only static factory methods. It is not possible to subclass it in any way. The problem is mainly Solr, as Solr accesses directory(), IndexCommits,... and therefore cannot work on abstract IndexReader anymore. This should be changed, by e.g. handling reopening in the IRFactory, also versions, commits,... Currently its not possible to implement any other IRFactory that returns something else. The PayloadProcessorProvider has a broken API, this should be fixed. The current patch mimics the old behaviour, but not 100%. ParallelReader is now atomic. We should add a sugar wrapper method to allow synchronized composite readers (with same segment sizes) to be aligned with MultiReaders or wrapped by Slow. The remaining issues will be fixed in later issues!
          Hide
          Uwe Schindler added a comment -

          Patch with nocommits removed.

          Show
          Uwe Schindler added a comment - Patch with nocommits removed.
          Hide
          Robert Muir added a comment -

          Thanks for all the work here Uwe! I think its a great step forward.
          The previous situation where half the methods were UOE is really unreleasable.

          Show
          Robert Muir added a comment - Thanks for all the work here Uwe! I think its a great step forward. The previous situation where half the methods were UOE is really unreleasable.
          Hide
          Uwe Schindler added a comment -

          Committed trunk revision: 1238085

          Show
          Uwe Schindler added a comment - Committed trunk revision: 1238085
          Hide
          Michael McCandless added a comment -

          Nice work guys!

          Show
          Michael McCandless added a comment - Nice work guys!
          Hide
          Uwe Schindler added a comment -

          Patch that adds FC insanity checking for slow wrappers back. Will commit now.

          Show
          Uwe Schindler added a comment - Patch that adds FC insanity checking for slow wrappers back. Will commit now.
          Hide
          Uwe Schindler added a comment -

          Committed trunk revision: 1238112

          Show
          Uwe Schindler added a comment - Committed trunk revision: 1238112
          Hide
          Robert Muir added a comment -

          I think somehow ensureOpen is broken here in some situations (possibly slowmultireaderwrapper):

          ant test -Dtestcase=TestReaderClosed -Dtestmethod=test -Dtests.seed=42907a63342da06b:-1490dd5f0d0f26d9:-7e3c360ea2d32539 -Dargs="-Dfile.encoding=UTF-8"
          
          Show
          Robert Muir added a comment - I think somehow ensureOpen is broken here in some situations (possibly slowmultireaderwrapper): ant test -Dtestcase=TestReaderClosed -Dtestmethod=test -Dtests.seed=42907a63342da06b:-1490dd5f0d0f26d9:-7e3c360ea2d32539 -Dargs="-Dfile.encoding=UTF-8"
          Hide
          Uwe Schindler added a comment -

          Hi Robert, this patch fixes the problem:

          SlowMultiReaderWrapper creates the fields and liveDocs on its ctor (as its expensive) and reuses. This is no problem, but the test did something very bad: It closed not the slow reader but the wrapped DirectoryReader. As fields() was not delegated to the DR or MultiFields with DR, Slow returned its own cached instance. Slow's ensureOpen was not throwing Ex, as it was still open.

          Theoretically, Slow* should incRef the underlying indexreader and decRef on close(). But that would require, that you close the SlowReader after wrapping.

          The current solution is not optimal but makes it easy to wrap without explicitely closing the slow wrapper.

          The fix was to call in.ensureOpen() in slow before the cached instance was returned.

          Show
          Uwe Schindler added a comment - Hi Robert, this patch fixes the problem: SlowMultiReaderWrapper creates the fields and liveDocs on its ctor (as its expensive) and reuses. This is no problem, but the test did something very bad: It closed not the slow reader but the wrapped DirectoryReader. As fields() was not delegated to the DR or MultiFields with DR, Slow returned its own cached instance. Slow's ensureOpen was not throwing Ex, as it was still open. Theoretically, Slow* should incRef the underlying indexreader and decRef on close(). But that would require, that you close the SlowReader after wrapping. The current solution is not optimal but makes it easy to wrap without explicitely closing the slow wrapper. The fix was to call in.ensureOpen() in slow before the cached instance was returned.
          Hide
          Uwe Schindler added a comment -

          Committed fix in revision: 1238788
          I removed the extra in.ensureOpen() for hasDeletions(), as this was only a null-check.

          Show
          Uwe Schindler added a comment - Committed fix in revision: 1238788 I removed the extra in.ensureOpen() for hasDeletions(), as this was only a null-check.
          Hide
          Michael McCandless added a comment -

          Not entirely sure why but it looks like the commits here slowed down our NRT reopen latency. If you look at the nightly bench graph: http://people.apache.org/~mikemccand/lucenebench/nrt.html and click + drag from Jan 2012 to today, annotation R shows we increased from ~46 msec NRT reopen latency to ~50 msec ... could just be hotspot being upset...

          Show
          Michael McCandless added a comment - Not entirely sure why but it looks like the commits here slowed down our NRT reopen latency. If you look at the nightly bench graph: http://people.apache.org/~mikemccand/lucenebench/nrt.html and click + drag from Jan 2012 to today, annotation R shows we increased from ~46 msec NRT reopen latency to ~50 msec ... could just be hotspot being upset...
          Hide
          Michael McCandless added a comment -

          Not entirely sure why but it looks like the commits here slowed down our NRT reopen latency.

          OK, sorry, I was wrong about this!

          I went back and re-ran the NRTPerfTest and isolated the slowdown to LUCENE-3728 (also committed on the same day)... it was because we lost the SegmentInfo.sizeInBytes caching... but then in LUCENE-4055 we got it back and we got the performance back ... so all's well that ends well

          Show
          Michael McCandless added a comment - Not entirely sure why but it looks like the commits here slowed down our NRT reopen latency. OK, sorry, I was wrong about this! I went back and re-ran the NRTPerfTest and isolated the slowdown to LUCENE-3728 (also committed on the same day)... it was because we lost the SegmentInfo.sizeInBytes caching... but then in LUCENE-4055 we got it back and we got the performance back ... so all's well that ends well
          Hide
          Uwe Schindler added a comment -

          Thanks Mike for taking care. For me this looked crazy, because refactoring like this should not change perf.

          This explains the change back after 4055, I thought un-compressed oops responsible for the speedup.

          Show
          Uwe Schindler added a comment - Thanks Mike for taking care. For me this looked crazy, because refactoring like this should not change perf. This explains the change back after 4055, I thought un-compressed oops responsible for the speedup.
          Hide
          Michael McCandless added a comment -

          This explains the change back after 4055, I thought un-compressed oops responsible for the speedup.

          That's what I thought too (and I was very confused that uncompressed oops would improve things)! But LUCENE-4055 landed on the same day I turned off uncompressed oops... so I think all mysteries are now explained.

          Show
          Michael McCandless added a comment - This explains the change back after 4055, I thought un-compressed oops responsible for the speedup. That's what I thought too (and I was very confused that uncompressed oops would improve things)! But LUCENE-4055 landed on the same day I turned off uncompressed oops... so I think all mysteries are now explained.

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Uwe Schindler
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development