Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      There are two major problems:
      1. The listener api does not really apply all indexreaders. for example segmentreaders dont fire it on close, only segmentcorereaders. this is wrong, a segmentcorereader is not an indexreader. Furthermore, if you register it on a top-level reader you get events for anything under the reader tree (sometimes, unless they are segmentreaders as mentioned above, where it doesnt work correctly at all).
      2. Furthermore your listener is 'passed along' in a viral fashion from clone() and reopen(). This means for example, if you are trying to listen to readers in NRT search you are just accumulating reader listeners, all potentially keeping references to old indexreaders (because, in order to deal with #1 your listener must 'keep' a reference to the IR it was registered on, so it can check if thats really the one).

      We should discuss how to fix #1.

      I will create a patch for #2 shortly and commit it, its just plain wrong.

      1. LUCENE-3644.patch
        29 kB
        Robert Muir
      2. LUCENE-3644.patch
        29 kB
        Robert Muir
      3. LUCENE-3644.patch
        29 kB
        Robert Muir

        Activity

        Hide
        Uwe Schindler added a comment -

        Hi,

        +1, I was f*cking with this the whole day when rewriting oirectoryReader/MultiReader.

        The bad thing is also that the Set(CHM) is not initialized in IndexReader base class (and is not final), the subclasses must do it. This should be done by base class and be protected final, then normal IndexReaders dont need to f*ck with it. Just make the listeners collection in IndexReader base class final and initialize with MapBackedSet(CHM).
        DirectoryReader and other MultiReaders may simply delegate the registration to the subreaders, but dont pass sets around. If they need to copy all, just do "for (Listener l: this.listenerss) subreader[i].add(l);", sharing the same map between different readers is wrong.

        Show
        Uwe Schindler added a comment - Hi, +1, I was f*cking with this the whole day when rewriting oirectoryReader/MultiReader. The bad thing is also that the Set(CHM) is not initialized in IndexReader base class (and is not final), the subclasses must do it. This should be done by base class and be protected final, then normal IndexReaders dont need to f*ck with it. Just make the listeners collection in IndexReader base class final and initialize with MapBackedSet(CHM). DirectoryReader and other MultiReaders may simply delegate the registration to the subreaders, but dont pass sets around. If they need to copy all, just do "for (Listener l: this.listenerss) subreader [i] .add(l);", sharing the same map between different readers is wrong.
        Hide
        Michael McCandless added a comment -

        The original intention of this API (LUCENE-2474) was in fact to detect
        only when a SegmentCoreReader was closed, so that "external" caches
        the app kept could be purged, and not really to detect closing of all
        other readers (eg composite readers).

        But I agree, we should somehow generalize this to do both
        functions...

        Show
        Michael McCandless added a comment - The original intention of this API ( LUCENE-2474 ) was in fact to detect only when a SegmentCoreReader was closed, so that "external" caches the app kept could be purged, and not really to detect closing of all other readers (eg composite readers). But I agree, we should somehow generalize this to do both functions...
        Hide
        Robert Muir added a comment -

        I think thats confusing, I think if you set the listener on a reader, it should work.

        If we want a core listener, it should be a separate thing! and we can have a clean API for that,
        specific to SR (Maybe even only on SR itself!) that doesn't confuse the other case

        Show
        Robert Muir added a comment - I think thats confusing, I think if you set the listener on a reader, it should work. If we want a core listener, it should be a separate thing! and we can have a clean API for that, specific to SR (Maybe even only on SR itself!) that doesn't confuse the other case
        Hide
        Michael McCandless added a comment -

        That's a good idea: separate APIs for "core closed event" and "composite reader closed event".

        Show
        Michael McCandless added a comment - That's a good idea: separate APIs for "core closed event" and "composite reader closed event".
        Hide
        Robert Muir added a comment -

        That's a good idea: separate APIs for "core closed event" and "composite reader closed event".

        Thats not really what i'm saying though.

        I think readerFinishedListener, since its on indexreader, should fire when an IndexReader closes.
        I dont care if its composite or not. Its an indexreader, thats all we care about.

        Then core closed event naturally makes sense on SegmentReader, because its the only one that
        cares/knows about cores.

        Show
        Robert Muir added a comment - That's a good idea: separate APIs for "core closed event" and "composite reader closed event". Thats not really what i'm saying though. I think readerFinishedListener, since its on indexreader, should fire when an IndexReader closes. I dont care if its composite or not. Its an indexreader, thats all we care about. Then core closed event naturally makes sense on SegmentReader, because its the only one that cares/knows about cores.
        Hide
        Uwe Schindler added a comment -

        I hate the whole thing, sorry. Please remove it,... completely.

        Show
        Uwe Schindler added a comment - I hate the whole thing, sorry. Please remove it,... completely.
        Hide
        Uwe Schindler added a comment -

        And "ReaderFinished" is wrong word, it is "ReaderClosed". Finished sounds more like "Ready" than like "Close", its misunderstanding. I heard of that shit the first time last weekend and was teriified. Who proposed it, I will kill him/her?

        Show
        Uwe Schindler added a comment - And "ReaderFinished" is wrong word, it is "ReaderClosed". Finished sounds more like "Ready" than like "Close", its misunderstanding. I heard of that shit the first time last weekend and was teriified. Who proposed it, I will kill him/her?
        Hide
        Uwe Schindler added a comment -

        Sorry for the last comment, I am still in rage, rage against this... aaaaah ! die, die, die!

        Show
        Uwe Schindler added a comment - Sorry for the last comment, I am still in rage, rage against this... aaaaah ! die, die, die!
        Hide
        Robert Muir added a comment -

        attached is a patch, cleaning this up. All tests pass.

        The two use cases are now totally separated instead of being confusingly and dangerously interwoven. I also renamed things per uwe's advice:

        • IndexReader's ReaderClosedListener does exactly what it says. It fires when readers close. This is used to e.g. evict top-level fieldcaches.
        • SegmentReader's CoreClosedListener fires when the shared core is closed. This is used to e.g. evict per-segment fieldcaches.
        Show
        Robert Muir added a comment - attached is a patch, cleaning this up. All tests pass. The two use cases are now totally separated instead of being confusingly and dangerously interwoven. I also renamed things per uwe's advice: IndexReader's ReaderClosedListener does exactly what it says. It fires when readers close. This is used to e.g. evict top-level fieldcaches. SegmentReader's CoreClosedListener fires when the shared core is closed. This is used to e.g. evict per-segment fieldcaches.
        Hide
        Robert Muir added a comment -

        updated patch, with a stupid optimization for when you have field cache on top-level readers and also insanity.

        Show
        Robert Muir added a comment - updated patch, with a stupid optimization for when you have field cache on top-level readers and also insanity.
        Hide
        Michael McCandless added a comment -

        Patch looks great! I love the separation into "reader closed" vs "core closed", and removing all that hairy code that was copying around the previously registered listeners... MUCH simpler.

        But: I don't think we should "forward to subs" when someone registers for reader closed? I think if you register you should only be notified when that exact reader is closed? If somehow you care about the subs (I think this is the exception not the rule) you can always pull them and register with them too.

        Show
        Michael McCandless added a comment - Patch looks great! I love the separation into "reader closed" vs "core closed", and removing all that hairy code that was copying around the previously registered listeners... MUCH simpler. But: I don't think we should "forward to subs" when someone registers for reader closed? I think if you register you should only be notified when that exact reader is closed? If somehow you care about the subs (I think this is the exception not the rule) you can always pull them and register with them too.
        Hide
        Uwe Schindler added a comment -

        Hi, I am happy with the code now, but I still hate the (Event-)Listener-pattern. In a library like Lucene its very "foreign", it fits much better to GUIs with event loops

        And thanks for renaming the Listener!

        Show
        Uwe Schindler added a comment - Hi, I am happy with the code now, but I still hate the (Event-)Listener-pattern. In a library like Lucene its very "foreign", it fits much better to GUIs with event loops And thanks for renaming the Listener!
        Hide
        Robert Muir added a comment -

        But: I don't think we should "forward to subs" when someone registers for reader closed? I think if you register you should only be notified when that exact reader is closed?

        I agree, the only two real uses of this api we have today (as of the previous patch), both jump thru hoops to avoid this. Here's a patch implementing it that way (much cleaner imo)

        Show
        Robert Muir added a comment - But: I don't think we should "forward to subs" when someone registers for reader closed? I think if you register you should only be notified when that exact reader is closed? I agree, the only two real uses of this api we have today (as of the previous patch), both jump thru hoops to avoid this. Here's a patch implementing it that way (much cleaner imo)
        Hide
        Michael McCandless added a comment -

        +1 looks great!

        Show
        Michael McCandless added a comment - +1 looks great!
        Hide
        Shay Banon added a comment - - edited

        I just saw this thread, and would like to respond to @Uwe as I think he is completely wrong here. Its perfectly fine to have listeners in this case. If you are implementing a cache based on core readers (think custom filter cache for example), its basically the only sane way to do it (and not have to handle weak references which is bad on GC and error prone with complex keys). And Lucene should allow for that.

        I don't think the separation to segment core listener and index reader listener is the best solution here, since it exposes "too much info" to the user (having to deal with different reader cases, check FieldCacheImpl for example). I don't really mind it, since in my case I only work on low level Segment case (which is the sane way to work when you do NRT) and can handle it, but not sure its best in a general usability wise case. What the user want is: here is a reader, which I cache base on it, and here is a listener, call me back when that listener is no longer valid.

        Show
        Shay Banon added a comment - - edited I just saw this thread, and would like to respond to @Uwe as I think he is completely wrong here. Its perfectly fine to have listeners in this case. If you are implementing a cache based on core readers (think custom filter cache for example), its basically the only sane way to do it (and not have to handle weak references which is bad on GC and error prone with complex keys). And Lucene should allow for that. I don't think the separation to segment core listener and index reader listener is the best solution here, since it exposes "too much info" to the user (having to deal with different reader cases, check FieldCacheImpl for example). I don't really mind it, since in my case I only work on low level Segment case (which is the sane way to work when you do NRT) and can handle it, but not sure its best in a general usability wise case. What the user want is: here is a reader, which I cache base on it, and here is a listener, call me back when that listener is no longer valid.

          People

          • Assignee:
            Robert Muir
            Reporter:
            Robert Muir
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development