Lucene - Core
LUCENE-2468

reopen on NRT reader should share readers w/ unchanged segments

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.3, 3.0.2, 3.1, 4.0-ALPHA
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      A reopen on an NRT reader doesn't seem to share readers for those segments that are unchanged.
      http://search.lucidimagination.com/search/document/9f0335d480d2e637/nrt_and_caching_based_on_indexreader

      Attachments

      1. CacheTest.java
        6 kB
        Shay Banon
      2. DeletionAwareConstantScoreQuery.java
        6 kB
        Shay Banon
      3. LUCENE-2468.patch
        28 kB
        Michael McCandless
      4. LUCENE-2468.patch
        16 kB
        Michael McCandless
      5. LUCENE-2468.patch
        1 kB
        Yonik Seeley

        Activity

        Yonik Seeley added a comment -

        Attached simple test case demonstrating the issue.

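        A minimal sketch of the kind of check involved (an assumption about
        what such a test looks like, not the attached patch itself; 3.x NRT
        API, with an open IndexWriter writer and a Document doc):

            // reopen an NRT reader after touching only a new segment
            IndexReader r1 = writer.getReader();   // IndexWriter.getReader(), 3.x NRT API
            writer.addDocument(doc);               // creates a new segment; old ones unchanged
            IndexReader r2 = r1.reopen();
            IndexReader sub1 = r1.getSequentialSubReaders()[0];
            IndexReader sub2 = r2.getSequentialSubReaders()[0];
            // sub2 is a shallow clone of sub1, not the same instance, so a
            // cache keyed on reader identity (as CachingWrapperFilter's is) misses:
            System.out.println(sub1 == sub2);                                       // false
            System.out.println(sub1.getFieldCacheKey() == sub2.getFieldCacheKey()); // true
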
        Earwin Burrfoot added a comment -

        Or, you could do it so the various caches are preserved across clone().

        Michael McCandless added a comment -

        Indeed, right now the newly returned NRT reader will always provide a
        shallow clone for any segments that have not changed vs the previous
        NRT reader.

        FieldCache is unaffected by this (it always keys on the "core"
        readers, getFieldCacheKey, which is the same for shallow clones) –
        such shallow clones will share the same field cache entry.

        But other caches (CachingWrapperFilter, CachingSpanFilter) don't use
        this key, and so they'll now get multiple entries for the shallow
        clones. So we need to fix that.

        However, when new deletions have arrived, a new shallow clone must be
        created. In this case the FieldCache entries are shared.

        So, should these other caches share an entry for that clone, or not?
        It's tempting to do so – all that's changed is new docs got deleted,
        and any time these filters are applied for searching, they are AND'd
        with "not deleted".

        But, this is technically shaky ground, since the new deletions will in
        fact mean some docs that previously passed the filter (bit was set)
        will now have the bit un-set.

        I would lean towards letting the caches share the filter in these
        cases, and advertising in these classes' javadocs that this will
        happen. Thoughts?

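        To make the distinction concrete, here is a minimal sketch (not
        Lucene's actual CachingWrapperFilter) of a filter cache keyed on the
        "core" key instead of on reader identity, so shallow clones share a
        single entry:

            import java.io.IOException;
            import java.util.Collections;
            import java.util.Map;
            import java.util.WeakHashMap;

            import org.apache.lucene.index.IndexReader;
            import org.apache.lucene.search.DocIdSet;
            import org.apache.lucene.search.Filter;

            public class CoreKeyCachingFilter extends Filter {
              private final Filter in;
              private final Map<Object,DocIdSet> cache =
                  Collections.synchronizedMap(new WeakHashMap<Object,DocIdSet>());

              public CoreKeyCachingFilter(Filter in) {
                this.in = in;
              }

              @Override
              public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
                Object key = reader.getFieldCacheKey(); // identical across shallow clones
                DocIdSet cached = cache.get(key);
                if (cached == null) {
                  cached = in.getDocIdSet(reader); // bakes in deletions visible right now
                  cache.put(key, cached);
                }
                return cached;
              }
            }
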
        Earwin Burrfoot added a comment -

        Reusing fieldCacheKey is probably a good temporary solution? (possibly renaming it in the process to just cacheKey)
        NRT.reopen() is not the only case where we get shallow clones; IR.clone() explicitly does that, so with cacheKey we have all the bases covered.

        Shay Banon added a comment -

        Sounds like a good solution to me. I just noticed in trunk that there is also an explicit purge from FieldCache when possible. I think it would be great to enable this for other caches that are keyed the same way (like CachingWrapperFilter, but externally written ones as well).

        I was thinking of an expert API that allows adding a "CacheEvictionListener" or something similar, which would be called when this happens. What do you think?

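        A purely hypothetical shape for such an API (nothing like this exists
        yet; the name and signature below are guesses fleshing out Shay's
        suggestion):

            // hypothetical expert API: notified when a reader's "core" is
            // closed, so external caches keyed on its cache key can evict
            // eagerly instead of waiting for GC
            public interface CacheEvictionListener {
              void onClose(Object cacheKey);
            }
            // hypothetical registration point:
            // reader.addCacheEvictionListener(listener);
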
        Yonik Seeley added a comment -

        I would lean towards letting the caches share the filter in these cases, and advertising in these classes' javadocs that this will happen. Thoughts?

        I think that's probably OK - users won't notice when using filters to search, but they may get different behavior if they use it for other purposes.

        Shay, as far as CachingWrapperFilter and CacheEvictionListener go, it seems more powerful to just let apps create a new query type themselves? That's the nice part of Lucene's openness to user query types - start with the code for CachingWrapperFilter and hook up your own caching logic.

        Shay Banon added a comment -

        Shay, as far as CachingWrapperFilter and CacheEvictionListener go, it seems more powerful to just let apps create a new query type themselves? That's the nice part of Lucene's openness to user query types - start with the code for CachingWrapperFilter and hook up your own caching logic.

        Yea, but it would be great to know when an IndexReader has actually decided to close, so caches can be eagerly cleaned. Even if one writes a custom implementation, it would benefit from this.

        Shay Banon added a comment -

        I think that the suggested solution, to use the field cache key, is not good enough, sadly. I am attaching a simple test showing that it does not work when a query is passed to a searcher without a filter, but that query is, for example, a ConstantScoreQuery. I have simply taken CachingWrapperFilter and changed it to key on getFieldCacheKey instead of the IndexReader.

        This is problematic, since a filter can be used somewhere in the query tree, and wrapped for caching. I am running against 3.0.1.

        Michael McCandless added a comment -

        Renaming to cacheKey makes me a bit nervous... since this key stays the same even when deletions change. How about coreCacheKey? The javadocs should make it clear that new deletions can show up while the coreCacheKey remains identical (==).

        OK, I'll take this approach.

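        The semantics being proposed, sketched with the renamed method against
        the 3.x API (assumes a single-segment index and an open IndexWriter
        writer):

            IndexReader r1 = writer.getReader();
            writer.deleteDocuments(new Term("id", "7")); // only deletions change
            IndexReader r2 = r1.reopen();
            IndexReader seg1 = r1.getSequentialSubReaders()[0];
            IndexReader seg2 = r2.getSequentialSubReaders()[0];
            // the reopened segment reader is a new shallow clone carrying the
            // new deletions, yet its core key is identical (==):
            System.out.println(seg2 != seg1);                                      // true
            System.out.println(seg2.getCoreCacheKey() == seg1.getCoreCacheKey());  // true
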
        Michael McCandless added a comment -

        This is problematic, since a filter can be used somewhere in the query tree, and wrapped for caching

        Right, so the issue here is that ConstantScoreQuery's scorer does not
        check deleted docs when it runs – it relies entirely on what the
        filter said was set.

        But I expect this is the exception, not the rule.

        Ie, most uses of a filter will see to it that deleted docs are already removed.

        It's as if, somehow, when a caller wants the scorer or DocIdSet, it should express whether it's OK that deleted docs are not removed... I think this'd be another boolean arg (mustContainDeletions or some such) to scorer and getDocIdSet.

        Or, for a less invasive change, we could have you tell ConstantScoreQuery that it must fully enforce deletions (if your app runs a query tree that has a path involving a ConstantScoreQuery not AND'd with some other query that'd enforce deletions).

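        A sketch of the failure mode being described (assumed, not taken from
        the attached test; cachingFilter is a filter cached on the core key as
        in the earlier sketch):

            // ConstantScoreQuery's scorer iterates the filter's DocIdSet
            // directly, without checking deleted docs itself:
            Query q = new ConstantScoreQuery(cachingFilter);
            TopDocs hits = searcher.search(q, 10);
            // if the cached DocIdSet predates recent deletions, hits can
            // include documents that are now deleted
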
        Shay Banon added a comment -

        Agreed, it seems like ConstantScoreQuery is the only problematic one...

        Michael McCandless added a comment -

        Attached patch – renames IR.getFieldCacheKey -> IR.getCoreCacheKey, fixes CachingWrapperFilter and CachingSpanFilter to, by default, disregard deletions when checking the cache. But I added expert ctors to each to force deletions to be "respected" at a perf hit.

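        Usage per this patch (the boolean ctor shape is taken from the comment
        above; the final patch below replaces it with a DeletesMode enum):

            // default: fastest, but may ignore new deletions on reopened readers
            Filter fast = new CachingWrapperFilter(filter);
            // expert: force deletions to be respected, at a perf hit
            Filter safe = new CachingWrapperFilter(filter, true /* enforceDeletions */);
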
        Shay Banon added a comment -

        Thanks for the work, Michael! Is this issue going to include ConstantScoreQuery, or should I open a different issue for it?

        Michael McCandless added a comment -

        Is this issue going to include ConstantScoreQuery, or should I open a different issue for it?

        Sorry – what change is needed to ConstantScoreQuery?

        Shay Banon added a comment -

        Check two comments above; we discussed it. Basically, ConstantScoreQuery does not work with your change when it uses a cached filter.

        Michael McCandless added a comment -

        Basically, ConstantScoreQuery does not work with your change when it uses a cached filter.

        I'm still confused. My patch includes your test case (which uses ConstantScoreQuery). I tweaked the test case a bit, eg to not rely on TermsFilter (which is in contrib).

        The test failed when I first made the change (as expected).

        Then I modified CachingWrapperFilter to take the optional boolean "enforceDeletions".

        Then I changed the test to pass "true" for enforceDeletions, and the test now passes.

        I don't think any change is needed to ConstantScoreQuery? (Ie, I took the "less invasive" option in my comment above).

        Shay Banon added a comment -

        Ahh, now I see it, sorry I missed that. But, basically, enforcing deletions means that we are back to the original problem... I think it would be quite confusing for users, to be honest. Among the filters, the problematic ones are those that can be converted to queries. From what I can see, FilteredQuery is OK, so maybe ConstantScoreQuery can be changed (if possible) to work the same way...

        Shay Banon added a comment -

        Here is a go at making ConstantScoreQuery deletion-aware. I named it differently, but it could replace ConstantScoreQuery, with a flag making it deletion-aware. What do you think?

        Shay Banon added a comment -

        Another quick question, Mike: what do you think about the ability to know when a "cache key" is actually closed, so it can be removed from a cache? Similar in concept to the eviction readers do from the FieldCache in trunk, but open, so that other caches based on the reader's cache key (which is the simplest way to do caching in Lucene) can use it.

        Michael McCandless added a comment -

        Here is a go at making ConstantScoreQuery deletion-aware. I named it differently, but it could replace ConstantScoreQuery, with a flag making it deletion-aware. What do you think?

        I don't think we need to fix ConstantScoreQuery to be deletion-aware?

        With the perf fix we are doing here, the problem (not correctly
        "seeing" deletes on a reopened reader) is isolated to
        CachingWrapperFilter/CachingSpanFilter, right?

        Why fix ConstantScoreQuery, when so many other Filter impls will
        properly apply deletions?

        Michael McCandless added a comment -

        Another quick question, Mike: what do you think about the ability to know when a "cache key" is actually closed, so it can be removed from a cache? Similar in concept to the eviction readers do from the FieldCache in trunk, but open, so that other caches based on the reader's cache key (which is the simplest way to do caching in Lucene) can use it.

        I think this would be a good change – it would make eviction immediate instead of just when GC gets around to pruning the WeakHashMap. Can you open a separate issue and maybe work out a patch?

        Or, the alternative would be to have IR hold such caches, as a service to "things" that need caching.

        Shay Banon added a comment -

        With the perf fix we are doing here, the problem (not correctly
        "seeing" deletes on a reopened reader) is isolated to
        CachingWrapperFilter/CachingSpanFilter, right?

        Yes, but this means that ConstantScoreQuery should basically not be cached when using NRT (even using the IndexReader as the key...), because of the excessive readers created. With the deletion-aware one, you can cache it based on the cache key.

        I think this would be a good change - it would make eviction immediate instead of just when GC gets around to pruning the WeakHashMap. Can you open a separate issue and maybe work out a patch?

        Sure, I will do it.

        Michael McCandless added a comment -

        Yes, but this means that ConstantScoreQuery should basically not be cached when using NRT (even using the IndexReader as the key...), because of the excessive readers created. With the deletion-aware one, you can cache it based on the cache key.

        OK, now (finally) I understand the problem!

        You want to be able to cache the original filter and reuse it even when deletions have changed, but then dynamically apply the deletions so they are properly enforced (rather than discarding the cache entry).

        So... why not do this in CachingWrapper/SpanFilter, but, instead of discarding the cache entry when deletions must be enforced, we dynamically apply the deletions? (I think we could use FilteredDocIdSet).

        Really... we need a more generic solution here (but, it's a much bigger change), where somehow in creating the scorer per-segment we dynamically determine who/where the deletions are enforced. A Filter need not care about deletions if it's AND'd w/ a query that already enforces the deletions.

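        A minimal sketch of that idea (an assumption, not the committed code),
        inside a caching filter's getDocIdSet(reader), after retrieving the
        cached entry (cached) for the reader's core key:

            final IndexReader r = reader;
            DocIdSet liveOnly = new FilteredDocIdSet(cached) {
              @Override
              protected boolean match(int docid) {
                return !r.isDeleted(docid); // drop docs deleted since caching
              }
            };
            return liveOnly;
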
        Shay Banon added a comment -

        So... why not do this in CachingWrapper/SpanFilter, but, instead of discarding the cache entry when deletions must be enforced, we dynamically apply the deletions? (I think we could use FilteredDocIdSet).

        Yea, that would work well. You will still need to somehow know when to enable or disable this, based on how the filter is used (it should basically only be enabled for ones that are passed to a ConstantScoreQuery...).

        Really... we need a more generic solution here (but, it's a much bigger change), where somehow in creating the scorer per-segment we dynamically determine who/where the deletions are enforced. A Filter need not care about deletions if it's AND'd w/ a query that already enforces the deletions.

        Agreed. As I see it, caching based on IndexReader is key in Lucene, and with NRT, it should feel the same way as it is without it. NRT should not change the way you build your system.

        Michael McCandless added a comment -

        So... why not do this in CachingWrapper/SpanFilter, but, instead of discarding the cache entry when deletions must be enforced, we dynamically apply the deletions? (I think we could use FilteredDocIdSet).

        Yea, that would work well. You will still need to somehow know when to enable or disable this, based on how the filter is used (it should basically only be enabled for ones that are passed to a ConstantScoreQuery...)

        OK I'll take that approach on next iter.

        But: I think this may need to be enabled in other cases where the
        filter is used (ie not only CSQ). Sure, CSQ is the one example we
        have today, where if you pass a Filter that ignores "recent" deletions
        you'll be in trouble... but who knows what other uses of a Filter
        might trip up on this intentional cache-incoherence we're introducing.

        Agreed. As I see it, caching based on IndexReader is key in Lucene, and with NRT, it should feel the same way as it is without it. NRT should not change the way you build your system.

        Well... NRT and up-to-date deletions will always present a challenge.

        Really, this tradeoff we are making here, where a cached filter can be
        set to either 1) ignore new deletions, 2) discard its cache entry and
        fully regenerate itself, or 3) dynamically intersect the deletions, is
        similar to the discussions we've had about just how an NRT segment
        reader should enforce recent deletions.

        Ie, ignoring option 1 (which of course gives the best perf), option 2,
        while making a reopen more costly, gets you the best search
        performance (since only one bit set is checked during searches).

        Option 3 makes reopens much faster, but then search performance takes a
        hit (since you're checking 2 bit sets).

        Option 2 is analogous to how Lucene now handles the per-segment
        deleted docs bit vector (it's fully recreated on each reopen), while
        option 3 is analogous to how Zoie handles deletions (new deletions are
        dynamically applied to all search hits).

        Shay Banon added a comment -

        Hi Mike,

        First, I opened and attached a patch regarding the cache eviction listeners for IndexReader: https://issues.apache.org/jira/browse/LUCENE-2474; tell me what you think.

        Regarding your last comment, I agree. Though trying to streamline its usage, in terms of having all built-in components and possible extensions work well with it, makes sense. That's what you suggest with the filtered doc id set, which is cool.

        Michael McCandless added a comment -

        OK, I attached another go at this.

        I added a DeletesMode enum to CachingWrapperFilter: IGNORE (default)
        means just re-use the cache entry when the reader is reopened w/ new
        deletions; RECACHE (fully recreate the cache entry); DYNAMIC (re-use
        the cache entry, but use FilteredDocIdSet to dynamically re-filter it
        against the current deletions).

        I did the same for CachingSpanFilter, but I don't allow DYNAMIC for
        that one – I punted on it because it's kinda hairy (I'd have to copy
        the List<PositionInfo> and remove entries corresponding to deleted
        docs). IGNORE and RECACHE are allowed.

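        Choosing among the modes (ctor shape as described in this comment):

            Filter byDefault = new CachingWrapperFilter(filter); // DeletesMode.IGNORE
            Filter recache   = new CachingWrapperFilter(filter,
                CachingWrapperFilter.DeletesMode.RECACHE);
            Filter dynamic   = new CachingWrapperFilter(filter,
                CachingWrapperFilter.DeletesMode.DYNAMIC); // FilteredDocIdSet under the hood
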
        Michael McCandless added a comment -

        backport


          People

          • Assignee:
            Michael McCandless
          • Reporter:
            Yonik Seeley
          • Votes:
            1
          • Watchers:
            2
