Lucene - Core
  1. Lucene - Core
  2. LUCENE-4874

Don't override non abstract methods that have an impl through other abstract methods in FilterAtomicReader and related classes

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Terms.intersect is an optional method. The fact that it is overridden in FilterTerms forces any non-trivial class that extends FilterTerms to override intersect in order this method to have a correct behavior. If FilterTerms did not override this method and used the default impl, we would not have this problem.

      1. LUCENE-4874.patch
        32 kB
        Adrien Grand
      2. LUCENE-4874.patch
        31 kB
        Adrien Grand

        Activity

        Hide
        Robert Muir added a comment -

        I think we should be consistent about this completely and not override any methods with default implementations in these Filter* classes.

        So that means stuff like FilterTermsEnum.seekExact too. We should review all these classes in FilterAtomicReader.

        Show
        Robert Muir added a comment - I think we should be consistent about this completely and not override any methods with default implementations in these Filter* classes. So that means stuff like FilterTermsEnum.seekExact too. We should review all these classes in FilterAtomicReader.
        Hide
        Adrien Grand added a comment -

        This makes sense. I found another bug in SortingAtomicReader which doesn't override getCoreCacheKey, this could lead to very bad things if an atomic reader and its sorted view were both used with the same FieldCache instance.

        I've started looking at methods that override default impls and would like to have your opinion on some of them:

        • shouldn't IndexReader.hasDeletions return numDeletedDocs() > 0 by default instead of being abstract?
        • isn't the default impl of TermsEnum.termState dangerous? Shouldn't it throw an UnsupportedOperationException or being abstract instead?
        Show
        Adrien Grand added a comment - This makes sense. I found another bug in SortingAtomicReader which doesn't override getCoreCacheKey, this could lead to very bad things if an atomic reader and its sorted view were both used with the same FieldCache instance. I've started looking at methods that override default impls and would like to have your opinion on some of them: shouldn't IndexReader.hasDeletions return numDeletedDocs() > 0 by default instead of being abstract? isn't the default impl of TermsEnum.termState dangerous? Shouldn't it throw an UnsupportedOperationException or being abstract instead?
        Hide
        Robert Muir added a comment -

        shouldn't IndexReader.hasDeletions return numDeletedDocs() > 0 by default instead of being abstract?

        I fell into this trap myself before (naively providing a fake livedocs to a filterreader), there are quite a few traps and inconsistencies:

        • if you override the livedocs you currently need to override a bunch of other things too (e.g. hasDeletions, numDocs). we should at least add a javadoc to FilterAtomicReader make it less trappy.
        • hasDeletions is abstract, and its easy to forget to implement this.
        • SegmentReader says that for hasDeletions/numDocs/maxDoc "Don't call ensureOpen() here (it could affect performance)". But FilterAtomicReader only says this for numDocs and maxDoc and calls it on hasDeletions!
        • default implementation in SegmentReader uses liveDocs != null: but i dont know anything checking (e.g. CheckIndex should do this) that the codec's LiveDocsFormat must return null (vs a Bits matching all docs). We should check this if there is code actually making this (undocumented) assumption.

        For SegmentReader at least, it seems to be better if it had this implementation: and neither numDocs() and maxDocs() calls ensureOpen() either so I only see it as a safer way to go (we could avoid the sketchy liveDocs != null, but i bet other code elsewhere relies on this too).

        isn't the default impl of TermsEnum.termState dangerous? Shouldn't it throw an UnsupportedOperationException or being abstract instead?

        I don't know whats going on here: but I agree. it seems the current default implementation is "silently wont work".

        Show
        Robert Muir added a comment - shouldn't IndexReader.hasDeletions return numDeletedDocs() > 0 by default instead of being abstract? I fell into this trap myself before (naively providing a fake livedocs to a filterreader), there are quite a few traps and inconsistencies: if you override the livedocs you currently need to override a bunch of other things too (e.g. hasDeletions, numDocs). we should at least add a javadoc to FilterAtomicReader make it less trappy. hasDeletions is abstract, and its easy to forget to implement this. SegmentReader says that for hasDeletions/numDocs/maxDoc "Don't call ensureOpen() here (it could affect performance)". But FilterAtomicReader only says this for numDocs and maxDoc and calls it on hasDeletions! default implementation in SegmentReader uses liveDocs != null: but i dont know anything checking (e.g. CheckIndex should do this) that the codec's LiveDocsFormat must return null (vs a Bits matching all docs). We should check this if there is code actually making this (undocumented) assumption. For SegmentReader at least, it seems to be better if it had this implementation: and neither numDocs() and maxDocs() calls ensureOpen() either so I only see it as a safer way to go (we could avoid the sketchy liveDocs != null, but i bet other code elsewhere relies on this too). isn't the default impl of TermsEnum.termState dangerous? Shouldn't it throw an UnsupportedOperationException or being abstract instead? I don't know whats going on here: but I agree. it seems the current default implementation is "silently wont work".
        Hide
        Adrien Grand added a comment -

        Although DocIdSetIterator.advance is abstract, it describes a default implementation that many classes that extend DocsEnum/DocsAndPositionsEnum duplicate. Maybe we should just provide a default implementation for advance, this would save copy-pastes.

        Show
        Adrien Grand added a comment - Although DocIdSetIterator.advance is abstract, it describes a default implementation that many classes that extend DocsEnum/DocsAndPositionsEnum duplicate. Maybe we should just provide a default implementation for advance, this would save copy-pastes.
        Hide
        Robert Muir added a comment -

        In this case I think its a little scary to have a default very-slow implementation. Similar to concerns raised on LUCENE-1592

        Show
        Robert Muir added a comment - In this case I think its a little scary to have a default very-slow implementation. Similar to concerns raised on LUCENE-1592
        Hide
        Adrien Grand added a comment -

        Patch

        • Implements IndexReader.hasDeletions() as numDeletedDocs() > 0 by default (instead of abstract) with no ensureOpen check (SegmentReader uses this default impl instead of liveDocs != null).
        • Adds warning in FilterAtomicReader about overriding
        • Methods that provide a default implementation are not implemented as in.XXX in Filter*** anymore (in particular Terms.intersect, IndexReader.hasDeletions and IndexReader.get*CacheKey). The test in TestFilterAtomicReader has been modified to ensure this.
        • TermsEnum.termState returns a TermState that throws an UOE instead of failing silently
        • DocIdSetIterator provides a protected final slowAdvance(int target) method, which advances linearly and can be used in implementations that don't require advance to be fast (and has an assert to make sure that target is greater than the current position).
        Show
        Adrien Grand added a comment - Patch Implements IndexReader.hasDeletions() as numDeletedDocs() > 0 by default (instead of abstract) with no ensureOpen check (SegmentReader uses this default impl instead of liveDocs != null). Adds warning in FilterAtomicReader about overriding Methods that provide a default implementation are not implemented as in.XXX in Filter*** anymore (in particular Terms.intersect, IndexReader.hasDeletions and IndexReader.get*CacheKey). The test in TestFilterAtomicReader has been modified to ensure this. TermsEnum.termState returns a TermState that throws an UOE instead of failing silently DocIdSetIterator provides a protected final slowAdvance(int target) method, which advances linearly and can be used in implementations that don't require advance to be fast (and has an assert to make sure that target is greater than the current position).
        Hide
        Robert Muir added a comment -

        i think attributes() should be overridden by the Filter** by default. This is basically a getter.

        Show
        Robert Muir added a comment - i think attributes() should be overridden by the Filter** by default. This is basically a getter.
        Hide
        Adrien Grand added a comment -

        Updated patch. I added back attributes modified the test (+ comments) and added a changelog entry.

        Show
        Adrien Grand added a comment - Updated patch. I added back attributes modified the test (+ comments) and added a changelog entry.
        Hide
        Robert Muir added a comment -

        +1, thanks for working on these API issues!

        Show
        Robert Muir added a comment - +1, thanks for working on these API issues!
        Hide
        Commit Tag Bot added a comment -

        [trunk commit] jpountz
        http://svn.apache.org/viewvc?view=revision&revision=1464710

        LUCENE-4874: Don't override non abstract methods that have an impl through
        other abstract methods in FilterAtomicReader and related classes.

        Show
        Commit Tag Bot added a comment - [trunk commit] jpountz http://svn.apache.org/viewvc?view=revision&revision=1464710 LUCENE-4874 : Don't override non abstract methods that have an impl through other abstract methods in FilterAtomicReader and related classes.
        Hide
        Commit Tag Bot added a comment -

        [branch_4x commit] jpountz
        http://svn.apache.org/viewvc?view=revision&revision=1464776

        LUCENE-4874: Don't override non abstract methods that have an impl through
        other abstract methods in FilterAtomicReader and related classes (merged from
        r1464710 and r1464768).

        Show
        Commit Tag Bot added a comment - [branch_4x commit] jpountz http://svn.apache.org/viewvc?view=revision&revision=1464776 LUCENE-4874 : Don't override non abstract methods that have an impl through other abstract methods in FilterAtomicReader and related classes (merged from r1464710 and r1464768).
        Hide
        Adrien Grand added a comment -

        Thanks for helping Robert!

        Show
        Adrien Grand added a comment - Thanks for helping Robert!

          People

          • Assignee:
            Adrien Grand
            Reporter:
            Adrien Grand
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development