Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Scorer currently has to implement a whole bunch of methods that are never called. The only method that Scorer uses in addition to the methods on DocIdSetIterator is freq(), and as currently implemented this means different things on different Scorers:

      • TermScorer returns its underlying termfreq
      • MinShouldMatchScorer returns how many of its subscorers are matching
      • {Exact|Sloppy}

        PhraseScorer returns how many phrases it has found on a document

      In addition, freq() is never actually called on TermScorer, and it's only used in explain() on the phrase scorers.

      We should make Scorer extend DocIdSetIterator instead. In place of freq(), Scorer would have a coord() method that by default returns 1, and for boolean scorers returns how many subscorers are matching.

      1. LUCENE-6272.patch
        54 kB
        Alan Woodward
      2. LUCENE-6272.patch
        68 kB
        Alan Woodward
      3. LUCENE-6272.patch
        116 kB
        Alan Woodward

        Activity

        Hide
        romseygeek Alan Woodward added a comment -

        Patch. I also added a PhraseScorer abstract class with a phraseFreq() method that PhraseWeight.explain() calls.

        There are a couple of tests that walk the Scorer tree and check that TermScorers return the correct freq() which I think we can just nuke? Seeing as TermScorer.freq() is literally called nowhere else in the codebase.

        Alternatively, we could keep the existing freq() implementations for backwards-compatibility purposes, deprecating them in 5x and removing in trunk.

        Show
        romseygeek Alan Woodward added a comment - Patch. I also added a PhraseScorer abstract class with a phraseFreq() method that PhraseWeight.explain() calls. There are a couple of tests that walk the Scorer tree and check that TermScorers return the correct freq() which I think we can just nuke? Seeing as TermScorer.freq() is literally called nowhere else in the codebase. Alternatively, we could keep the existing freq() implementations for backwards-compatibility purposes, deprecating them in 5x and removing in trunk.
        Hide
        rcmuir Robert Muir added a comment -

        I think its complicated to change the semantics of freq() at the same time as refactoring the class hierarchy.

        freq() and getChildren() are still being discussed here: LUCENE-6229

        Show
        rcmuir Robert Muir added a comment - I think its complicated to change the semantics of freq() at the same time as refactoring the class hierarchy. freq() and getChildren() are still being discussed here: LUCENE-6229
        Hide
        romseygeek Alan Woodward added a comment -

        OK, so just add freq() as an abstract method directly on Scorer then? It's not actually called as PostingsEnum.freq() anywhere.

        Show
        romseygeek Alan Woodward added a comment - OK, so just add freq() as an abstract method directly on Scorer then? It's not actually called as PostingsEnum.freq() anywhere.
        Hide
        rcmuir Robert Muir added a comment -

        Yes, thats how it used to be. DocsEnum already had freq() and attributes(), so Scorer just built on that.

        Show
        rcmuir Robert Muir added a comment - Yes, thats how it used to be. DocsEnum already had freq() and attributes(), so Scorer just built on that.
        Hide
        romseygeek Alan Woodward added a comment -

        Updated patch, just making Scorer extend DocIdSetIterator and adding freq() as an abstract method directly on Scorer.

        Show
        romseygeek Alan Woodward added a comment - Updated patch, just making Scorer extend DocIdSetIterator and adding freq() as an abstract method directly on Scorer.
        Hide
        rcmuir Robert Muir added a comment -

        I like it, removes a lot of hassle.

        btw what IDE do you use? Is it intellij? I noticed all the imports in the files modified switch the order of java.XXX vs org.apache.lucene.XXX imports. I think these IDE templates need to be consistent here.

        Show
        rcmuir Robert Muir added a comment - I like it, removes a lot of hassle. btw what IDE do you use? Is it intellij? I noticed all the imports in the files modified switch the order of java.XXX vs org.apache.lucene.XXX imports. I think these IDE templates need to be consistent here.
        Hide
        romseygeek Alan Woodward added a comment -

        Yes, IntelliJ. I've got the template set up as:

        • other imports
        • <blank line>
        • import javax.*
        • import java.*
        • <blank line>
        • static imports
        Show
        romseygeek Alan Woodward added a comment - Yes, IntelliJ. I've got the template set up as: other imports <blank line> import javax.* import java.* <blank line> static imports
        Hide
        romseygeek Alan Woodward added a comment -

        Which is obviously different to what everyone else has... Hmpf, I thought I had set it up to be consistent with the project. Will change.

        Show
        romseygeek Alan Woodward added a comment - Which is obviously different to what everyone else has... Hmpf, I thought I had set it up to be consistent with the project. Will change.
        Hide
        elyograg Shawn Heisey added a comment -

        I use eclipse. Hopefully one day I can switch, but haven't had enough time to learn IntelliJ yet.

        I haven't looked at everything that gets set by the idea/eclipse targets, but I would have hoped that those targets would force all formatting templates at the project level. I know that for eclipse, the general formatting (indentation, etc) is set by the eclipse target.

        Show
        elyograg Shawn Heisey added a comment - I use eclipse. Hopefully one day I can switch, but haven't had enough time to learn IntelliJ yet. I haven't looked at everything that gets set by the idea/eclipse targets, but I would have hoped that those targets would force all formatting templates at the project level. I know that for eclipse, the general formatting (indentation, etc) is set by the eclipse target.
        Hide
        romseygeek Alan Woodward added a comment -

        Patch with import noise removed.

        Show
        romseygeek Alan Woodward added a comment - Patch with import noise removed.
        Hide
        rcmuir Robert Muir added a comment -

        Thanks. In the CHANGES file can we say no longer extends DocsEnum (i think its just a typo).

        +1

        Show
        rcmuir Robert Muir added a comment - Thanks. In the CHANGES file can we say no longer extends DocsEnum (i think its just a typo). +1
        Hide
        romseygeek Alan Woodward added a comment -

        Oops, yes. I think I need to go to bed.

        Show
        romseygeek Alan Woodward added a comment - Oops, yes. I think I need to go to bed.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1661376 from Alan Woodward in branch 'dev/trunk'
        [ https://svn.apache.org/r1661376 ]

        LUCENE-6272: Scorer extends DocIdSetIterator directly

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1661376 from Alan Woodward in branch 'dev/trunk' [ https://svn.apache.org/r1661376 ] LUCENE-6272 : Scorer extends DocIdSetIterator directly
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1661381 from Alan Woodward in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1661381 ]

        LUCENE-6272: Scorer extends DocIdSetIterator directly

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1661381 from Alan Woodward in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1661381 ] LUCENE-6272 : Scorer extends DocIdSetIterator directly
        Hide
        rcmuir Robert Muir added a comment -

        Thanks Alan!

        Show
        rcmuir Robert Muir added a comment - Thanks Alan!
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1661396 from Adrien Grand in branch 'dev/trunk'
        [ https://svn.apache.org/r1661396 ]

        LUCENE-6270: Fix merge issues with LUCENE-6272.

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1661396 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1661396 ] LUCENE-6270 : Fix merge issues with LUCENE-6272 .
        Hide
        thelabdude Timothy Potter added a comment -

        Bulk close after 5.1 release

        Show
        thelabdude Timothy Potter added a comment - Bulk close after 5.1 release

          People

          • Assignee:
            romseygeek Alan Woodward
            Reporter:
            romseygeek Alan Woodward
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development