Lucene - Core / LUCENE-4933

SweetSpotSimilarity doesn't override tf(float)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 4.4, 6.0
    • Component/s: core/query/scoring
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This means its scoring is not really right: the custom tf curve only applies to term queries and exact phrase queries, but not to e.g. sloppy phrase queries and spans, which go through tf(float).

      As far as I can tell, it's had this bug all along.
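To make the trap concrete, here is a minimal sketch in plain Java. The class names (BaseSimilarity, PlateauSimilarity, TfTrapDemo) and the plateau curve are hypothetical stand-ins invented for illustration; they are not the real Lucene classes and do not reproduce SweetSpotSimilarity's actual formula. The sketch only assumes a base class with two tf overloads and a subclass that customizes one of them.

```java
// Hypothetical demo, not Lucene code: shows how overriding only tf(int)
// leaves tf(float) on the inherited default.
class TfTrapDemo {
    public static void main(String[] args) {
        BaseSimilarity sim = new PlateauSimilarity();
        System.out.println(sim.tf(4));    // int overload: custom curve, prints 1.0
        System.out.println(sim.tf(4.0f)); // float overload: inherited default, prints 2.0
    }
}

class BaseSimilarity {
    // Assumed to be the path for term queries and exact phrases (integer frequencies).
    float tf(int freq) {
        return tf((float) freq);
    }

    // Assumed to be the path for sloppy phrases and spans (fractional frequencies).
    float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}

class PlateauSimilarity extends BaseSimilarity {
    // Only the int overload is customized; the flat plateau is an arbitrary demo curve.
    @Override
    float tf(int freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
    // tf(float) is not overridden, so it silently falls back to sqrt(freq).
}
```

The same frequency is scored differently depending on which overload the caller happens to use, which is the kind of inconsistency the description reports.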

        Activity

        Robert Muir added a comment -

        patch removing score(int)/tf(int) so the trap doesn't even exist anymore.

        Hoss Man added a comment -

        "As far as I can tell, it's had this bug all along."

        definitely a non-intentional fuck up from the very beginning

        "patch removing score(int)/tf(int) so the trap doesn't even exist anymore."

        i only skimmed the patch and didn't understand most of it (need to look at it in context more), but if i'm understanding the crux of it, you don't just mean from SSS: you mean you want to remove tf(int) from TFIDFSimilarity entirely?

        isn't that kind of a baby vs bathwater situation?

        historically the value of having both tf(int) and tf(float) was that people could choose to implement alternative functions for dealing with phrase frequency (using tf(float)) vs single term queries (using tf(int)) ... is that still possible in some other way in all of the other changes in your patch that i didn't see in my quick skim?

        Robert Muir added a comment -

        "historically the value of having both tf(int) and tf(float) was that people could choose to implement alternative functions for dealing with phrase frequency (using tf(float)) vs single term queries (using tf(int))"

        There is no value in having different functions here: only the possibility of bugs.
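As a rough illustration of the single-function design argued for here, the sketch below keeps only tf(float). The names (SingleTfSimilarity, PlateauTf, SingleTfDemo) and the plateau curve are hypothetical and are not taken from the patch; the point is only that with one method, integer and fractional frequencies cannot end up on different curves.

```java
// Hypothetical demo, not the actual patch: one tf method serves every caller.
class SingleTfDemo {
    public static void main(String[] args) {
        SingleTfSimilarity sim = new PlateauTf();
        System.out.println(sim.tf(4));    // int argument widens to float: prints 1.0
        System.out.println(sim.tf(0.5f)); // fractional frequency, same curve: prints 1.0
    }
}

abstract class SingleTfSimilarity {
    // The only tf hook: a subclass cannot customize one caller and miss another.
    abstract float tf(float freq);
}

class PlateauTf extends SingleTfSimilarity {
    // Arbitrary demo curve: flat plateau for any positive frequency.
    @Override
    float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}
```

The trade-off raised in the thread still applies: a design like this gives up the ability to score phrase frequencies and single-term frequencies with different functions.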

        Michael McCandless added a comment -

        +1 to remove the confusing double tf ... it seems incredibly trappy.

        Steve Rowe added a comment -

        Bulk close resolved 4.4 issues


          People

          • Assignee:
            Unassigned
          • Reporter:
            Robert Muir
          • Votes:
            0
          • Watchers:
            1
