Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: master (7.0), 6.4
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      This can be used when the user doesn't want tf/idf scoring for some reason. The idea is that the score is just query_time_boost * index_time_boost, no queryNorm/IDF/TF/lengthNorm...

      1. LUCENE-5867.patch
        11 kB
        Adrien Grand
      2. LUCENE-5867.patch
        5 kB
        Robert Muir

        Activity

        rcmuir Robert Muir added a comment -

        Here's the start to a patch. No tests yet.

        mkhludnev Mikhail Khludnev added a comment -

        People often want coord-factor also.

        rcmuir Robert Muir added a comment -

        This similarity is already a coordinate-level match, because it ignores TF etc. completely and scores 1 for each matching term.
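
To illustrate the coordinate-level behaviour described above, here is a minimal plain-Java sketch. It is a toy model, not Lucene code: the class `BooleanScoreSketch` and its `score` method are hypothetical names, and it assumes a simple disjunction in which each matching term contributes only its query-time boost (1 by default).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of boolean scoring (hypothetical, for illustration only; not Lucene code). */
public class BooleanScoreSketch {

    /**
     * Sums the query-time boost of every query term that occurs in the document.
     * Term frequency, document length and index statistics are deliberately ignored.
     */
    static float score(Map<String, Float> boostedQueryTerms, List<String> docTokens) {
        // Collapsing tokens into a set discards term frequency.
        Set<String> distinctTokens = new HashSet<>(docTokens);
        float score = 0f;
        for (Map.Entry<String, Float> term : boostedQueryTerms.entrySet()) {
            if (distinctTokens.contains(term.getKey())) {
                score += term.getValue();
            }
        }
        return score;
    }

    public static void main(String[] args) {
        // Three query terms, each with the default boost of 1.
        Map<String, Float> query = new LinkedHashMap<>();
        query.put("lucene", 1f);
        query.put("similarity", 1f);
        query.put("solr", 1f);

        List<String> shortDoc = Arrays.asList("lucene", "similarity");
        List<String> longDoc = Arrays.asList(
                "lucene", "lucene", "lucene", "similarity", "and", "many", "other", "words");

        // Both documents match two of the three terms, so both score the same.
        System.out.println(score(query, shortDoc));
        System.out.println(score(query, longDoc));

        // Raising the query-time boost of one term is the only way to change its weight.
        query.put("lucene", 2f);
        System.out.println(score(query, shortDoc));
    }
}
```

In this model, a document's score with default boosts is simply the number of query terms it matches, regardless of how often a term repeats or how long the document is.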

        teofili Tommaso Teofili added a comment -

        +1

        jkrupan Jack Krupansky added a comment -

        Would this be expected to result in any dramatic improvement in indexing or query performance, or a dramatic reduction in index size?

        rcmuir Robert Muir added a comment -

        It is not intended to be faster or anything like that. The idea is that this is simpler to use for use-cases where the typical ranking "gets in the way".

        jpountz Adrien Grand added a comment -

        I'd like to revive this issue. Here is an updated patch against current master. Like the previous patch, it scores regardless of index statistics, document length, or term frequency. However, it does not take the index-time boost into account for scoring (only the query-time boost), and it encodes norms the same way as BM25Similarity, ClassicSimilarity, or SimilarityBase. The benefit is that this allows switching between the BM25, Classic, and Boolean similarities after the index has been created.

        thetaphi Uwe Schindler added a comment -

        Hi,
        I like the patch and the simplicity of it!
        Basically, this patch has the same effect as wrapping all my TermQuery and PhraseQuery instances in a ConstantScoreQuery and only applying BoostQuery() to them?
        In addition, when using this similarity, I could also just disable norms for all fields I use it on?

        jpountz Adrien Grand added a comment -

        Basically, this patch has the same effect as wrapping all my TermQuery and PhraseQuery instances in a ConstantScoreQuery and only applying BoostQuery() to them?

        This is correct.

        In addition, when using this similarity, I could also just disable norms for all fields I use it on?

        This is correct if you do not plan to switch to another similarity later on. And if you do not need phrase matching, you could also index with IndexOptions.DOCS (named DOCS_ONLY before Lucene 5) rather than IndexOptions.DOCS_AND_FREQS, since freqs are not used for scoring either.

        jira-bot ASF subversion and git services added a comment -

        Commit 3e15233b23197122e40a851edab7b7257ce63f02 in lucene-solr's branch refs/heads/master from Adrien Grand
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3e15233 ]

        LUCENE-5867: Add a BooleanSimilarity.

        jira-bot ASF subversion and git services added a comment -

        Commit 74da1cff27467ab540c343dd589832d9f417dd25 in lucene-solr's branch refs/heads/branch_6x from Adrien Grand
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=74da1cf ]

        LUCENE-5867: Add a BooleanSimilarity.

          People

          • Assignee: Unassigned
          • Reporter: rcmuir Robert Muir
          • Votes: 1
          • Watchers: 10
