Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.0
    • Fix Version/s: 7.0, 6.4
    • Component/s: core/search
    • Labels:
    • Lucene Fields:
      New

      Description

      Add axiomatic similarity approaches to the similarity family.
      More details can be found at http://dl.acm.org/citation.cfm?id=1076116 and https://www.eecis.udel.edu/~hfang/pubs/sigir05-axiom.pdf
      There are in total six similarity models. All of them are based on BM25, Pivoted Document Length Normalization or Language Model with Dirichlet prior.
      We think it is worthy to add the models as part of Lucene.

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user Peilin-Yang opened a pull request:

          https://github.com/apache/lucene-solr/pull/83

          LUCENE-7466 add axiomatic similarity

          Add axiomatic similarity approaches to the similarity family.
          More details can be found at http://dl.acm.org/citation.cfm?id=1076116 and https://www.eecis.udel.edu/~hfang/pubs/sigir05-axiom.pdf
          There are in total six similarity models. All of them are based on BM25, Pivoted Document Length Normalization or Language Model with Dirichlet prior.
          We think it is worthy to add the models as part of Lucene.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/Peilin-Yang/lucene-solr add_axiomatic_funcs

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/lucene-solr/pull/83.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #83


          commit a71d7d9662b57cc704b9c92d2c013826d9ad8c73
          Author: Peilin Yang <yangpeilyn@gmail.com>
          Date: 2016-09-24T17:56:28Z

          add axiomatic similarity


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user Peilin-Yang opened a pull request: https://github.com/apache/lucene-solr/pull/83 LUCENE-7466 add axiomatic similarity Add axiomatic similarity approaches to the similarity family. More details can be found at http://dl.acm.org/citation.cfm?id=1076116 and https://www.eecis.udel.edu/~hfang/pubs/sigir05-axiom.pdf There are in total six similarity models. All of them are based on BM25, Pivoted Document Length Normalization or Language Model with Dirichlet prior. We think it is worthy to add the models as part of Lucene. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Peilin-Yang/lucene-solr add_axiomatic_funcs Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/83.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #83 commit a71d7d9662b57cc704b9c92d2c013826d9ad8c73 Author: Peilin Yang <yangpeilyn@gmail.com> Date: 2016-09-24T17:56:28Z add axiomatic similarity
          Hide
          teofili Tommaso Teofili added a comment -

          thanks Peilin Yang for your patch, here're a couple of comments:

          • I think a testcase for all the added models should be provided in order to make sure they work as expected
          • the changes to NumericDocValues, FloatDocValues and DoubleDocValues break some tests as it seems NDV always returns a Long while FDV and DDV convert such a Long value to an Integer and then back to a Float / Double using Float.intBitsToFloat / Double.intBitsToDouble, can you clarify if / why is that needed for axiomatic similarity ? (if I remove the mentioned changes all the tests pass but then I'm not sure if that has an impact on the Axiomatic similarities because of the missing tests)
          Show
          teofili Tommaso Teofili added a comment - thanks Peilin Yang for your patch, here're a couple of comments: I think a testcase for all the added models should be provided in order to make sure they work as expected the changes to NumericDocValues , FloatDocValues  and DoubleDocValues  break some tests as it seems NDV always returns a Long while FDV and DDV convert such a Long value to an Integer and then back to a Float / Double using Float.intBitsToFloat / Double.intBitsToDouble , can you clarify if / why is that needed for axiomatic similarity ? (if I remove the mentioned changes all the tests pass but then I'm not sure if that has an impact on the Axiomatic similarities because of the missing tests)
          Hide
          ypeilin Peilin Yang added a comment -

          Hi Tommaso Teofili, thanks for your feedback.
          For the `NDV`, `FDV` and `DDV` you've mentioned, could you please directly annotate them in the PR?
          It is more clear that what line or which part of the PR you refer to.

          Show
          ypeilin Peilin Yang added a comment - Hi Tommaso Teofili , thanks for your feedback. For the `NDV`, `FDV` and `DDV` you've mentioned, could you please directly annotate them in the PR? It is more clear that what line or which part of the PR you refer to.
          Hide
          teofili Tommaso Teofili added a comment -

          sorry for the confusion, forget about the NumericDocValues related comment, that came from another leftover patch I had applied locally.
          Therefore it would just be good to have some tests for the axiom similarities, everything else looks good to me.

          Show
          teofili Tommaso Teofili added a comment - sorry for the confusion, forget about the NumericDocValues related comment, that came from another leftover patch I had applied locally. Therefore it would just be good to have some tests for the axiom similarities, everything else looks good to me.
          Hide
          ypeilin Peilin Yang added a comment -

          ok, will add test cases

          Show
          ypeilin Peilin Yang added a comment - ok, will add test cases
          Hide
          ypeilin Peilin Yang added a comment -

          Hi Tommaso Teofili I just added the test cases.
          But when I run `ant test` it fails for some other tests.
          Do you know a easier way to just test the test cases I added?

          Show
          ypeilin Peilin Yang added a comment - Hi Tommaso Teofili I just added the test cases. But when I run `ant test` it fails for some other tests. Do you know a easier way to just test the test cases I added?
          Hide
          teofili Tommaso Teofili added a comment - - edited

          when running 'ant clean test' under lucene/core the only error I see is in TestAxiomaticSimilarity#testIllegalQL which fails because the test has a wrong String in the expected.getMessage().contains("...") check (note also that testSaneNormValues uses BM25Similarity, I have locally changed it to AxiomaticF2EXP).
          Other than that it seems the TestAxiomaticSimilarity actually tests only AxiomaticF2EXP, shouldn't it also test the other Axiomatic extensions?

          You can check the different test options on the wiki Running Tests

          Show
          teofili Tommaso Teofili added a comment - - edited when running 'ant clean test' under lucene/core the only error I see is in TestAxiomaticSimilarity#testIllegalQL which fails because the test has a wrong String in the expected.getMessage().contains("...") check (note also that testSaneNormValues uses BM25Similarity , I have locally changed it to AxiomaticF2EXP ). Other than that it seems the TestAxiomaticSimilarity  actually tests only AxiomaticF2EXP , shouldn't it also test the other Axiomatic  extensions? You can check the different test options on the wiki Running Tests
          Hide
          ypeilin Peilin Yang added a comment -

          Thanks for pointing this out.

          For the test cases, since all of the variations extend from the base Axiomatic class and all the constructors all basically the same (except AxiomaticF3EXP where a queryLen is needed that is why there is a QL test) so I just pick F2EXP to test.

          Does this make any sense to you?

          Show
          ypeilin Peilin Yang added a comment - Thanks for pointing this out. For the test cases, since all of the variations extend from the base Axiomatic class and all the constructors all basically the same (except AxiomaticF3EXP where a queryLen is needed that is why there is a QL test) so I just pick F2EXP to test. Does this make any sense to you?
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 4236da27d1b1cbced6c3fed4b3d3094fe796fa7e in lucene-solr's branch refs/heads/master from Tommaso Teofili
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4236da2 ]

          LUCENE-7466 - added axiomatic similarity, patch from Peilin Yang

          Show
          jira-bot ASF subversion and git services added a comment - Commit 4236da27d1b1cbced6c3fed4b3d3094fe796fa7e in lucene-solr's branch refs/heads/master from Tommaso Teofili [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4236da2 ] LUCENE-7466 - added axiomatic similarity, patch from Peilin Yang
          Hide
          teofili Tommaso Teofili added a comment -

          thanks Peilin Yang, I've applied your patch (with minor fixes to javadoc and unused imports).

          Show
          teofili Tommaso Teofili added a comment - thanks Peilin Yang , I've applied your patch (with minor fixes to javadoc and unused imports).
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 033614692a743a07227fc96fc46bffb00b407db6 in lucene-solr's branch refs/heads/branch_6x from Tommaso Teofili
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0336146 ]

          LUCENE-7466 - added axiomatic similarity, patch from Peilin Yang
          (cherry picked from commit 4236da2)

          Show
          jira-bot ASF subversion and git services added a comment - Commit 033614692a743a07227fc96fc46bffb00b407db6 in lucene-solr's branch refs/heads/branch_6x from Tommaso Teofili [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0336146 ] LUCENE-7466 - added axiomatic similarity, patch from Peilin Yang (cherry picked from commit 4236da2)
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 4ce7ce08442e191ca6c013bda2e052a91f27b2f4 in lucene-solr's branch refs/heads/branch_6x from Tommaso Teofili
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4ce7ce0 ]

          LUCENE-7466 - adjusted changes.txt to reflect added axiomatic sim

          Show
          jira-bot ASF subversion and git services added a comment - Commit 4ce7ce08442e191ca6c013bda2e052a91f27b2f4 in lucene-solr's branch refs/heads/branch_6x from Tommaso Teofili [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4ce7ce0 ] LUCENE-7466 - adjusted changes.txt to reflect added axiomatic sim
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit c3f172a40830b31d005dbb7c6bd518ea236aa5fb in lucene-solr's branch refs/heads/master from Tommaso Teofili
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c3f172a ]

          LUCENE-7466 - adjusted changes.txt to reflect added axiomatic sim

          Show
          jira-bot ASF subversion and git services added a comment - Commit c3f172a40830b31d005dbb7c6bd518ea236aa5fb in lucene-solr's branch refs/heads/master from Tommaso Teofili [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c3f172a ] LUCENE-7466 - adjusted changes.txt to reflect added axiomatic sim
          Hide
          mikemccand Michael McCandless added a comment -

          Can this issue be resolved now?

          Show
          mikemccand Michael McCandless added a comment - Can this issue be resolved now?
          Hide
          teofili Tommaso Teofili added a comment -

          sure, thanks Mike.

          Show
          teofili Tommaso Teofili added a comment - sure, thanks Mike.
          Hide
          ypeilin Peilin Yang added a comment -

          Sure. Please feel free to close the issue.

          On Mon, Nov 28, 2016 at 1:05 PM Michael McCandless (JIRA) <jira@apache.org>

          Show
          ypeilin Peilin Yang added a comment - Sure. Please feel free to close the issue. On Mon, Nov 28, 2016 at 1:05 PM Michael McCandless (JIRA) <jira@apache.org>
          Hide
          mikemccand Michael McCandless added a comment -

          I'm confused: why does the issue say it's Open yet I only see a Reopen Issue or Close Issue buttons here?

          Show
          mikemccand Michael McCandless added a comment - I'm confused: why does the issue say it's Open yet I only see a Reopen Issue or Close Issue buttons here?
          Hide
          teofili Tommaso Teofili added a comment - - edited

          well .. that's weird, I had set it to resolved back on Nov 20th (click on the 'All' tab), but then when you commented I saw it was still unresolved and therefore assumed it was reopened by someone else.
          Now it looks resolved because you can close and reopen, but also unresolved as per current resolution value ...

          Show
          teofili Tommaso Teofili added a comment - - edited well .. that's weird, I had set it to resolved back on Nov 20th (click on the 'All' tab), but then when you commented I saw it was still unresolved and therefore assumed it was reopened by someone else. Now it looks resolved because you can close and reopen, but also unresolved as per current resolution value ...
          Hide
          mikemccand Michael McCandless added a comment -

          OK it was definitely in some bizarro state But I think I fixed it by reopening and then resolving again!

          Show
          mikemccand Michael McCandless added a comment - OK it was definitely in some bizarro state But I think I fixed it by reopening and then resolving again!

            People

            • Assignee:
              teofili Tommaso Teofili
              Reporter:
              ypeilin Peilin Yang
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development