Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.0-ALPHA
    • Component/s: core/search
    • Labels:
    • Lucene Fields:
      New, Patch Available

      Description

      The DFR normalizations H1 and H2 are parameter-free. This is in line with the original article, but not with the thesis, where H2 accepts a c parameter, nor with information-based models, where H1 also accepts a c parameter.
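      To make the request concrete, here is a minimal sketch of how a c hyper-parameter could enter the two normalizations. It assumes the usual DFR formulations, tfn = tf * c * avgLength / docLength for H1 and tfn = tf * log2(1 + c * avgLength / docLength) for H2. The class and method names are hypothetical and are not taken from the attached patches. With c = 1, both reduce to the parameter-free forms.

```java
// Hypothetical illustration only: not the code from LUCENE-3566.patch.
final class DfrNormalizationSketch {
    private DfrNormalizationSketch() {}

    // Base-2 logarithm, as used throughout the DFR framework.
    public static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // H1 (uniform distribution of term frequency), scaled by c.
    // Assumed form: tfn = tf * c * avgLength / docLength.
    public static double h1(double tf, double docLength, double avgLength, double c) {
        return tf * c * avgLength / docLength;
    }

    // H2 (term frequency density inversely related to length), with c inside the log.
    // Assumed form: tfn = tf * log2(1 + c * avgLength / docLength).
    public static double h2(double tf, double docLength, double avgLength, double c) {
        return tf * log2(1.0 + c * avgLength / docLength);
    }

    public static void main(String[] args) {
        double tf = 3.0, docLength = 200.0, avgLength = 100.0;
        // c = 1.0 reproduces the parameter-free behaviour; other values rescale
        // the length normalization.
        for (double c : new double[] {1.0, 2.0, 7.0}) {
            System.out.printf("c=%.1f  H1=%.4f  H2=%.4f%n",
                    c, h1(tf, docLength, avgLength, c), h2(tf, docLength, avgLength, c));
        }
    }
}
```

      Keeping c = 1 as the default means existing behaviour is unchanged unless a caller opts in, which matches the point made in the comments about having defaults.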

      1. LUCENE-3566.patch
        3 kB
        David Mark Nemeskey
      2. LUCENE-3566.patch
        3 kB
        David Mark Nemeskey
      3. LUCENE-3566.patch
        9 kB
        Robert Muir

        Activity

        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Robert Muir made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Robert Muir added a comment -

        Thanks David!

        Robert Muir made changes -
        Attachment LUCENE-3566.patch [ 12512960 ]
        Robert Muir added a comment -

        I thought we had done this already, but realized I had forgotten about it!

        I added the Solr factory/parsing stuff to the patch. Will commit shortly.

        David Mark Nemeskey made changes -
        Attachment LUCENE-3566.patch [ 12502756 ]
        David Mark Nemeskey added a comment -

        Patch re-based on trunk.

        Robert Muir added a comment -

        Yeah, I agree... maybe in the patch we can expose the parameter to the factory in Solr (DFRSimilarityFactory has a param-parsing method for Normalization that is reused by IB, too)?

        David Mark Nemeskey added a comment -

        "I didn't think H1 took params (the thesis says 'Therefore, the constant of C is 1 assuming H1', then defines it without C). Did the IB paper make a mistake?"

        Good question. Perhaps it was a mistake; however, according to my colleague, who had experimented with the IB method in our own engine and proposed adding the parameter to Lucene, a well-chosen c can improve the results. Well, duh really; nevertheless, as long as we have defaults, it shouldn't be a problem.

        Robert Muir made changes -
        Fix Version/s 4.0 [ 12314025 ]
        Fix Version/s flexscoring branch [ 12316437 ]
        Affects Version/s 4.0 [ 12314025 ]
        Affects Version/s flexscoring branch [ 12316437 ]
        Robert Muir added a comment -

        Editing fix version to 4.0: since the flexscoring branch was merged, I think we can safely do any scoring improvements in mainline trunk.

        Robert Muir added a comment -

        +1, let's add these.

        I didn't think H1 took params (the thesis says 'Therefore, the constant of C is 1 assuming H1', then defines it without C). Did the IB paper make a mistake?

        Either way, it won't hurt anything to add the parameter; it's just confusing.

        David Mark Nemeskey made changes -
        Status In Progress [ 3 ] Open [ 1 ]
        David Mark Nemeskey made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        David Mark Nemeskey made changes -
        Lucene Fields New [ 10121 ] New,Patch Available [ 10121,10120 ]
        David Mark Nemeskey made changes -
        Field Original Value New Value
        Attachment LUCENE-3566.patch [ 12502748 ]
        David Mark Nemeskey added a comment -

        Patch.

        David Mark Nemeskey created issue -

          People

          • Assignee:
            David Mark Nemeskey
            Reporter:
            David Mark Nemeskey
          • Votes:
            0
            Watchers:
            1

              Time Tracking

              Estimated:
              1h
              Remaining:
              1h
              Logged:
              Not Specified
