  1. Lucene - Core
  2. LUCENE-6896

Fix/document various Similarity bugs around extreme norm values

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.5, 6.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

    Description

      Spinoff from LUCENE-6818:

      Ahmet Arslan found problems with every Similarity (except ClassicSimilarity) when trying to test how they behave on every possible norm value, to ensure they are robust for all index-time boosts.

      There are several problems:
      1. A buggy normalization decode causes the smallest possible norm value (0) to be treated as an infinitely long document. These values are intended to be encoded as non-negative finite values, and letting one go to infinity breaks everything.
      2. Various problems in the less practical functions, which already carry documented warnings that they misbehave for extreme values. These affect DFR models D, Be, and P, and IB distribution SPL.
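
As a rough illustration of problem 1, the sketch below shows how a decode table can end up with an infinite document length for norm byte 0, and one possible way to guard against it. It is not Lucene's code: `NormDecodeSketch`, `decodeNorm`, `naiveLengthTable`, and `guardedLengthTable` are hypothetical names, the decoder is a toy stand-in for a lossy one-byte encoding, and the assumption that length is recovered as 1/(norm*norm) is made only for this example.

```java
// Hypothetical sketch; none of these names are Lucene APIs.
final class NormDecodeSketch {

    // Toy stand-in for a one-byte norm decoder: byte 0 decodes to 0.0f,
    // every other byte to a positive finite float.
    static float decodeNorm(int b) {
        return b == 0 ? 0.0f : b / 255.0f;
    }

    // Naive table: assumes norm = 1/sqrt(length), so length = 1/(norm*norm).
    static float[] naiveLengthTable() {
        float[] table = new float[256];
        for (int i = 0; i < 256; i++) {
            float f = decodeNorm(i);
            table[i] = 1.0f / (f * f); // for i == 0 this is 1/0, i.e. Infinity
        }
        return table;
    }

    // Guarded table: byte 0 is pinned to the longest finite length in the table.
    static float[] guardedLengthTable() {
        float[] table = naiveLengthTable();
        float longest = 0.0f;
        for (int i = 1; i < 256; i++) {
            longest = Math.max(longest, table[i]);
        }
        table[0] = longest;
        return table;
    }

    public static void main(String[] args) {
        float[] naive = naiveLengthTable();
        float[] guarded = guardedLengthTable();
        System.out.println("naive length for norm 0:   " + naive[0]);
        System.out.println("guarded length for norm 0: " + guarded[0]);
        // Infinity divided by Infinity is NaN, which then poisons any score built on it.
        System.out.println("naive[0] / naive[0] is NaN: " + Float.isNaN(naive[0] / naive[0]));
    }
}
```

Pinning byte 0 to the longest finite length is only one option; the point is that every decoded value stays finite, so later arithmetic such as a length ratio cannot turn into NaN.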

          People

            Assignee: Unassigned
            Reporter: Robert Muir (rcmuir)