  1. Lucene - Core
  2. LUCENE-8477

Improve handling of inner disjunctions in intervals

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 8.1
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

    Description

      The current implementation of the disjunction interval produced by Intervals.or is a direct implementation of the OR operator from the Vigna paper. It produces minimal intervals, so (a) is preferred over (a b), and (b) is likewise preferred over (a b). This is an advantage when counting intervals for scoring, but a drawback when matching. For example, a phrase query for ((a OR (a b)) BLOCK (c)) will not match the document (a b c): the disjunction prefers (a) over (a b), and (a) is not immediately followed by (c).
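
      The failure described above can be reproduced with a small toy model. The sketch below does not use Lucene at all: the class MinimalDisjunctionSketch and its methods term, block, union and minimalOr are hypothetical names invented for illustration, and an interval is simply a {start, end} pair of term positions. It assumes that "minimal" means dropping any interval that properly contains another, and it contrasts that with keeping every alternative. The contrast is only there to make the problem visible; it is not a description of how this ticket was resolved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only (not Lucene code): an interval is {start, end} in term positions.
final class MinimalDisjunctionSketch {

    // Every occurrence of a single term, as a one-position interval [i, i].
    static List<int[]> term(List<String> doc, String t) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < doc.size(); i++) {
            if (doc.get(i).equals(t)) {
                out.add(new int[] {i, i});
            }
        }
        return out;
    }

    // BLOCK: a left interval immediately followed by a right interval.
    static List<int[]> block(List<int[]> left, List<int[]> right) {
        List<int[]> out = new ArrayList<>();
        for (int[] l : left) {
            for (int[] r : right) {
                if (l[1] + 1 == r[0]) {
                    out.add(new int[] {l[0], r[1]});
                }
            }
        }
        return out;
    }

    // Plain union: keeps every alternative, including overlapping ones.
    static List<int[]> union(List<int[]> x, List<int[]> y) {
        List<int[]> out = new ArrayList<>(x);
        out.addAll(y);
        return out;
    }

    // Minimal-interval OR: union, then drop any interval that properly contains another.
    static List<int[]> minimalOr(List<int[]> x, List<int[]> y) {
        List<int[]> all = union(x, y);
        List<int[]> out = new ArrayList<>();
        for (int[] cand : all) {
            boolean containsOther = false;
            for (int[] other : all) {
                boolean inside = cand[0] <= other[0] && other[1] <= cand[1];
                boolean same = cand[0] == other[0] && cand[1] == other[1];
                if (inside && !same) {
                    containsOther = true;
                }
            }
            if (!containsOther) {
                out.add(cand);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("a", "b", "c");
        List<int[]> a = term(doc, "a");
        List<int[]> ab = block(a, term(doc, "b"));
        List<int[]> c = term(doc, "c");

        // (a OR (a b)) BLOCK (c) with minimal intervals: (a b) is discarded in favour of (a).
        boolean minimalMatches = !block(minimalOr(a, ab), c).isEmpty();
        // The same query when both alternatives are kept.
        boolean unionMatches = !block(union(a, ab), c).isEmpty();

        System.out.println("minimal OR matches: " + minimalMatches); // false
        System.out.println("all alternatives match: " + unionMatches); // true
    }
}
```

      In this model the minimal disjunction yields only [0, 0] for (a), which is not adjacent to (c) at position 2, so the block fails; keeping (a b) as [0, 1] lets the block match as [0, 2].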

      This ticket is to discuss the best way of dealing with disjunctions.

      Attachments

        1. LUCENE-8477.patch
          19 kB
          Alan Woodward
        2. LUCENE-8477.patch
          16 kB
          Alan Woodward
        3. LUCENE-8477.patch
          40 kB
          Alan Woodward
        4. LUCENE-8477.patch
          45 kB
          Alan Woodward

            People

              Assignee: Alan Woodward (romseygeek)
              Reporter: Alan Woodward (romseygeek)
              Votes: 0
              Watchers: 9

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m