Lucene - Core / LUCENE-7254

DocIDSetBuilder is no good for points


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.1, 7.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      For postings lists, I think this approach works well in dense cases (e.g. whole DISIs are added, things are coming in order, etc).

      However in the points case, it holds back range performance significantly. There are a couple of problems here:

      • Expensive cardinality computation (this is a 2% hit) when it's totally unnecessary. We can use index statistics to help here.
      • Lots of conditional logic in add(). This includes growing checks, bitset switching checks and so on, which happen even if you are smart and call grow(). This stuff all adds up.
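
To make the two complaints concrete, here is a minimal sketch of what a points-specific collector could look like. It is not Lucene's DocIdSetBuilder and not necessarily the fix that was committed: the class PointsDocIdBuffer and all of its methods are hypothetical names invented for illustration. The assumptions are that the caller reserves space once per block via grow() so that add() needs no checks, and that a cost estimate can be derived from two index statistics (total point count and number of documents with points) instead of counting set bits.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch (not a Lucene class): buffers doc IDs visited while
 * walking a points tree, keeping add() free of per-call checks.
 */
final class PointsDocIdBuffer {
    private int[] docs = new int[0];
    private int size = 0;

    /** Reserve room for up to count more doc IDs; intended to be called once per block. */
    void grow(int count) {
        int needed = size + count;
        if (needed > docs.length) {
            docs = Arrays.copyOf(docs, Math.max(needed, docs.length * 2));
        }
    }

    /**
     * Unconditional append: no capacity check and no sparse/dense switch.
     * The caller must have reserved space with grow() beforehand.
     */
    void add(int doc) {
        docs[size++] = doc;
    }

    /**
     * Estimate the number of matching documents from index statistics
     * (assumed inputs: total points in the field and docs that have a point),
     * rather than computing the exact cardinality of the collected set.
     */
    long estimateCost(long pointCount, int docCount) {
        if (docCount <= 0 || pointCount < docCount) {
            return size; // no usable statistics: fall back to the raw count
        }
        double valuesPerDoc = (double) pointCount / docCount;
        return Math.round(size / valuesPerDoc);
    }

    /** Return the collected doc IDs sorted and de-duplicated. */
    int[] build() {
        int[] sorted = Arrays.copyOf(docs, size);
        Arrays.sort(sorted);
        int n = 0;
        for (int i = 0; i < sorted.length; i++) {
            if (i == 0 || sorted[i] != sorted[i - 1]) {
                sorted[n++] = sorted[i];
            }
        }
        return Arrays.copyOf(sorted, n);
    }
}
```

The design choice illustrated here is to push all bookkeeping into grow() and build(), so the hot add() path is a single array store. Whether sorting a buffer at the end beats a bitset for dense matches is not something this sketch settles.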

      I don't think we should try to create a magical shared API that is efficient both for postings lists of unstructured stuff and for point collection on structured fields. Instead we should just do things differently for points and iterate from there.

            People

            • Assignee:
              Unassigned
              Reporter:
              rcmuir Robert Muir
