Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-10302

PriorityQueue: optimize where we collect then iterate by using O(N) heapify

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None
    • New

    Description

      Looking at LUCENE-8875 (LargeNumHitsTopDocsCollector.java ) I got to wondering if there was faster-than O(N*log(N)) way of loading a PriorityQueue when we provide a bulk array to initialize the heap/PriorityQueue. It turns out there is: the JDK's PriorityQueue supports this in its constructors, referring to "This classic algorithm due to Floyd (1964) is known to be O(size)" – heapify() method. There's another that may or may not be the same; I didn't look too closely yet. I see a number of uses of Lucene's PriorityQueue that first collects values and only after collecting want to do something with the results (typical / unsurprising). This lends itself to a builder pattern that can look similar to LargeNumHitsTopDocsCollector in terms of first having an array used like a list and then move over to the PriorityQueue if/when it gets full (it may not).

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              dsmiley David Smiley
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated: