Lucene - Core / LUCENE-10611

KnnVectorQuery throwing Heap Error for Restrictive Filters

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 9.3
    • Component/s: None
    • Labels: None

    Description

      The HNSW graph search does not account for the possibility that visitedLimit is reached while it is still searching the upper levels of the graph.

      This happens when the pre-filter is very restrictive (its match count sets the visitedLimit). Instead of switching over to exactSearch, the search tries to pop from an empty heap and throws an IllegalStateException.
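
      The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not Lucene's code: VisitedLimitSketch, LevelResult, searchLevel, descendUnchecked and descendChecked are hypothetical names, and the per-level visit counts are made up. It only assumes that a level search returns an empty heap once the visit budget is used up, and shows how checking an "incomplete" flag before popping lets the caller fall back to exact search.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of descending the upper levels of an HNSW graph
// under a visit budget. All names here are hypothetical.
final class VisitedLimitSketch {

  /** Stand-in for the per-level result heap plus bookkeeping. */
  static final class LevelResult {
    final Deque<Integer> heap = new ArrayDeque<>();
    int visitedCount = 0;
    boolean incomplete = false;

    int pop() {
      if (heap.isEmpty()) {
        throw new IllegalStateException("The heap is empty");
      }
      return heap.pop();
    }
  }

  /** Assumed behavior: with no budget left, nothing is visited and the heap stays empty. */
  static LevelResult searchLevel(int entryPoint, int nodesToVisit, int visitedLimit) {
    LevelResult result = new LevelResult();
    if (visitedLimit <= 0) {
      result.incomplete = true;
      return result;
    }
    result.heap.push(entryPoint);
    result.visitedCount = Math.min(nodesToVisit, visitedLimit);
    result.incomplete = nodesToVisit > visitedLimit;
    return result;
  }

  /** Pops unconditionally, so it throws once the budget runs out in an upper level. */
  static int descendUnchecked(int[] nodesPerUpperLevel, int visitedLimit) {
    int entryPoint = 0;
    for (int nodes : nodesPerUpperLevel) {
      LevelResult result = searchLevel(entryPoint, nodes, visitedLimit);
      entryPoint = result.pop();
      visitedLimit -= result.visitedCount;
    }
    return entryPoint;
  }

  /** Returns the entry point for the bottom level, or -1 to signal "use exact search". */
  static int descendChecked(int[] nodesPerUpperLevel, int visitedLimit) {
    int entryPoint = 0;
    for (int nodes : nodesPerUpperLevel) {
      LevelResult result = searchLevel(entryPoint, nodes, visitedLimit);
      if (result.incomplete) {
        return -1; // budget exhausted: do not pop, let the caller fall back
      }
      entryPoint = result.pop();
      visitedLimit -= result.visitedCount;
    }
    return entryPoint;
  }

  public static void main(String[] args) {
    int[] upperLevels = {3, 3}; // made-up visit counts for two upper levels
    System.out.println(descendChecked(upperLevels, 3)); // prints -1
    try {
      descendUnchecked(upperLevels, 3);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage()); // prints "The heap is empty"
    }
  }
}
```

      With a budget of 3, the first upper level consumes the whole budget, so the second level returns an empty heap. The unchecked variant then fails in pop, while the checked variant reports that the approximate search could not complete.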

       

      To reproduce this error, increase numDocs here to 20,000, so that nodes have more neighbors and visitedLimit is reached sooner.

       

      Stacktrace:

      The heap is empty
      java.lang.IllegalStateException: The heap is empty
      at __randomizedtesting.SeedInfo.seed([D7BC2F56048D9D1A:A1F576DD0E795BBF]:0)
      at org.apache.lucene.util.LongHeap.pop(LongHeap.java:111)
      at org.apache.lucene.util.hnsw.NeighborQueue.pop(NeighborQueue.java:98)
      at org.apache.lucene.util.hnsw.HnswGraphSearcher.search(HnswGraphSearcher.java:90)
      at org.apache.lucene.codecs.lucene92.Lucene92HnswVectorsReader.search(Lucene92HnswVectorsReader.java:236)
      at org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsReader.search(PerFieldKnnVectorsFormat.java:272)
      at org.apache.lucene.index.CodecReader.searchNearestVectors(CodecReader.java:235)
      at org.apache.lucene.search.KnnVectorQuery.approximateSearch(KnnVectorQuery.java:159) 

            People

              Assignee: Unassigned
              Reporter: Kaival Parikh (kaivalnp)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h