Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-9625

Benchmark KNN search with ann-benchmarks

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None
    • New

    Description

      In addition to benchmarking with luceneutil, it would be good to be able to make use of ann-benchmarks, which is publishing results from many approximate knn algorithms, including the hnsw implementation from its authors. We don't expect to challenge the performance of these native code libraries, however it would be good to know just how far off we are.

      I started looking into this and posted a fork of ann-benchmarks that uses KnnGraphTester class to run these: https://github.com/msokolov/ann-benchmarks. It's still a WIP; you have to manually copy jars and the KnnGraphTester.class to the test host machine rather than downloading from a distribution. KnnGraphTester needs some modifications in order to support this process - this issue is mostly about that.

      One thing I noticed is that some of the index builds with higher fanout (efConstruction) settings time out at 2h (on an AWS c5 instance), so this is concerning and I'll open a separate issue for trying to improve that.

      Attachments

        Activity

          People

            Unassigned Unassigned
            sokolov Michael Sokolov
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 10m
                10m