[LUCENE-10421] Non-deterministic results from KnnVectorQuery? - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 9.1
Component/s: None
Labels:
- vector-based-search

Lucene Fields: New

Description

Nightly benchmarks have been upset for the past ~1.5 weeks because KnnVectorQuery appears to give slightly different results on every run, even though the index is identical each night (deterministically constructed: single-threaded indexing, flush by doc count, SerialMergeScheduler, LogDocMergePolicy, etc.). This produces failures like the one below, which abort the benchmark so that we catch any recent accidental bug that alters our precise top N search hits and scores:

Traceback (most recent call last):
  File "/l/util.nightly/src/python/nightlyBench.py", line 2177, in <module>
    run()
  File "/l/util.nightly/src/python/nightlyBench.py", line 1225, in run
    raise RuntimeError('search result differences: %s' % str(errors))
RuntimeError: search result differences: ["query=KnnVectorQuery:vector[-0.07267512,...][10] filter=None sort=None groupField=None hitCount=10: hit 4 has wrong field/score value ([20844660], '0.92060816') vs ([25443806], '0.920046')", "query=KnnVectorQuery:vector[-0.12073054,...][10] filter=None sort=None groupField=None hitCount=10: hit 7 has wrong field/score value ([25501982], '0.99630797') vs ([13688085], '0.9961489')", "query=KnnVectorQuery:vector[0.02227773,...][10] filter=None sort=None groupField=None hitCount=10: hit 0 has wrong field/score value ([4741915], '0.9481132') vs ([14220828], '0.9579846')", "query=KnnVectorQuery:vector[0.024077624,...][10] filter=None sort=None groupField=None hitCount=10: hit 0 has wrong field/score value ([7472373], '0.8460249') vs ([12577825], '0.8378446')"]

At first I thought this might be expected because of the recent (awesome!!) improvements to HNSW, so I tried to simply "regold". But the regold did not "take", so it indeed looks like there is some non-determinism here.

I pinged msokolov@gmail.com and he found this random seeding that is most likely the cause?

public final class HnswGraphBuilder {

  /** Default random seed for level generation * */
  private static final long DEFAULT_RAND_SEED = System.currentTimeMillis();

Can we somehow make this deterministic instead? Or maybe the nightly benchmarks could somehow pass something in to make results deterministic for benchmarking? Or ... we could also relax the benchmarks to accept non-determinism for KnnVectorQuery task?
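To illustrate the first option, here is a minimal standalone sketch of why a wall-clock seed makes level assignment differ between runs while a fixed seed does not. It is not Lucene's HnswGraphBuilder: the class name SeededLevelSketch, the method assignLevels, the constant FIXED_SEED (42 is an arbitrary value) and the choice of 16 max connections per node are all assumptions for illustration. Only the idea of seeding level generation comes from the snippet above.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

/** Hypothetical sketch of seeded HNSW-style level generation; not Lucene code. */
public class SeededLevelSketch {

  /** Fixed seed; the value is arbitrary and assumed here for illustration only. */
  static final long FIXED_SEED = 42L;

  /** Draws one level per node from an exponential distribution scaled by ml. */
  static int[] assignLevels(long seed, int numNodes, double ml) {
    SplittableRandom random = new SplittableRandom(seed);
    int[] levels = new int[numNodes];
    for (int i = 0; i < numNodes; i++) {
      double r;
      do {
        r = random.nextDouble(); // in [0, 1); retry on 0 because log(0) is undefined
      } while (r == 0.0);
      levels[i] = (int) (-Math.log(r) * ml);
    }
    return levels;
  }

  public static void main(String[] args) {
    double ml = 1.0 / Math.log(16); // assumes 16 max connections per node

    // Same seed twice: the level sequence, and hence the graph shape, repeats.
    int[] a = assignLevels(FIXED_SEED, 1000, ml);
    int[] b = assignLevels(FIXED_SEED, 1000, ml);
    System.out.println("fixed seed repeats: " + Arrays.equals(a, b));

    // Wall-clock seed: the sequence depends on when the process started.
    int[] c = assignLevels(System.currentTimeMillis(), 1000, ml);
    System.out.println("wall-clock seed matches fixed seed: " + Arrays.equals(a, c));
  }
}
```

With a fixed seed the first comparison always holds, so an index built from the same documents in the same order would get the same graph levels on every run. The second comparison depends on the clock, which is the kind of run-to-run variation the benchmark is tripping over.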

Issue Links

is related to: LUCENE-10423 Remove uses of wall-clock time in codebase (Open)
links to: GitHub Pull Request #686

People

Assignee: Unassigned
Reporter: Michael McCandless
Votes: 0
Watchers: 5

Dates

Created: 14/Feb/22 14:19
Updated: 27/Sep/22 09:16
Resolved: 25/Feb/22 06:35

Time Tracking

Estimated: Not Specified
Remaining:
Logged: 40m