[LUCENE-10527] Use bigger maxConn for last layer in HNSW - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Task
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 9.2
Component/s: None
Labels:
None

Lucene Fields:

New

Description

Recently I was rereading the HNSW paper (https://arxiv.org/pdf/1603.09320.pdf) and noticed that they suggest using a different maxConn for the upper layers vs. the bottom one (which contains the full neighborhood graph). Specifically, they suggest using maxConn=M for upper layers and maxConn=2*M for the bottom. This differs from what we do, which is to use maxConn=M for all layers.

I tried updating our logic using a hacky patch, and noticed an improvement in latency for higher recall values (which is consistent with the paper's observation):

Results on glove-100-angular
Parameters: M=32, efConstruction=100

As we'd expect, indexing becomes a bit slower:

Baseline: Indexed 1183514 documents in 733s 
Candidate: Indexed 1183514 documents in 948s

When we benchmarked Lucene HNSW against hnswlib in ~~LUCENE-9937~~, we noticed a big difference in recall for the same settings of M and efConstruction. (Even adding graph layers in ~~LUCENE-10054~~ didn't really affect recall.) With this change, the recall is now very similar:

Results on glove-100-angular
Parameters: M=32, efConstruction=100

k            Approach                                                  Recall     QPS
10           luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.563     4410.499
50           luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.798     1956.280
100          luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.862     1209.734
500          luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.958      341.428
800          luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.974      230.396
1000         luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.980      188.757

10           hnswlib ({'M': 32, 'efConstruction': 100})                0.552    16745.433
50           hnswlib ({'M': 32, 'efConstruction': 100})                0.794     5738.468
100          hnswlib ({'M': 32, 'efConstruction': 100})                0.860     3336.386
500          hnswlib ({'M': 32, 'efConstruction': 100})                0.956      832.982
800          hnswlib ({'M': 32, 'efConstruction': 100})                0.973      541.097
1000         hnswlib ({'M': 32, 'efConstruction': 100})                0.979      442.163

I think it'd be nice update to maxConn so that we faithfully implement the paper's algorithm. This is probably least surprising for users, and I don't see a strong reason to take a different approach from the paper? Let me know what you think!

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

image-2022-04-20-14-53-58-484.png
20/Apr/22 21:53
28 kB
Julie Tibshirani
Screen Shot 2022-05-18 at 4.26.14 PM.png
18/May/22 23:26
387 kB
Julie Tibshirani
Screen Shot 2022-05-18 at 4.26.24 PM.png
18/May/22 23:26
108 kB
Julie Tibshirani
Screen Shot 2022-05-18 at 4.27.37 PM.png
18/May/22 23:27
111 kB
Julie Tibshirani

Issue Links

links to

GitHub Pull Request #872

GitHub Pull Request #887

Activity

People

Assignee:: Mayya Sharipova

Reporter:: Julie Tibshirani

Votes:: 0 Vote for this issue

Watchers:: 5 Start watching this issue

Dates

Created:: 20/Apr/22 22:14

Updated:: 28/Aug/22 16:39

Resolved:: 19/May/22 12:47

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

4h 40m