Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
Description
We have agreed on the following improvement to multi-probe nearest neighbor (NN) search:
(1) Use approxQuantile to obtain the hashDistance threshold instead of fully sorting the whole dataset
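To make the idea concrete, here is a minimal sketch in plain Python rather than Spark. It assumes the threshold is taken as the quantile at roughly k/n of the hash distances, with a small slack, and that only rows at or below that threshold are then ranked by true distance. The helper names `approx_quantile` and `approx_nearest`, the sampling-based quantile estimate, and the `slack` parameter are all hypothetical illustrations, not the Spark implementation. In Spark itself the quantile would come from `DataFrame.stat.approxQuantile(col, probabilities, relativeError)`.

```python
import random


def approx_quantile(values, q, sample_size=1000, seed=0):
    """Estimate the q-quantile from a random sample instead of sorting everything.

    Hypothetical stand-in for Spark's approxQuantile; only the sample is sorted.
    """
    rng = random.Random(seed)
    if len(values) <= sample_size:
        sample = list(values)
    else:
        sample = rng.sample(values, sample_size)
    ordered = sorted(sample)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]


def approx_nearest(points, hash_distance, true_distance, k, slack=0.05):
    """Return up to k approximate nearest neighbors.

    1. Compute the hash distance of every point (one linear pass).
    2. Pick a threshold at quantile k/n + slack (assumed heuristic).
    3. Keep only candidates at or below the threshold.
    4. Sort just the candidates by true distance.
    """
    n = len(points)
    if n == 0:
        return []
    dists = [hash_distance(p) for p in points]
    q = min(1.0, k / n + slack)
    threshold = approx_quantile(dists, q)
    candidates = [p for p, d in zip(points, dists) if d <= threshold]
    return sorted(candidates, key=true_distance)[:k]


if __name__ == "__main__":
    pts = list(range(1000))
    # Toy setup: the query is 500, and the "hash distance" is a coarse bucketed distance.
    result = approx_nearest(
        pts,
        hash_distance=lambda p: abs(p - 500) // 10,
        true_distance=lambda p: abs(p - 500),
        k=5,
    )
    print(sorted(result))  # [498, 499, 500, 501, 502]
```

The point of the design is that the only sort over the full dataset is replaced by a linear filter, and the remaining sort touches only the small candidate set. Because the quantile is approximate, the candidate set can contain fewer than k rows, which is why the slack term is included in this sketch.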
The following points are still under discussion:
(1) Which hashDistance (or probing sequence) to use for MinHash
(2) What issues the current nearest neighbor implementation has, and how it should be changed
Issue Links
- contains
  - SPARK-18334 What hashDistance should MinHash use? (Resolved)
  - SPARK-18409 LSH approxNearestNeighbors should use approxQuantile instead of sort (Resolved)
  - SPARK-30120 LSH approxNearestNeighbors should use BoundedPriorityQueue when numNearestNeighbors is small (Resolved)
- is contained by
  - SPARK-18392 LSH API, algorithm, and documentation follow-ups (Resolved)