Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Incomplete
Description
MinHash currently uses the same `hashDistance` function as RandomProjection. This does not make sense for MinHash, because the Jaccard distance between two sets is unrelated to the absolute difference between their hash bucket indices.
This bug could affect the accuracy of multi-probe nearest-neighbor (NN) search for MinHash.
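The point can be illustrated with a minimal, self-contained sketch in plain Python (not Spark code). Everything in it is an assumption made for illustration: the BLAKE2b-based hash family, the helper names `min_hash`, `absolute_hash_distance`, and `mismatch_distance`, and the example sets. `absolute_hash_distance` only stands in for a magnitude-based measure of the kind described above; it is not the actual Spark `hashDistance`. The sketch relies on the standard MinHash property that two sets agree on a given min-hash value with probability equal to their Jaccard similarity, so the fraction of disagreeing positions tracks Jaccard distance, while the size of the numeric gap carries no such meaning.

```python
import hashlib


def min_hash(items, num_hashes):
    """Return one MinHash value per hash function for a non-empty set of ints."""
    signature = []
    for i in range(num_hashes):
        # Hypothetical hash family: BLAKE2b over "<function index>:<element>".
        values = (
            int.from_bytes(
                hashlib.blake2b(f"{i}:{x}".encode(), digest_size=8).digest(),
                "big",
            )
            for x in items
        )
        signature.append(min(values))
    return signature


def jaccard_distance(a, b):
    """Exact Jaccard distance between two sets."""
    return 1.0 - len(a & b) / len(a | b)


def absolute_hash_distance(sig_a, sig_b):
    """Mean absolute gap between hash values (magnitude-based stand-in)."""
    return sum(abs(x - y) for x, y in zip(sig_a, sig_b)) / len(sig_a)


def mismatch_distance(sig_a, sig_b):
    """Fraction of positions where the two signatures disagree."""
    return sum(x != y for x, y in zip(sig_a, sig_b)) / len(sig_a)


a = set(range(0, 20))
b = set(range(5, 25))  # shares 15 of 25 distinct elements with a

sig_a = min_hash(a, 512)
sig_b = min_hash(b, 512)

print("Jaccard distance:      ", jaccard_distance(a, b))
print("Mismatch fraction:     ", mismatch_distance(sig_a, sig_b))
print("Mean absolute hash gap:", absolute_hash_distance(sig_a, sig_b))
```

In this sketch the mismatch fraction should land near the true Jaccard distance, whereas the mean absolute gap is on the scale of the 64-bit hash range and depends only on which arbitrary hash values happened to be the minimum. A distance for MinHash multi-probe search would plausibly be built on equality of hash values rather than on their numeric difference, though the ticket itself does not specify the replacement.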
Issue Links
- Is contained by: SPARK-18454 Changes to improve Nearest Neighbor Search for LSH (Resolved)
- Is related to: SPARK-5992 Locality Sensitive Hashing (LSH) (Resolved)
- Relates to: SPARK-18392 LSH API, algorithm, and documentation follow-ups (Resolved)